The goal of this project is to predict the geographical origin of an individual from genetic markers. The origin is one of three classes: “North America”, “Central America”, or “South America”. We compare the classification results of three approaches:
- multinomial regression (i.e. logistic regression with more than two classes) using the nnet package in R.
- linear discriminant analysis (LDA) using the MASS library in R (the lda function is provided by MASS, not by the class library).
- naive Bayes classifier using the naivebayes R library.
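
The comparison described above can be sketched as follows. This is only a minimal illustration, not the project's actual code: the data are synthetic (simulated genotypes coded 0/1/2), and the variable names, marker count, allele frequencies, and train/test split are all assumptions made for the example. It also assumes `lda()` is loaded from MASS, and that the nnet, MASS, and naivebayes packages are installed.

```r
library(nnet)        # multinom(): multinomial logistic regression
library(MASS)        # lda(): linear discriminant analysis
library(naivebayes)  # naive_bayes(): naive Bayes classifier

set.seed(1)

# Synthetic stand-in for the real data (hypothetical): 100 individuals per
# region, 10 markers coded as 0/1/2 copies of an allele whose frequency
# differs between regions.
regions <- c("NorthAmerica", "CentralAmerica", "SouthAmerica")
n_per <- 100
p <- 10
freq <- c(0.3, 0.5, 0.7)  # assumed allele frequency in each region

X <- do.call(rbind, lapply(freq, function(f) {
  matrix(rbinom(n_per * p, size = 2, prob = f), nrow = n_per)
}))
colnames(X) <- paste0("m", seq_len(p))
dat <- data.frame(region = factor(rep(regions, each = n_per), levels = regions), X)

# Random train/test split (200 training rows, 100 test rows).
train_idx <- sample(nrow(dat), 200)
train <- dat[train_idx, ]
test <- dat[-train_idx, ]

# Fit the three classifiers on the same training set.
fit_mn <- multinom(region ~ ., data = train, trace = FALSE)
fit_lda <- lda(region ~ ., data = train)
fit_nb <- naive_bayes(region ~ ., data = train)

# Predict class labels for the held-out individuals.
pred_mn <- predict(fit_mn, newdata = test)
pred_lda <- predict(fit_lda, newdata = test)$class
pred_nb <- predict(fit_nb, newdata = test[, -1])  # predictors only

# Test-set accuracy of each method.
acc <- sapply(
  list(multinom = pred_mn, lda = pred_lda, naive_bayes = pred_nb),
  function(pred) mean(as.character(pred) == as.character(test$region))
)
print(acc)
```

Using a single held-out test set keeps the sketch short; cross-validation would give a more stable comparison between the three methods.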
This project was carried out in R by a group of three people at ENSIMAG.