First of all, a big Thank you to my colleague Ayman Makhoukhi for the collaboration on this project, since it is an academic project, that we did during our Bachelor in Data Science
, and we had the opportunity to collaborate in such projects, in order to practice various approches of Machine Learning
in making prediction models.
To solve this classification problem, we used the Decision tree
classifier.
So, what is a decision tree classifier ?
Briefly, this classifier is a flowchart-like
structure in which each internal node
represents a test
on an attribute (e.g. whether a coin flip comes up heads or tails), each branch
represents the outcome of the test, and each leaf node
represents a class label (decision taken after computing all attributes). The paths from root to leaf represent classification rules
.
In order to create this classifier, we need an Initial Data, first to train the model with it, and second to test its performance after training it, for that reason, the dataset chosen for this project, is from Kaggle.com
.
Link to Dataset
For better understanding of the code, each step has a comment with it, to make it easier for you to follow along.
Decision_tree.Rmd file : this one contains comments in french.
Decision_tree.R file : this one contains comments in English.