- Load the data and compute descriptive statistics.
- Split the data into training, validation, and test sets (to guard against data snooping).
- Train common classification models, such as KNN, Decision Tree, Random Forest, and SVM, and select the best one.
- Plot the results.

Here I apply several of the most common supervised classification models to the abalone dataset to predict each abalone's age class: young, medium, or old. The models are K-Nearest Neighbors (KNN), Decision Tree, Random Forest, and Support Vector Machine (SVM). The work involves pre-processing the data, splitting it into training and test sets, further splitting the training set into a validation subset and a hyper-parameter tuning subset, applying cross-validation, and finding the best hyper-parameters for each model. Finally, I apply the best model to the test data and report its accuracy as a check on the results. The best model for this dataset is Random Forest, with an accuracy of 66.31% on the training set and 63.63% on the test set.
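
The workflow described above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the actual notebook code. It assumes a synthetic stand-in for the abalone measurements, hypothetical tercile cut points for the three age classes, and small illustrative hyper-parameter grids. Cross-validation on the training set plays the role of the validation split.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the abalone measurements (assumption: the real
# analysis loads the actual dataset instead).
X, target = make_regression(n_samples=600, n_features=8, noise=20.0, random_state=0)

# Bin the continuous target into three age classes at its terciles.
# These cut points are hypothetical, not the author's.
cuts = np.quantile(target, [1 / 3, 2 / 3])
y = np.digitize(target, cuts)  # 0 = young, 1 = medium, 2 = old

# Hold out a test set that is never touched during model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Candidate models with small, illustrative hyper-parameter grids.
# KNN and SVM are distance-based, so they get a scaling step.
candidates = {
    "KNN": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 7, 15]},
    ),
    "Decision Tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
    "Random Forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [5, None]},
    ),
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__C": [0.1, 1, 10]},
    ),
}

# Tune each model with 5-fold cross-validation on the training data only.
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = search
    print(f"{name}: CV accuracy {search.best_score_:.4f} with {search.best_params_}")

# Select the model with the best cross-validated accuracy,
# then score it once on the held-out test set.
best_name = max(results, key=lambda n: results[n].best_score_)
test_accuracy = results[best_name].score(X_test, y_test)
print(f"Best model: {best_name}, test accuracy {test_accuracy:.4f}")
```

Because the data here are synthetic, the printed accuracies and the winning model will not match the figures reported above. The point of the sketch is the order of operations: the test set is split off first, all tuning happens inside cross-validation on the training data, and the test set is scored only once at the end.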