We took on a binary classification task using scikit-learn and studied the tradeoffs between performance, accuracy, and the amount of preprocessing required for the following algorithms, including ensemble methods:
- Decision Tree
- SVM (with linear and RBF kernels)
- k-Nearest Neighbors
- Naive Bayes
- Random Forest
- AdaBoost
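
As a rough illustration of how the classifiers listed above can be compared in scikit-learn, here is a minimal sketch. It is not the project's actual code: the synthetic dataset from `make_classification`, the hyperparameters, the choice of `GaussianNB` as the Naive Bayes variant, and the `StandardScaler` step for the SVM and k-NN models are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: a synthetic binary dataset stands in for the project's data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: default-ish hyperparameters; distance- and margin-based models
# are wrapped in a scaling pipeline, tree-based models are used as-is.
classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM (linear)": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "k-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the scaler inside a pipeline means it is fit only on the training split, which is one way to keep the preprocessing comparison fair across models.
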
A technical report detailing the project's results can be found here. It includes precision-recall plots for the learning algorithms listed above. All of the plots can be found under plots/.
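
For readers unfamiliar with precision-recall plots, the points of such a curve can be computed with scikit-learn roughly as follows. This is a sketch under stated assumptions, not the code used for the report: the synthetic data, the random forest model, and its settings are all chosen for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Assumption: a synthetic binary dataset stands in for the project's data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: a random forest is used here; any classifier that exposes
# predict_proba or decision_function could be substituted.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score each test sample with the predicted probability of the positive class.
y_score = clf.predict_proba(X_test)[:, 1]

# precision and recall each have one more entry than thresholds; the final
# pair is (precision=1, recall=0) by scikit-learn's convention.
precision, recall, thresholds = precision_recall_curve(y_test, y_score)
ap = average_precision_score(y_test, y_score)
print(f"average precision: {ap:.3f}")
```

Plotting `recall` on the x-axis against `precision` on the y-axis with any plotting library gives the curve; average precision summarizes it as a single number.
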
- Adithya Bhat - responsible for pre-processing and feature extraction
- Srinivasan Ravichandran - responsible for implementing the machine learning algorithms with scikit-learn