This repository implements basic machine learning classifiers for Yelp review classification, which we treat as a binary classification problem.
The models implemented are:
Naive Bayes in the nbc folder,
Logistic Regression and Support Vector Machine (linear) in the lr_svm folder,
Decision Trees, Bagged Decision Trees, Random Forests, and Boosted Decision Trees in the ensembles folder.
Please see the README file in each folder for instructions on how to run the code.
Note that all algorithms are implemented from scratch, without using off-the-shelf libraries.
The "hw" you see in this repository is there because the implementations were originally homework assignments for a course I took.
The book to follow when implementing these algorithms is "Principles of Data Mining" by David Hand et al.
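
To give a rough idea of what "from scratch" means for a binary review classifier, here is a minimal sketch of a multinomial Naive Bayes with add-one smoothing, written with the Python standard library only. It is an illustration, not the code in the nbc folder: the function names `train_naive_bayes` and `predict_naive_bayes`, the pre-tokenized input format, and the toy reviews are all assumptions made up for this example.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Estimate log priors and per-class word counts from tokenized documents."""
    classes = sorted(set(labels))
    vocab = {word for doc in docs for word in doc}
    log_prior = {}
    word_counts = {}
    for c in classes:
        class_docs = [doc for doc, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        word_counts[c] = Counter(word for doc in class_docs for word in doc)
    return log_prior, word_counts, vocab


def predict_naive_bayes(model, doc):
    """Return the class with the highest posterior log-score for one document."""
    log_prior, word_counts, vocab = model
    best_class, best_score = None, -math.inf
    for c, prior in log_prior.items():
        total = sum(word_counts[c].values())
        score = prior
        for word in doc:
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[c][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy, made-up data (hypothetical): 1 = positive review, 0 = negative review.
train_docs = [
    ["great", "food", "friendly", "staff"],
    ["great", "service"],
    ["terrible", "food", "rude", "staff"],
    ["terrible", "service"],
]
train_labels = [1, 1, 0, 0]
model = train_naive_bayes(train_docs, train_labels)
```

Calling `predict_naive_bayes(model, ["great", "friendly"])` on this toy model returns the positive label, since both words appear only in the positive training reviews. Working in log space keeps the product of many small probabilities from underflowing.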