Classification of news articles into 8 different categories.
CountVectorProcessing.py
CountVectorProcessing1.py
DecisionTree.py
Naive_Bayes_text.py
PreProcess.py
README
README.md
Report.pdf
classification.py
classificationensemble.py
classificationmetrics.py
classificationutils.py
doc2vec.py
interfacefactory.py
test_input.csv
test_predict_random.csv
textfeatures.py
train_input.csv
train_output.csv
trainclassificationmodel.py
trainmodel.py
word2vec.py


TextClassification

This project classifies news articles into 8 different categories using 3 different classifiers: Naive Bayes, a Decision Tree classifier, and ensemble methods using a word2vec model.
The from-scratch Naive Bayes implementation achieves the highest accuracy among the hand-implemented classifiers. The final predictions on Kaggle were made using Linear SVM and Naive Bayes from scikit-learn. The submission placed 1st on the public leaderboard and 2nd on the private leaderboard (unseen test data) with an accuracy of 97%.
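
To illustrate what a from-scratch Naive Bayes text classifier can look like, here is a minimal sketch of a multinomial Naive Bayes model with Laplace smoothing. It is not the code in `Naive_Bayes_text.py`; the class name `MultinomialNaiveBayes`, the whitespace tokenizer, the `alpha` parameter, and the toy documents and labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


class MultinomialNaiveBayes:
    """Hypothetical minimal multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        self.total_words = {
            label: sum(self.word_counts[label].values())
            for label in self.class_doc_counts
        }
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for every class."""
        vocab_size = len(self.vocabulary)
        tokens = tokenize(text)
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            # Log prior: fraction of training documents in this class.
            score = math.log(doc_count / self.total_docs)
            denominator = self.total_words[label] + self.alpha * vocab_size
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore words never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / denominator)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy data, invented for illustration; the real project reads train_input.csv
# and train_output.csv and uses 8 categories.
train_docs = [
    "the team won the match",
    "the player scored a goal",
    "the election results were announced",
    "the senate passed the bill",
]
train_labels = ["sports", "sports", "politics", "politics"]

model = MultinomialNaiveBayes(alpha=1.0).fit(train_docs, train_labels)
print(model.predict("the player won"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is absent from one class from driving that class's probability to zero.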

https://inclass.kaggle.com/c/comp-551-miniproject-2-reddit-classification