Text Classification Using Naive Bayes

This project performs text classification in Python 3 with Multinomial Naive Bayes, using both sklearn's MultinomialNB and a Multinomial Naive Bayes classifier implemented from scratch.
The dataset is available at http://archive.ics.uci.edu/ml/datasets/Twenty+Newsgroups
Given a text document, the goal is to predict which of the 20 newsgroup categories it belongs to.
For a quicker run time, change the data directory from 20_newsgroups to mini_newsgroups, which contains a smaller version of the dataset.
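
As a rough illustration of how the data might be read, the sketch below assumes the extracted archive contains one sub-directory per newsgroup with one file per post. The helper names load_newsgroups and split_train_test are hypothetical and are not taken from this repository; the 25% hold-out is chosen only to mirror a 15,000 / 5,000 split.

```python
import random
from pathlib import Path


def load_newsgroups(root):
    """Return (texts, labels), assuming one sub-directory per category."""
    texts, labels = [], []
    for category_dir in sorted(Path(root).iterdir()):
        if not category_dir.is_dir():
            continue
        for doc_path in sorted(category_dir.iterdir()):
            if doc_path.is_file():
                # Posts are not guaranteed to be valid UTF-8, so decode leniently.
                texts.append(doc_path.read_text(encoding="latin-1"))
                labels.append(category_dir.name)
    return texts, labels


def split_train_test(texts, labels, test_fraction=0.25, seed=0):
    """Shuffle the documents and hold out test_fraction of them for testing."""
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [texts[i] for i in train_idx]
    x_test = [texts[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test
```

Calling load_newsgroups("mini_newsgroups") instead of load_newsgroups("20_newsgroups") would be one way to switch to the smaller dataset, provided both directories sit next to the script.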

Features

You can fit the Multinomial Naive Bayes classifier on the training data, make predictions, and get the score (mean accuracy) on the test data.
The from-scratch model gives results similar to sklearn's MultinomialNB.
The model was trained on 15,000 documents and tested on 5,000 documents.
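
The fit / predict / score workflow described above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's actual implementation: the class name ScratchMultinomialNB, the tokenize helper, and the choice of Laplace smoothing with alpha=1.0 (sklearn's default for MultinomialNB) are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


class ScratchMultinomialNB:
    """Multinomial Naive Bayes over word counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # assumed smoothing strength

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        class_counts = Counter(labels)
        # Log prior: fraction of training documents in each class.
        self.log_prior = {c: math.log(k / len(labels)) for c, k in class_counts.items()}
        # Total number of word occurrences seen in each class.
        self.total = {c: sum(self.word_counts[c].values()) for c in class_counts}
        return self

    def _log_likelihood(self, word, label):
        numerator = self.word_counts[label][word] + self.alpha
        denominator = self.total[label] + self.alpha * len(self.vocab)
        return math.log(numerator / denominator)

    def predict(self, texts):
        predictions = []
        for text in texts:
            # Words never seen during training are ignored.
            words = [w for w in tokenize(text) if w in self.vocab]
            scores = {
                c: self.log_prior[c] + sum(self._log_likelihood(w, c) for w in words)
                for c in self.log_prior
            }
            predictions.append(max(scores, key=scores.get))
        return predictions

    def score(self, texts, labels):
        # Mean accuracy: fraction of documents classified correctly.
        predictions = self.predict(texts)
        return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

A call such as ScratchMultinomialNB().fit(train_texts, train_labels).score(test_texts, test_labels) would then return the mean accuracy on the held-out documents. Summing log probabilities rather than multiplying raw probabilities avoids numerical underflow on long documents.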