fake-news-detector

A text classification system for detecting fake news. It also performs bigram-level analysis on the corpus to compute summary statistics. The dataset was provided by the university. Several document representations are tested (bag of words, tf-idf, and Doc2Vec from scratch), and a variety of classifiers are explored, including Complement Naive Bayes (https://scikit-learn.org/stable/modules/naive_bayes.html#complement-naive-bayes), SVM, and logistic regression.
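
As a rough illustration of the workflow described above, here is a minimal sketch that counts bigrams and fits each classifier on both bag-of-words and tf-idf features using scikit-learn. It is not the project's actual code: the toy corpus, the label encoding, and the helper names `top_bigrams` and `fit_all` are hypothetical, tokenization uses a plain whitespace split rather than nltk, and the Doc2Vec representation is left out for brevity.

```python
from collections import Counter

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus; the real dataset is not part of this README.
texts = [
    "scientists confirm water is wet in new study",
    "government publishes annual budget report",
    "city council approves new public library",
    "local team wins regional championship game",
    "aliens secretly control the world banks",
    "miracle pill cures all diseases overnight",
    "celebrity clone replaces president in secret",
    "moon landing staged by secret lizard people",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = real, 1 = fake (assumed encoding)


def top_bigrams(docs, n=3):
    """Return the n most frequent word bigrams as (bigram, count) pairs."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()  # whitespace tokenization (assumption)
        counts.update(zip(tokens, tokens[1:]))
    return counts.most_common(n)


def fit_all(docs, y):
    """Fit every representation/classifier pairing; return fitted pipelines."""
    vectorizers = {"bow": CountVectorizer, "tfidf": TfidfVectorizer}
    classifiers = {
        "cnb": ComplementNB,
        "svm": LinearSVC,
        "logreg": LogisticRegression,
    }
    fitted = {}
    for v_name, vec in vectorizers.items():
        for c_name, clf in classifiers.items():
            fitted[(v_name, c_name)] = make_pipeline(vec(), clf()).fit(docs, y)
    return fitted


print(top_bigrams(texts))
models = fit_all(texts, labels)
for name, model in models.items():
    # Training accuracy only; a real comparison needs a held-out split.
    print(name, model.score(texts, labels))
```

On a real corpus, the scores would normally come from a held-out test set or cross-validation rather than the training data used here.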

Libraries used

pandas, matplotlib, nltk, numpy, wordcloud, gensim, scikit-learn, pandarallel