Text classifier tool.
Train and test data from https://www.cs.umb.edu/~smimarog/textmining/datasets/
EN_5BBC_labels is derived from the public BBC news articles data set: D. Greene and P. Cunningham, "Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering", Proc. ICML 2006. The original data is at http://mlg.ucd.ie/datasets/bbc.html and a cleaned-up version is at https://storage.googleapis.com/dataset-uploader/bbc/bbc-text.csv
EN_5H_labels contains around 200k news headlines from 2012 to 2018, obtained from HuffPost. Data from https://www.huffpost.com/
EN_4_labels (CTMH) is a collection of more than 1 million news articles. The articles were gathered from more than 2000 news sources by ComeToMyHead over more than 1 year of activity. Data from http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html
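
As a rough sketch of how a labeled file such as bbc-text.csv might be read before training, the Python snippet below parses CSV content into parallel lists of labels and documents. The column names `category` and `text`, the helper `load_labeled_csv`, and the inline sample rows are assumptions for illustration only and are not taken from this repository.

```python
import csv
import io
from collections import Counter


def load_labeled_csv(text, label_col="category", text_col="text"):
    """Parse CSV content into parallel lists of labels and documents.

    label_col and text_col are assumed column names; adjust them to
    match the header of the file actually used.
    """
    reader = csv.DictReader(io.StringIO(text))
    labels, docs = [], []
    for row in reader:
        labels.append(row[label_col])
        docs.append(row[text_col])
    return labels, docs


# Tiny made-up sample standing in for the downloaded file.
SAMPLE = (
    "category,text\n"
    "tech,new phone released with faster chip\n"
    "sport,team wins the final after extra time\n"
    "tech,software update fixes security bug\n"
)

labels, docs = load_labeled_csv(SAMPLE)

# Count how many documents each label has, to spot class imbalance.
label_counts = Counter(labels)
print(label_counts)
```

For a real file, the same helper could be called with the file's contents, for example `load_labeled_csv(open(path, encoding="utf-8").read())`, where `path` points to a local copy of the data.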
- Developed based on exercises from the NLP Udemy course by Lazy Programmer
- Course URLs:
- https://deeplearningcourses.com/c/natural-language-processing-with-deep-learning-in-python
- https://udemy.com/natural-language-processing-with-deep-learning-in-python