The aim of this project was to enter the world of Text Mining and NLP and develop a model capable of detecting toxic sentences or comments. The data came from a past Kaggle competition (https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge), which provided a train and a test dataset. But why stop at the data that was given? Using the Twitter API, streaming data was collected so the model could evaluate and classify real tweets. Is it really over? Hell no. Why not classify movie scripts according to their toxicity? I scraped 1000 movie scripts and applied the trained model to assess how offensive each movie was. Well, I guess The Lord of the Rings is safe to watch near kids.
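As a rough illustration of the kind of pipeline described above, here is a minimal sketch of a toxicity classifier. This is not the repository's actual code: the choice of TF-IDF features with logistic regression, the toy training comments, and the `toxicity` helper are all assumptions for illustration, not taken from the project.

```python
# Minimal sketch (assumed, not the repo's code): a TF-IDF + logistic
# regression baseline for binary toxicity classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in for the Kaggle training comments and labels.
comments = [
    "you are a wonderful person",
    "have a great day everyone",
    "you are an idiot and I hate you",
    "shut up, nobody likes you",
]
labels = [0, 0, 1, 1]  # 0 = clean, 1 = toxic

# Word n-gram TF-IDF features feed a linear classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
model.fit(comments, labels)

def toxicity(text):
    """Estimated probability that a comment is toxic, per the toy model."""
    return model.predict_proba([text])[0][1]
```

The same pattern extends to the other data sources: a tweet or a movie script line can be passed through `toxicity()` once the model is trained, and a script's overall offensiveness can be summarized by averaging scores over its lines.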
vmanita/NLP-Toxic-comments-classification
About
Development of a model to classify the toxicity/offensiveness of comments, streaming tweets and movie scripts. Spoiler alert: The Lord of the Rings is safe to watch near kids.