Development of a model to classify the toxicity/offensiveness of comments, streaming tweets and movie scripts. Spoiler alert: The Lord of the Rings is safe to watch near kids.
vmanita/NLP-Toxic-comments-classification

NLP-Toxic-comments-classification

The aim of this project was to enter the world of Text Mining and NLP and develop a model able to detect toxic sentences or comments. The data came from a past Kaggle competition (https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge), which provided a train and a test dataset. But why stop at the data that was given? After setting up access to the Twitter API, streaming data was collected so the trained model could evaluate and classify real tweets. Is it really over? Hell no. Why not classify movie scripts according to their toxicity? I scraped 1000 movie scripts and applied the trained model to assess how offensive each movie was. Well, I guess The Lord of the Rings is safe to watch near kids.
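The Jigsaw competition frames this as a multi-label problem (each comment can carry several flags at once: toxic, severe_toxic, obscene, threat, insult, identity_hate). A minimal sketch of the kind of pipeline such a model could use, assuming scikit-learn; the tiny toy corpus and label matrix below are illustrative only, not the actual training data or the exact model used in this repo:

```python
# Sketch of a multi-label toxicity classifier: TF-IDF features plus
# one binary logistic regression per label (one-vs-rest).
# Label names match the Jigsaw competition columns.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical toy comments, just to make the sketch runnable.
train_texts = [
    "you are a wonderful person",
    "I will hurt you, idiot",
    "have a great day everyone",
    "shut up, you moron",
]
# One row of 0/1 flags per comment, one column per label.
train_labels = np.array([
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1],
])

model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(train_texts, train_labels)

# Per-label probabilities for a new comment (or tweet, or script line).
probs = model.predict_proba(["you idiot"])[0]
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.2f}")
```

The same fitted pipeline can then score any text source: feed it tweets from the stream or the lines of a scraped movie script, and average the per-label probabilities to rank how offensive each source is.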
