Wikipedia-Toxicity-using-NLP-and-ML

Build a model using NLP and machine learning to identify toxic comments from the Talk edit pages on Wikipedia. I worked on this project to complete the Natural Language Processing module of my PGP in Data Science from Simplilearn-Purdue University.

Problem Statement: Wikipedia is the world’s largest and most popular reference work on the internet, with about 500 million unique visitors per month. It also has millions of contributors who can make edits to pages. The Talk edit pages are the key community forum where contributors discuss and debate changes to a particular topic. Wikipedia continuously strives to make online discussion more productive and respectful. My task was to build a predictive model that identifies toxic comments in these discussions and marks them for cleanup, using NLP and machine learning, and then to identify the top terms that characterize the toxic comments.
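The sketch below illustrates one way such a pipeline could look: TF-IDF features feeding a linear classifier, with the most toxic-indicative terms read off the model coefficients. The file name and column names ("train.csv", "comment_text", "toxic") are assumptions for illustration; the actual notebook in this repository may use different data paths, preprocessing, or a different classifier.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hypothetical input file: one Talk-page comment per row with a binary "toxic" label.
df = pd.read_csv("train.csv")  # columns assumed: "comment_text", "toxic"

X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["toxic"],
    test_size=0.2, random_state=42, stratify=df["toxic"]
)

# Turn raw comments into TF-IDF features (unigrams + bigrams, English stop words removed).
vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2), max_features=50000)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# A simple linear classifier; class_weight="balanced" helps with the rare toxic class.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train_vec, y_train)

print("F1 on held-out comments:", f1_score(y_test, clf.predict(X_test_vec)))

# Top terms most indicative of toxicity: the largest positive coefficients.
terms = np.array(vectorizer.get_feature_names_out())
top_idx = np.argsort(clf.coef_[0])[-20:][::-1]
print("Top toxic terms:", terms[top_idx].tolist())
```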

Please contact me with any feedback, suggestions, or collaboration ideas at: mgupta.power@gmail.com
