Wikipedia-Toxicity-using-NLP-and-ML

Build a model using NLP and machine learning to identify toxic comments from the Talk edit pages on Wikipedia. I worked on this project to complete the Natural Language Processing module of my PGP in Data Science from Simplilearn-Purdue University.

Problem Statement: Wikipedia is the world’s largest and most popular reference work on the internet, with about 500 million unique visitors per month. It also has millions of contributors who can make edits to pages. The Talk edit pages are the key community forum where contributors discuss and debate changes to a particular topic. Wikipedia continuously strives to make online discussion more productive and respectful. My task was to build a predictive model that identifies toxic comments in these discussions and marks them for cleanup, using NLP and machine learning, and then to identify the top terms that characterize the toxic comments.
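The sketch below illustrates one way such a pipeline could look: TF-IDF features feeding a linear classifier, with the most toxic-indicative terms read off the model coefficients. The file name and column names ("train.csv", "comment_text", "toxic") are assumptions for illustration; the actual notebook in this repository may use different data paths, preprocessing, or a different classifier.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hypothetical input file: one Talk-page comment per row with a binary "toxic" label.
df = pd.read_csv("train.csv")  # columns assumed: "comment_text", "toxic"

X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["toxic"],
    test_size=0.2, random_state=42, stratify=df["toxic"]
)

# Turn raw comments into TF-IDF features (unigrams + bigrams, English stop words removed).
vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2), max_features=50000)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

# A simple linear classifier; class_weight="balanced" helps with the rare toxic class.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train_vec, y_train)

print("F1 on held-out comments:", f1_score(y_test, clf.predict(X_test_vec)))

# Top terms most indicative of toxicity: the largest positive coefficients.
terms = np.array(vectorizer.get_feature_names_out())
top_idx = np.argsort(clf.coef_[0])[-20:][::-1]
print("Top toxic terms:", terms[top_idx].tolist())
```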

Please contact me with any feedback, suggestions, or collaboration ideas at: mgupta.power@gmail.com
