aryansi225/chat_toxicity_classifier

Chat Toxicity Classifier

Classifier to get toxicity metrics

This is a Flask application that takes a comment as input and returns the probability that it belongs to each of the following categories: Toxic, Severe Toxic, Obscene, Threat, Insult, Identity Hate.
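As a minimal sketch of what such an output might look like, the snippet below pairs six scores with the category names. It assumes the model returns one independent score per category, so the six values need not sum to 1. The `LABELS` order, the `to_report` helper, the 0.5 threshold, and the example scores are all hypothetical and not taken from the repository.

```python
# Hypothetical label order; the real order depends on how the model was trained.
LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity Hate"]


def to_report(scores, threshold=0.5):
    """Pair each score with its label and flag scores at or above the threshold."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return {
        label: {"probability": float(s), "flagged": float(s) >= threshold}
        for label, s in zip(LABELS, scores)
    }


# Made-up scores, for illustration only.
report = to_report([0.91, 0.12, 0.78, 0.03, 0.66, 0.05])
for label, info in report.items():
    marker = "flagged" if info["flagged"] else ""
    print(f"{label:<14} {info['probability']:.2f} {marker}")
```

Because each category is scored separately, a single comment can be flagged for several categories at once.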

The model was built with Keras, and the IPython notebook used to build it is in the scripts folder.

The notebook follows these steps:

  1. The data comes from the Kaggle Toxic Comment Classification Challenge (https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data) dataset.
  2. The train and test comments are preprocessed into word-index sequences of equal length by truncation and padding.
  3. An embedding matrix is created from the GloVe word -> vector mapping.
  4. A simple bidirectional LSTM with 2 fully connected layers is created.
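Steps 2 and 3 can be sketched in plain Python as follows. This is a dependency-free stand-in under stated assumptions, not the notebook's code. The helper names, whitespace tokenization, left-side padding and truncation, and all-zero rows for words missing from GloVe are illustrative choices; the notebook may handle each of these differently.

```python
from collections import Counter


def build_word_index(texts, max_features):
    """Map the most frequent words to integer ids; id 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {w: i + 1 for i, (w, _) in enumerate(counts.most_common(max_features))}


def to_padded_sequence(text, word_index, maxlen):
    """Convert text to ids, drop unknown words, then truncate and pad to maxlen."""
    ids = [word_index[w] for w in text.lower().split() if w in word_index]
    ids = ids[-maxlen:]                      # assumption: keep the last maxlen tokens
    return [0] * (maxlen - len(ids)) + ids   # assumption: pad on the left with zeros


def build_embedding_matrix(glove_lines, word_index, embed_size):
    """Row i holds the vector for word id i; words without a vector stay all-zero."""
    vectors = {}
    for line in glove_lines:
        word, *values = line.rstrip().split(" ")
        vectors[word] = [float(v) for v in values]
    matrix = [[0.0] * embed_size for _ in range(len(word_index) + 1)]
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = vectors[word]
    return matrix


texts = ["you are great", "you are the worst"]
word_index = build_word_index(texts, max_features=100)
padded = [to_padded_sequence(t, word_index, maxlen=5) for t in texts]

# Tiny stand-in for a GloVe text file, which has one "word v1 v2 ..." entry per line.
glove_lines = ["you 0.1 0.2", "are 0.3 0.4", "great 0.5 0.6"]
embedding_matrix = build_embedding_matrix(glove_lines, word_index, embed_size=2)
```

Every padded sequence has the same length, and the embedding matrix has one row per word id plus the padding row, which is the shape an embedding layer placed in front of the LSTM would expect.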

The models and pickled objects are not included in the models folder, since they would increase the size of the repository, but they can easily be created by running the notebook.
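For readers unfamiliar with pickled objects, here is a minimal sketch of saving and reloading a preprocessing object such as a word index. The helper names and the `word_index.pkl` file name are hypothetical; the notebook decides which objects are actually written and where.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    """Pickle obj to path, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Load a previously pickled object from path."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Round trip inside a temporary folder so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "models" / "word_index.pkl"  # hypothetical file name
    save_object({"you": 1, "are": 2}, target)
    restored = load_object(target)

print(restored)
```

Only unpickle files you generated yourself, since loading a pickle can execute arbitrary code.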

LIVE DEMO HERE -> https://chattoxicity.appspot.com/

Screenshot

image

image

Dependencies

Flask, TensorFlow, Keras

References

https://www.kaggle.com/jhoward/improved-lstm-baseline-glove-dropout?sortBy=relevance&group=everyone&search=toxic+comment+&page=1&pageSize=20&turbolinks%5BrestorationIdentifier%5D=e88bae67-bc31-400d-a502-053b547cb912

My Original Contribution & Learnings

Contribution => Reimplemented the code after studying the Kaggle kernel listed in the references. Used the resulting model in a Flask application, built so that predictions for an input can be made interactively. Deployed on GCP using App Engine.

Major Learnings => Learned how to apply transfer learning with GloVe, how to build a Flask application and serve a saved Keras model, and how to deploy on GCP using App Engine.
