SpeechTox is an application that detects toxic language in text content. It is built with HTML, CSS, JavaScript and Django, and integrates a machine learning model.
The machine learning model is a Logistic Regression classifier trained on a Kaggle dataset. Given any text, the application predicts which of the following categories apply: toxic, severe_toxic, obscene, threat, insult and identity_hate.
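As a rough illustration of how a multi-label Logistic Regression classifier over these six categories could be set up, here is a minimal sketch using scikit-learn. The TF-IDF features, the one-vs-rest wrapper, the function names `train_model` and `predict_categories`, and the handful of toy training rows are all assumptions made for this example. They are not taken from the SpeechTox code base or from the Kaggle dataset.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Toy, hand-written rows for illustration only (NOT the Kaggle data).
TRAIN_TEXTS = [
    "have a nice day",
    "thanks for the help",
    "the weather is lovely",
    "you are a stupid idiot",
    "i will hurt you",
    "damn this crap",
    "i will hurt you, you stupid damn idiot",
    "people like your kind are all idiots",
]
# One row per text, one 0/1 column per entry in LABELS.
TRAIN_LABELS = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
])


def train_model(texts, labels):
    """Fit one binary Logistic Regression per category on TF-IDF features."""
    model = make_pipeline(
        TfidfVectorizer(),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )
    model.fit(texts, labels)
    return model


def predict_categories(model, text):
    """Return the names of the categories predicted for a single text."""
    flags = model.predict([text])[0]
    return [label for label, flag in zip(LABELS, flags) if flag]


if __name__ == "__main__":
    model = train_model(TRAIN_TEXTS, TRAIN_LABELS)
    print(predict_categories(model, "have a nice day"))
```

With so few training rows the predictions themselves are not meaningful. The sketch only shows the shape of the pipeline: one text in, a list of zero or more category names out.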
The application serves the following purposes:
- Given any text, it lists the toxicity categories the text falls into.
- Given a song title and artist, it lists the toxicity categories the song falls into.
- Given a Billboard playlist and a number of items, it classifies the songs in the playlist into two categories: decent and indecent.
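The playlist feature can be pictured as a thin layer on top of the per-text prediction. The sketch below is one possible way to do it, under stated assumptions: `classify_playlist` and the `categorize` callable are hypothetical names, the input is assumed to be (title, lyrics) pairs that have already been fetched, and the rule "a song is indecent if any category applies" is an assumption, not a documented SpeechTox rule.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def classify_playlist(
    songs: Iterable[Tuple[str, str]],
    categorize: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Split songs into 'decent' and 'indecent' buckets.

    songs:      (title, lyrics) pairs; fetching them is outside this sketch.
    categorize: hypothetical function returning the toxicity categories
                predicted for a text (an empty list means none apply).
    """
    result: Dict[str, List[str]] = {"decent": [], "indecent": []}
    for title, lyrics in songs:
        # Assumed rule: any predicted category marks the song as indecent.
        bucket = "indecent" if categorize(lyrics) else "decent"
        result[bucket].append(title)
    return result


if __name__ == "__main__":
    # Stand-in for the real model, used only to exercise the function.
    def fake_categorize(text: str) -> List[str]:
        return ["toxic"] if "bad" in text else []

    playlist = [("Song A", "nice words"), ("Song B", "bad words")]
    print(classify_playlist(playlist, fake_categorize))
```

Passing the classifier in as a callable keeps the bucketing logic independent of how the categories are actually predicted.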