Online platforms receive thousands of comments, some of them harmful.
This project detects toxic behavior in text using machine learning.
Each comment can belong to one or more categories:
- toxic
- severe_toxic
- obscene
- threat
- insult
- identity_hate
This is a multi-label classification problem.
- Convert text into numerical features
- Train models to predict multiple labels
- Use a One-vs-Rest strategy to handle the multi-label outputs
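The steps above can be sketched as follows. This is a minimal illustration, not the project's exact code: the example comments and the three toy labels are made up, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Hypothetical comments and labels, purely for illustration.
comments = [
    "you are awful and stupid",
    "have a great day everyone",
    "I will find you",
    "what a lovely picture",
]
# One row per comment, one column per label (e.g. toxic, threat, insult).
y = np.array([
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
])

# Step 1: convert text into numerical features.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(comments)

# Steps 2-3: One-vs-Rest trains one binary classifier per label column.
clf = OneVsRestClassifier(LogisticRegression())
clf.fit(X, y)

pred = clf.predict(vectorizer.transform(["you are stupid"]))
print(pred.shape)  # (1, 3): one binary prediction per label
```

Because each label gets its own classifier, a single comment can be flagged as, say, both `toxic` and `insult` at the same time.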
- Logistic Regression
- Multinomial Naive Bayes
- Wrapped with OneVsRestClassifier
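Either estimator can be dropped into the same `OneVsRestClassifier` wrapper. A hedged sketch, again on synthetic data (the texts and two toy labels below are invented for illustration):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB

# Synthetic data: four short texts, two illustrative labels.
texts = ["bad bad comment", "nice friendly post", "bad threat here", "kind words"]
y = np.array([[1, 0], [0, 0], [1, 1], [0, 0]])

# Raw counts are non-negative, which suits MultinomialNB.
X = CountVectorizer().fit_transform(texts)

# Both candidate models, wrapped identically.
for estimator in (LogisticRegression(), MultinomialNB()):
    clf = OneVsRestClassifier(estimator).fit(X, y)
    print(type(estimator).__name__, clf.predict(X).shape)  # (4, 2) each
```

Swapping the base estimator changes nothing about the multi-label interface; the wrapper still emits one binary column per label.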
- Accuracy Score
- ROC-AUC Score
- Classification Report
- ROC Curve
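The metrics above map directly onto `sklearn.metrics`. The labels and scores below are synthetic placeholders, not results from this project; note that for multi-label data, `accuracy_score` is the strict subset accuracy (every label must match) and ROC-AUC is averaged across label columns.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score

# Synthetic ground truth, hard predictions, and probability scores
# for two illustrative labels.
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_pred = np.array([[1, 0], [0, 1], [1, 0], [0, 0]])
y_score = np.array([[0.9, 0.2], [0.1, 0.8], [0.7, 0.4], [0.2, 0.1]])

# Subset accuracy: a sample counts only if every label matches.
acc = accuracy_score(y_true, y_pred)

# Macro-averaged ROC-AUC over the label columns (needs scores, not 0/1 predictions).
auc = roc_auc_score(y_true, y_score, average="macro")

print(f"accuracy={acc:.2f}  roc_auc={auc:.2f}")

# Per-label precision/recall/F1.
print(classification_report(y_true, y_pred,
                            target_names=["toxic", "insult"],
                            zero_division=0))
```

The per-label ROC curves can likewise be drawn with `sklearn.metrics.roc_curve` on each score column.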
- Python
- Scikit-learn
- NLP preprocessing (tokenization, cleaning)
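The cleaning and tokenization step might look like the sketch below; the repo's actual preprocessing may differ (this `clean` helper is hypothetical).

```python
import re

def clean(text: str) -> list[str]:
    """Lowercase, strip non-letters, and tokenize on whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # drop punctuation and digits
    return text.split()

print(clean("You're SO rude!!1"))  # ['you', 're', 'so', 'rude']
```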
```bash
git clone https://github.com/your-username/toxic-comment-classification.git
cd toxic-comment-classification
pip install -r requirements.txt
python app.py
```