An AI-powered system that automatically detects toxic, abusive, hateful, and otherwise harmful text by combining machine learning with rule-based filtering.
- Toxic vs Safe classification
- Hate speech detection using rule-based filtering
- TF-IDF features with a Logistic Regression classifier
- Real-time text moderation
- Industry-style hybrid moderation approach
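
The TF-IDF + Logistic Regression classifier listed above could be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the contents of the project's `train.py`: the toy sentences, their labels, the n-gram range, and the `classify` helper are all assumptions made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data invented for illustration only (1 = toxic, 0 = safe).
# A real model needs a labeled dataset that is far larger.
texts = [
    "you are an idiot and everyone hates you",
    "shut up you stupid loser",
    "I hate you, you are worthless",
    "thank you for the helpful answer",
    "have a great day, see you tomorrow",
    "this tutorial was clear and useful",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each text into a weighted bag of unigrams and bigrams;
# Logistic Regression then learns a linear decision boundary on top.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)


def classify(text: str) -> str:
    """Return "toxic" or "safe" for a single piece of text (hypothetical helper)."""
    return "toxic" if model.predict([text])[0] == 1 else "safe"
```

Calling `classify("some comment")` returns either `"toxic"` or `"safe"`. With a dataset this small the predictions are not meaningful; the point is only the shape of the pipeline.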
ML models can miss rare hate words that appear too seldom in the training data to be learned. This system combines:
- Rule-based filters (for explicit slurs)
- ML model (for contextual toxicity)
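
One way the two layers could be combined is to run the rule-based filter first and fall back to the ML score only when no rule fires. The sketch below assumes that ordering; the `BLOCKLIST` placeholder terms, the `moderate` function, the 0.5 threshold, and the `dummy_score` stand-in for a trained model are all hypothetical and not taken from this project's code.

```python
import re
from typing import Callable

# Placeholder terms standing in for a real list of explicit slurs (hypothetical).
BLOCKLIST = {"badword1", "badword2"}
_TOKEN_RE = re.compile(r"[a-z0-9']+")


def rule_filter(text: str) -> bool:
    """Return True if any lowercase token exactly matches a blocklisted term."""
    tokens = _TOKEN_RE.findall(text.lower())
    return any(tok in BLOCKLIST for tok in tokens)


def moderate(
    text: str,
    ml_score: Callable[[str], float],
    threshold: float = 0.5,
) -> dict:
    """Hybrid decision: explicit rules first, then the ML toxicity score."""
    # Rules catch explicit terms even if the model has never seen them.
    if rule_filter(text):
        return {"label": "toxic", "source": "rule"}
    # Otherwise defer to the model for contextual toxicity.
    score = ml_score(text)
    label = "toxic" if score >= threshold else "safe"
    return {"label": label, "source": "ml", "score": score}


def dummy_score(text: str) -> float:
    """Stand-in for a trained model's toxicity probability (assumption)."""
    return 0.9 if "hate" in text.lower() else 0.1
```

In practice `ml_score` would wrap the trained classifier, for example by returning the toxic-class probability from `predict_proba`. Returning the `source` field makes it easy to see which layer made each decision.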
- Python
- Scikit-learn
- NLTK
- Pandas
pip install -r requirements.txt
python train.py
python app.py