SafeText is a modern, production-ready web application for real-time detection of hate speech and inappropriate content using advanced BERT-based deep learning models.
- Binary Classification - Classifies content as Appropriate (0) or Inappropriate (1)
- High Performance - 96.01% accuracy with 97.67% precision on test data
- BERT-Based Model - Fine-tuned bert-base-uncased with heavy text preprocessing
- Modern Web Interface - Responsive UI with real-time analysis
- Analytics Dashboard - Live statistics and analysis history with auto-refresh
- RESTful API - JSON endpoints for programmatic integration
- Production Ready - Optimized preprocessing pipeline aligned between training and inference
- Low False Positives - Only 11.5% false positive rate on appropriate content
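For programmatic use, a client only needs to POST JSON to the analysis endpoint and parse the JSON reply. A minimal stdlib sketch is below; the endpoint path `/analyze` and the `{"text": ...}` payload shape are assumptions for illustration, not taken from the SafeText source:

```python
import json
from urllib import request

API_URL = "http://localhost:5000/analyze"  # hypothetical endpoint path

def build_request(text: str) -> request.Request:
    """Build a JSON POST request for the analysis endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def analyze(text: str) -> dict:
    """Send text for classification and return the parsed JSON reply."""
    with request.urlopen(build_request(text)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Check the running app's actual route and response fields before relying on this shape.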
The system uses a labeled hate speech dataset for training:
- Records: ~25,000 labeled posts
- Categories: Binary classification (Appropriate vs Inappropriate)
- Appropriate: General harmless conversation
- Inappropriate: Offensive, hateful, or harmful content
- Distribution: ~17% Appropriate, ~83% Inappropriate
1. Clone/download the project:
   ```
   cd saftext
   ```
2. Create virtual environment (recommended):
   ```
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
3. Install dependencies:
   ```
   pip install -r requirements.txt
   ```
4. Set up the dataset:
   ```
   python setup_data.py
   ```
5. Train the model (first time only):
   ```
   python train_model.py
   ```
   - Takes ~15-30 minutes depending on GPU availability
   - Generates model weights in the `models/` folder
6. Run the application:
   ```
   python app.py
   ```
   - Access at: http://localhost:5000
- Text input with 1,000 character limit
- Real-time character counter
- Analyze button with visual feedback
- Results display with confidence bar and color coding
- Clear button to reset
- Keyboard shortcut: `Ctrl+Enter` to analyze
- Statistics cards: Total analyzed, appropriate count, inappropriate count
- Recent history table with last 10 analyses and timestamps
- Auto-refresh: Updates every 5 seconds
- Live classification trends tracking
Architecture: BERT-base-uncased
- Parameters: 110M
- Transformer Layers: 12
- Hidden Size: 768
- Attention Heads: 12
- Max Input Length: 128 tokens
- Output Classes: 2 (Appropriate, Inappropriate)
- Fine-tuning Epochs: 3
- Batch Size: 8
- Learning Rate: 1e-5 (AdamW optimizer)
- Gradient Clipping: 1.0
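The gradient-clipping value of 1.0 is a cap on the global L2 norm of all gradients, the behavior of PyTorch's `torch.nn.utils.clip_grad_norm_`. A pure-Python sketch of the rescaling rule (the real training loop would operate on tensors, not flat lists):

```python
import math

def clip_grad_norm(grads, max_norm=1.0):
    """Rescale gradients so their global L2 norm is at most max_norm,
    mirroring the norm-based clipping used during BERT fine-tuning."""
    total = math.sqrt(sum(g * g for g in grads))
    if total > max_norm:
        scale = max_norm / total
        return [g * scale for g in grads]
    return list(grads)
```

With `max_norm=1.0`, a gradient vector of norm 5 is scaled by 0.2; vectors already under the cap pass through unchanged.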
Preprocessing Pipeline:
The same preprocessing is applied during training and inference:
- Remove HTML entities (`&amp;`, etc.)
- Remove URLs (`http://`, `https://`, `www.`)
- Remove @username mentions
- Convert to lowercase
- Remove punctuation characters
- Normalize whitespace
- Tokenize with NLTK (fallback to `str.split`)
- Remove English stopwords + "rt" (retweet marker)
- Apply Porter stemming (e.g., "running" → "run")
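The pipeline above can be sketched with the standard library alone. This version takes the fallback path (`str.split` tokenization, a small illustrative stopword set) and omits NLTK's Porter stemming, so treat it as a simplified model of the real preprocessing, not a drop-in replacement:

```python
import re
import string

# Tiny illustrative stopword set; the real pipeline uses NLTK's full
# English stopword list plus "rt" (retweet marker).
STOPWORDS = {"a", "an", "the", "is", "this", "rt"}

def preprocess(text: str) -> str:
    """Simplified training/inference preprocessing (stemming omitted)."""
    text = re.sub(r"&\w+;", " ", text)                    # HTML entities
    text = re.sub(r"(https?://\S+|www\.\S+)", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                     # @mentions
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()                                 # fallback tokenizer
    tokens = [t for t in tokens if t not in STOPWORDS]
    return " ".join(tokens)                               # normalized whitespace
```

For example, `preprocess("RT @user Check this out http://example.com &amp; more!")` yields `"check out more"`.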
Performance Metrics:
- Accuracy: 96.01%
- Precision: 97.67%
- Recall: 97.53%
- F1-Score: 97.60%
Classification Results:
- Appropriate Class: 88% precision/recall (833 samples)
- Inappropriate Class: 98% precision/recall (4,124 samples)
Confusion Matrix (Test Set):
|                    | Predicted Appropriate | Predicted Inappropriate |
|--------------------|----------------------:|------------------------:|
| True Appropriate   | 737                   | 96                      |
| True Inappropriate | 102                   | 4,022                   |
Interpretation:
- 737/833 appropriate texts correctly identified (88%)
- Only 96/833 appropriate texts wrongly flagged (11.5% false positive rate)
- 4,022/4,124 inappropriate texts correctly caught (97.5%)
- Only 102/4,124 inappropriate texts missed (2.5% false negative rate)
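The headline metrics can be reproduced directly from the confusion matrix, treating Inappropriate as the positive class (which the reported precision/recall imply):

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts,
    with the Inappropriate class taken as positive."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Counts taken straight from the confusion matrix above.
m = binary_metrics(tp=4022, fp=96, fn=102, tn=737)
```

These counts recover the reported 96.01% accuracy, 97.67% precision, 97.53% recall, and 97.60% F1.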
Edit environment variables in .env:
```
# Flask Configuration
FLASK_ENV=production
SECRET_KEY=saftext-secret-2025

# Model Configuration
DEVICE=cuda  # or cpu
CONFIDENCE_THRESHOLD=0.75
MAX_TEXT_LENGTH=1000
```

To retrain the model:

```
python train_model.py
```

The training script will:
- Load the labeled dataset
- Split into train/validation/test sets (64/16/20)
- Fine-tune BERT for binary classification
- Evaluate on test set
- Save best model weights
- Generate performance metrics
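One way to obtain the 64/16/20 split is to hold out 20% for test, then take 20% of the remainder for validation. This is a sketch of that arithmetic; the actual script may use a library helper such as scikit-learn's `train_test_split` with stratification:

```python
import random

def split_dataset(records, seed=42):
    """Shuffle and split records into train/validation/test at 64/16/20:
    20% held out for test, then 20% of the remainder for validation."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * 0.20)
    n_val = int((len(shuffled) - n_test) * 0.20)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test
```

For 1,000 records this yields 640/160/200 examples, i.e. the stated 64/16/20 ratio.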
This project is open source. Free to use and modify.