A real-time web application that classifies text into six emotions using TF-IDF and Logistic Regression. Built with Streamlit, the app provides predictions, confidence scores, and performance metrics.
The dataset includes three CSV files:
- `training.csv`
- `test.csv`
- `validation.csv`
Each file contains:
- `text`: the user's message
- `label`: an integer from 0 to 5 representing the emotion
Emotion Mapping:
| Label | Emotion |
|---|---|
| 0 | 😢 Sadness |
| 1 | 😊 Joy |
| 2 | ❤️ Love |
| 3 | 😠 Anger |
| 4 | 😨 Fear |
| 5 | 😲 Surprise |
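The mapping above can be kept in code as a small lookup table. This is a minimal sketch; the names `EMOTION_LABELS` and `label_to_emotion` are illustrative and not taken from the app's source.

```python
# Hypothetical lookup table mirroring the Emotion Mapping table.
EMOTION_LABELS = {
    0: "Sadness",
    1: "Joy",
    2: "Love",
    3: "Anger",
    4: "Fear",
    5: "Surprise",
}


def label_to_emotion(label: int) -> str:
    """Return the emotion name for an integer label; raise on unknown labels."""
    try:
        return EMOTION_LABELS[int(label)]
    except KeyError:
        raise ValueError(f"Unknown emotion label: {label!r}") from None
```

Raising on an unknown label, rather than returning a placeholder, makes a mislabeled row in a CSV file fail loudly.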
Preprocessing:
- Lowercasing
- Punctuation removal
- Extra whitespace cleanup
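The three preprocessing steps could be combined in one function along these lines. The function name `clean_text` and the choice of ASCII punctuation (`string.punctuation`) are assumptions; the app may define punctuation differently.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("  I'm SO   happy today!!! "))  # im so happy today
```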
Vectorization:
- TF-IDF with 5000 features
- Removes English stopwords
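With scikit-learn, the vectorization settings listed above would typically look like the sketch below. It assumes "5000 features" corresponds to `max_features=5000`; the sample sentences are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Settings from the list above: at most 5000 features, English stopwords removed.
vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")

# Tiny made-up corpus, only to show the call pattern.
sample_texts = [
    "i feel so happy and excited today",
    "i am scared of the dark",
    "this makes me really angry",
]

X = vectorizer.fit_transform(sample_texts)  # sparse matrix, one row per text
print(X.shape[0])  # 3
print("the" in vectorizer.vocabulary_)  # False: "the" is an English stopword
```

The vectorizer should be fitted on the training set only and then reused, via `transform`, on the test and validation sets.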
Model:
- `LogisticRegression(max_iter=1000)` from scikit-learn
- Trained on the `training.csv` set
- Evaluated on the `test.csv` set
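A rough sketch of training and prediction follows. Chaining the vectorizer and classifier with `make_pipeline` is an assumption about how the pieces fit together, and the in-memory lists are a toy stand-in for `training.csv`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for training.csv (text, label); the real app reads the CSV files.
train_texts = [
    "i feel sad and lonely", "i am so sad today",
    "i feel joyful and happy", "what a happy wonderful day",
    "i am furious and angry", "this makes me so angry",
]
train_labels = [0, 0, 1, 1, 3, 3]

model = make_pipeline(
    TfidfVectorizer(max_features=5000, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

pred = model.predict(["i am happy"])[0]
proba = model.predict_proba(["i am happy"])[0]  # one probability per class seen in training
print(pred, proba.round(2))
```

The order of `proba` follows `model.classes_`, so that attribute is what ties each confidence score back to an emotion label.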
Evaluation Metrics:
- Accuracy score
- Classification report
- Confusion matrix (visualized via Plotly)
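The three metrics can be computed with `sklearn.metrics` as sketched below. The label lists are made up for illustration and do not reflect the app's actual results.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up labels standing in for the test.csv labels and the model's predictions.
y_true = [0, 1, 1, 3, 4, 0]
y_pred = [0, 1, 2, 3, 4, 0]

labels = [0, 1, 2, 3, 4, 5]
accuracy = accuracy_score(y_true, y_pred)
report = classification_report(y_true, y_pred, labels=labels, zero_division=0)
cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, columns: predicted

print(f"accuracy = {accuracy:.3f}")
print(report)
print(cm)
```

Passing `labels` explicitly keeps the matrix at 6×6 even when a class is missing from a batch. The resulting array could then be handed to a heatmap function such as `plotly.express.imshow` for display.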
Streamlit UI:
- Styled layout
- Real-time predictions
- Example sentence testing
- Confidence bar display
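A prediction page with confidence bars might be laid out roughly as follows. This is a sketch under assumptions: `predict_proba` is a hypothetical callable mapping a text to a sequence of class probabilities, and `top_confidences` and `main` are illustrative names, not the app's own.

```python
def top_confidences(proba, class_names):
    """Pair each class name with its probability, highest first."""
    pairs = zip(class_names, (float(p) for p in proba))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def main(predict_proba, class_names):
    """Render a minimal prediction page.

    predict_proba is a hypothetical callable: text -> sequence of probabilities.
    """
    import streamlit as st  # imported here so the helper above works without Streamlit

    st.title("Emotion Classifier")
    text = st.text_area("Enter a message")
    if text.strip():
        ranked = top_confidences(predict_proba(text), class_names)
        best_name, best_score = ranked[0]
        st.write(f"Prediction: {best_name} ({best_score:.1%})")
        for name, score in ranked:
            st.write(name)
            # Confidence bar; st.progress expects a float in [0.0, 1.0].
            st.progress(min(max(score, 0.0), 1.0))
```

In a script launched with `streamlit run`, `main` would be called at module level with the trained model's probability function. Streamlit reruns the script whenever the input changes, which is what produces the real-time behavior.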
Install the dependencies using pip:

    pip install -r requirements.txt