This project classifies emotions in tweets using several machine learning models: a fine-tuned pretrained BERT, a custom BERT, Random Forest, SVM, Logistic Regression, and Naive Bayes. It includes a Streamlit web application for trying the models interactively and viewing the classification results.
The primary goal is to classify each tweet into an emotion category such as sadness, joy, love, anger, or fear. Each model was trained and evaluated on this task, and the resulting metrics are reported in the table below.
- Clone the repository and enter the project directory: `git clone https://github.com/yourusername/emotion-classification.git`, then `cd emotion-classification`
- Install the required packages: `pip install -r requirements.txt`
- Run the Streamlit app: `streamlit run app.py`

Below is the table of performance metrics for each model evaluated on the validation dataset:
| Model | Train Accuracy | Test Accuracy | F1 Score (Test Data) | Precision (Test Data) | Recall (Test Data) |
|---|---|---|---|---|---|
| Custom BERT Model + Fine-tuned | 0.881 | 0.788 | 0.606148 | 0.820222 | 0.598914 |
| Pretrained BERT Model + Fine-tuned | 0.89 | 0.79 | 0.93 | 0.84 | 0.56 |
| Naive Bayes | 0.67 | 0.60 | 0.60 | 0.76 | 0.67 |
| Random Forest | 0.867 | 0.75 | 0.84 | 0.86 | 0.82 |
| SVM | 0.88 | 0.78 | 0.84 | 0.87 | 0.82 |
| Logistic Regression | 0.85 | 0.76 | 0.79 | 0.87 | 0.74 |
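
For readers who want to reproduce columns like these, the sketch below shows one way accuracy, precision, recall, and F1 can be computed for a multi-class problem. It assumes macro averaging (an unweighted mean over classes); the README does not state which averaging was used, and the function name `macro_metrics` and the sample labels are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging. The project may have used a different scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # Per-class F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels for illustration only (not the project's data).
y_true = ["joy", "joy", "sadness", "anger"]
y_pred = ["joy", "sadness", "sadness", "anger"]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

With scikit-learn installed, `sklearn.metrics.precision_recall_fscore_support(y_true, y_pred, average="macro")` should give the same precision, recall, and F1 values.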
The fine-tuned BERT model has been trained on the dataset to classify emotions. The model and tokenizer are loaded and used for prediction in the Streamlit app.
The custom BERT model, which has been pre-trained on a different dataset, is also included in the project. This model is used as an alternative for emotion classification.
The Random Forest model uses a TF-IDF vectorizer for feature extraction. This model provides a simple yet effective way of classifying emotions.
The Support Vector Machine (SVM) model also uses a TF-IDF vectorizer. It is known for its robustness in handling high-dimensional data.
The Logistic Regression model, combined with a TF-IDF vectorizer, offers another method for classifying emotions from text.
The Naive Bayes model, using a TF-IDF vectorizer, provides a probabilistic approach to emotion classification.
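
The four classical models above share the same first step, TF-IDF feature extraction, followed by a classifier. A minimal sketch of that pattern with scikit-learn is shown here, using Logistic Regression as the classifier. The tiny corpus, the default vectorizer settings, and `max_iter=1000` are assumptions for illustration and are not taken from the project's training code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up training examples for illustration only (not the project's data).
texts = [
    "i feel so alone and down today",
    "i am heartbroken and crying",
    "what a wonderful happy day",
    "i am thrilled and delighted",
    "i adore my family so much",
    "i cherish every moment with you",
    "this makes me furious",
    "i am so angry at the delay",
    "i am terrified of the dark",
    "i feel scared and nervous",
]
labels = [
    "sadness", "sadness",
    "joy", "joy",
    "love", "love",
    "anger", "anger",
    "fear", "fear",
]

# The vectorizer turns raw text into TF-IDF features; the classifier
# is then trained on those features.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

tweet = ["i am so happy today"]
prediction = model.predict(tweet)[0]
probabilities = model.predict_proba(tweet)[0]  # one probability per class
print(prediction, dict(zip(model.classes_, probabilities.round(3))))
```

Swapping the second pipeline step for `RandomForestClassifier`, `SVC`, or `MultinomialNB` gives the other three variants. Note that `SVC` only exposes `predict_proba` when constructed with `probability=True`.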
The Streamlit app allows users to input a tweet and select a model for emotion classification. The app displays the predicted emotion and the confidence score.
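
The README does not say how the confidence score is derived. One common approach, sketched here as an assumption, is to apply a softmax to the model's raw output scores (logits) and report the highest probability. The label order in `EMOTIONS` and the function name `predict_with_confidence` are hypothetical.

```python
import math

# Assumed label order; the project's actual label mapping may differ.
EMOTIONS = ["sadness", "joy", "love", "anger", "fear"]


def predict_with_confidence(logits, labels=EMOTIONS):
    """Return (predicted label, confidence) from raw class scores via softmax."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    probs = [value / total for value in exps]

    # Pick the class with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for illustration only.
label, confidence = predict_with_confidence([0.1, 2.0, 0.3, -1.0, 0.0])
print(f"{label} ({confidence:.2%})")
```

For the TF-IDF models, the equivalent value could come from `predict_proba` where the classifier supports it.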
For any inquiries or feedback, please contact:
- Name: Deepanshu Miglani
- Education: B.Tech CSE (AIML), UPES, Dehradun
- Email: deepanshumiglani0408@gmail.com / Deepanshu.106264@stu.upes.ac.in
- GitHub: deepanshum0408
Dr. Sahinur Rahman Laskar
Assistant Professor
School of Computer Science, UPES, Dehradun, India
Email: sahinurlaskar.nits@gmail.com / sahinur.laskar@ddn.upes.ac.in

