<a href="https://colab.research.google.com/github/Rakib911Hossan/Emotion_Detection-AI_Project/blob/main/Emotion__Detection_distilroberta_ML_(1).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **j-hartmann/emotion-english-distilroberta-base**

The base model's training datasets cover a diverse collection of text types: they contain emotion labels for texts from Twitter, Reddit, student self-reports, and utterances from TV dialogues. Because MELD (Multimodal EmotionLines Dataset) extends the popular EmotionLines dataset, EmotionLines itself is not included here.




*   Crowdflower (2016)
*   Emotion Dataset, Elvis et al. (2018)
*   GoEmotions, Demszky et al. (2020)
*   ISEAR, Vikash (2018)
*   MELD, Poria et al. (2019)
*   SemEval-2018, EI-reg, Mohammad et al.




The model is trained on a balanced subset from the datasets listed above (2,811 observations per emotion, i.e., nearly 20k observations in total). 80% of this balanced subset is used for training and 20% for evaluation. The evaluation accuracy is 66% (vs. the random-chance baseline of 1/7 = 14%).
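
The figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check on the stated numbers; the variable names are illustrative, and the exact train/evaluation sizes depend on how the original authors rounded the 80/20 split, which is not stated here.

```python
# Sanity check of the numbers quoted in the model description.
per_emotion = 2_811   # observations per emotion (from the text above)
num_emotions = 7      # the base model distinguishes 7 classes

total = per_emotion * num_emotions
train_size = round(total * 0.8)   # assumption: simple rounding of the 80% share
eval_size = total - train_size
chance_accuracy = 1 / num_emotions

print(f"total observations: {total}")
print(f"approx. train / eval sizes: {train_size} / {eval_size}")
print(f"random-chance accuracy: {chance_accuracy:.1%}")
```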

In [None]:
# -*- coding: utf-8 -*-
"""
Fine-tuning j-hartmann/emotion-english-distilroberta-base for Emotion Classification

"""

# ==================== 1. INSTALL AND IMPORT LIBRARIES ====================
print("Installing required libraries...")
!pip install transformers datasets accelerate -q

import pandas as pd
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, classification_report
from sklearn.utils.class_weight import compute_class_weight
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
    pipeline
)
from datasets import Dataset, DatasetDict
import warnings
warnings.filterwarnings('ignore')

print("✓ Libraries imported successfully")


Installing required libraries...
✓ Libraries imported successfully


In [None]:
# ==================== 2. LOAD DATASET ====================
print("\nLoading dataset...")
from google.colab import files
uploaded = files.upload()

df = pd.read_csv("emotion_dataset_raw.csv")
print(f"✓ Dataset loaded: {df.shape[0]} samples")
print(f"✓ Columns: {list(df.columns)}")
print(f"\nEmotion distribution:\n{df['Emotion'].value_counts()}")

# ==================== 3. PREPARE DATA ====================
print("\nPreparing data...")

# Use Clean_Text column (fallback to Text if Clean_Text doesn't exist)
if 'Clean_Text' in df.columns:
    df = df[['Clean_Text', 'Emotion']].dropna()
    df = df.rename(columns={'Clean_Text': 'text', 'Emotion': 'label'})
else:
    df = df[['Text', 'Emotion']].dropna()
    df = df.rename(columns={'Text': 'text', 'Emotion': 'label'})


Loading dataset...


Saving emotion_dataset_raw.csv to emotion_dataset_raw.csv
✓ Dataset loaded: 34792 samples
✓ Columns: ['Emotion', 'Text']

Emotion distribution:
Emotion
joy         11045
sadness      6722
fear         5410
anger        4297
surprise     4062
neutral      2254
disgust       856
shame         146
Name: count, dtype: int64

Preparing data...


In [None]:
# Encode labels
label_encoder = LabelEncoder()
df['label'] = label_encoder.fit_transform(df['label'])

# Save label mapping for later use
label_mapping = {idx: label for idx, label in enumerate(label_encoder.classes_)}
print(f"✓ Label mapping: {label_mapping}")

# Calculate class weights for handling imbalance
class_weights = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(df['label']),
    y=df['label']
)
class_weights = torch.tensor(class_weights, dtype=torch.float)
print(f"✓ Class weights computed: {class_weights.numpy()}")

✓ Label mapping: {0: 'anger', 1: 'disgust', 2: 'fear', 3: 'joy', 4: 'neutral', 5: 'sadness', 6: 'shame', 7: 'surprise'}
✓ Class weights computed: [ 1.0121014   5.0806074   0.8038817   0.39375284  1.9294587   0.64698005
 29.787672    1.0706549 ]
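
The weights printed above follow the "balanced" heuristic, `n_samples / (n_classes * count_c)`. The sketch below recomputes them in plain Python from the emotion distribution printed earlier, assuming no rows were dropped by `dropna()` (the printed weights are consistent with that).

```python
# Reproduce the 'balanced' heuristic: weight_c = n_samples / (n_classes * count_c).
# Counts are copied from the emotion distribution printed earlier in this notebook.
counts = {
    "anger": 4297, "disgust": 856, "fear": 5410, "joy": 11045,
    "neutral": 2254, "sadness": 6722, "shame": 146, "surprise": 4062,
}
n_samples = sum(counts.values())
n_classes = len(counts)

balanced_weights = {
    label: n_samples / (n_classes * count) for label, count in counts.items()
}

# Print in alphabetical order, which matches the LabelEncoder class order.
for label in sorted(balanced_weights):
    print(f"{label:>8}: {balanced_weights[label]:.4f}")
```

Rare classes such as `shame` receive a weight close to 30, while the most frequent class, `joy`, is down-weighted to about 0.39.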


In [None]:
# ==================== 4. TRAIN-TEST SPLIT ====================
print("\nSplitting dataset...")
train_df, test_df = train_test_split(
    df,
    test_size=0.2,
    random_state=42,
    stratify=df['label']
)
print(f"✓ Train set: {len(train_df)} samples")
print(f"✓ Test set: {len(test_df)} samples")


Splitting dataset...
✓ Train set: 27833 samples
✓ Test set: 6959 samples
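
To illustrate what `stratify=` achieves, here is a minimal pure-Python sketch of a stratified split. It is not scikit-learn's algorithm; `stratified_split` and the toy label counts are hypothetical and only show that each class is divided at roughly the same ratio.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), splitting every class at about test_fraction."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

# Toy labels with an imbalance similar in spirit to the real data (hypothetical sizes).
toy_labels = ["joy"] * 100 + ["sadness"] * 50 + ["shame"] * 10
train_idx, test_idx = stratified_split(toy_labels)

test_counts = Counter(toy_labels[i] for i in test_idx)
print(test_counts)
```

Without stratification, a class as small as `shame` could end up with very few or no examples in the test set purely by chance.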


In [None]:
# ==================== 5. LOAD MODEL AND TOKENIZER ====================
print("\nLoading model and tokenizer...")
model_name = "j-hartmann/emotion-english-distilroberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load model with correct number of labels
num_labels = len(label_encoder.classes_)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=num_labels,
    ignore_mismatched_sizes=True  # Required: the checkpoint's 7-class head is discarded and a new 8-class head is randomly initialized
)

# Update model's label mapping
model.config.id2label = label_mapping
model.config.label2id = {label: idx for idx, label in label_mapping.items()}

print(f"✓ Model loaded with {num_labels} labels")
print(f"✓ Tokenizer loaded")


Loading model and tokenizer...


Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at j-hartmann/emotion-english-distilroberta-base and are newly initialized because the shapes did not match:
- classifier.out_proj.weight: found shape torch.Size([7, 768]) in the checkpoint and torch.Size([8, 768]) in the model instantiated
- classifier.out_proj.bias: found shape torch.Size([7]) in the checkpoint and torch.Size([8]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


✓ Model loaded with 8 labels
✓ Tokenizer loaded
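
The `id2label` and `label2id` dictionaries stored on the model config mirror the label encoder, which assigns ids to class names in sorted order. A minimal sketch of that mapping, using the eight emotion names from this dataset (the list literal is written out by hand here):

```python
# Emotion names as they appear in the dataset (order here is arbitrary).
emotions = ["joy", "sadness", "fear", "anger", "surprise", "neutral", "disgust", "shame"]

# Sorting the unique names reproduces the id assignment shown in the label mapping output.
classes = sorted(set(emotions))
id2label = dict(enumerate(classes))
label2id = {label: idx for idx, label in id2label.items()}

print(id2label)
print(label2id)
```

Because the mapping has eight entries while the checkpoint's classification head has seven outputs, the head cannot be reused as-is, which is what the shape-mismatch warning above reports.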


In [None]:
# ==================== 6. TOKENIZE DATA ====================
print("\nTokenizing data...")

def tokenize_function(examples):
    return tokenizer(
        examples['text'],
        padding='max_length',
        truncation=True,
        max_length=128
    )

# Create Hugging Face datasets
train_dataset = Dataset.from_pandas(train_df[['text', 'label']])
test_dataset = Dataset.from_pandas(test_df[['text', 'label']])

# Tokenize datasets
train_dataset = train_dataset.map(tokenize_function, batched=True)
test_dataset = test_dataset.map(tokenize_function, batched=True)

# Set format for PyTorch
train_dataset.set_format('torch', columns=['input_ids', 'attention_mask', 'label'])
test_dataset.set_format('torch', columns=['input_ids', 'attention_mask', 'label'])

print("✓ Data tokenized and formatted")


Tokenizing data...


✓ Data tokenized and formatted
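
The combination of `padding='max_length'` and `truncation=True` yields fixed-length inputs. The sketch below imitates that behaviour on plain lists of token ids. It is a simplification: `pad_or_truncate` is a hypothetical helper, the token ids are made up, the pad id of 1 is an assumption about the RoBERTa vocabulary, and a real tokenizer also keeps the end-of-sequence token when truncating.

```python
def pad_or_truncate(token_ids, max_length, pad_id=1):
    """Cut long inputs, right-pad short ones, and build the matching attention mask."""
    ids = list(token_ids[:max_length])
    attention_mask = [1] * len(ids)           # 1 = real token
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, attention_mask + [0] * n_pad  # 0 = padding

short_ids = [0, 100, 200, 2]    # hypothetical ids for a short sentence
long_ids = list(range(10))      # hypothetical ids for a sentence that is too long

padded, padded_mask = pad_or_truncate(short_ids, max_length=6)
truncated, truncated_mask = pad_or_truncate(long_ids, max_length=6)

print(padded, padded_mask)
print(truncated, truncated_mask)
```

Padding every example to 128 tokens is simple but spends compute on padding for short texts; padding each batch only to its longest example is a common alternative.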


In [None]:
# ==================== 7. CUSTOM TRAINER WITH WEIGHTED LOSS ====================
class WeightedTrainer(Trainer):
    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits

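        # Apply class weights to the loss; move them to the logits' device so this
        # also works when the model is wrapped (e.g. for multi-GPU training)
        loss_fct = torch.nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device))
        loss = loss_fct(logits, labels)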

        return (loss, outputs) if return_outputs else loss

# ==================== 8. DEFINE METRICS ====================
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)

    # Calculate metrics
    accuracy = accuracy_score(labels, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, predictions, average='weighted', zero_division=0
    )

    return {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1': f1
    }
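
To make the effect of the class weights in `WeightedTrainer` concrete, here is a plain-Python sketch of weighted cross-entropy with mean reduction, where the sum of weighted per-example losses is divided by the sum of the weights of the target classes. The function name and the toy logits are illustrative only.

```python
import math

def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted mean of per-example negative log-likelihoods."""
    total_loss, total_weight = 0.0, 0.0
    for row, target in zip(logits, labels):
        # Numerically stable log-sum-exp for the softmax denominator.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        nll = log_sum_exp - row[target]
        total_loss += class_weights[target] * nll
        total_weight += class_weights[target]
    return total_loss / total_weight

# Two examples, both predicted as class 0; the second one actually belongs to class 1.
toy_logits = [[2.0, 0.0], [2.0, 0.0]]
toy_labels = [0, 1]

uniform_loss = weighted_cross_entropy(toy_logits, toy_labels, [1.0, 1.0])
weighted_loss = weighted_cross_entropy(toy_logits, toy_labels, [1.0, 10.0])

print(f"uniform weights:  {uniform_loss:.4f}")
print(f"class 1 upweighted: {weighted_loss:.4f}")
```

With the rare class up-weighted, the same mistake produces a larger loss, which pushes training to pay more attention to under-represented emotions.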

In [None]:
# ==================== 9. TRAINING CONFIGURATION ====================
print("\nSetting up training configuration...")

training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    eval_strategy='epoch',
    save_strategy='epoch',
    load_best_model_at_end=True,
    metric_for_best_model='f1',
    logging_dir='./logs',
    logging_steps=50,
    warmup_steps=100,
    fp16=torch.cuda.is_available(),  # Use mixed precision if GPU available
    report_to='none',  # Disable wandb/tensorboard
    save_total_limit=2,  # Keep at most 2 checkpoints on disk; older ones are deleted
)

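# Initialize trainer with class weights.
# Note: the test split also serves as the evaluation set during training, so
# load_best_model_at_end picks the checkpoint using the same data that the
# final test metrics are reported on.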
trainer = WeightedTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    class_weights=class_weights
)

print("✓ Training configuration set")

# ==================== 10. TRAIN MODEL ====================
print("\n" + "="*50)
print("STARTING TRAINING")
print("="*50 + "\n")

trainer.train()

print("\n✓ Training completed!")


Setting up training configuration...
✓ Training configuration set

STARTING TRAINING



| Epoch | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 1 | 0.877 | 0.871414 | 0.68688 | 0.716259 | 0.68688 | 0.69473 |
| 2 | 0.6754 | 0.84791 | 0.701106 | 0.727055 | 0.701106 | 0.704718 |
| 3 | 0.5606 | 0.85438 | 0.721081 | 0.730406 | 0.721081 | 0.72358 |



✓ Training completed!
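
The training configuration implies a specific number of optimizer steps and a learning-rate curve. The sketch below works these out under stated assumptions: a single device, no gradient accumulation, and a linear scheduler with warmup (the `Trainer` default scheduler type). `linear_schedule_lr` is a hypothetical helper written for illustration, not a library function.

```python
import math

train_samples = 27_833   # from the train/test split output
batch_size = 16          # per_device_train_batch_size
epochs = 3               # num_train_epochs
warmup_steps = 100
base_lr = 2e-5

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

def linear_schedule_lr(step):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
for step in (0, 50, 100, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {linear_schedule_lr(step):.2e}")
```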


In [None]:
# ==================== 11. EVALUATE MODEL ====================
print("\n" + "="*50)
print("EVALUATING MODEL ON TEST SET")
print("="*50 + "\n")

# Get predictions
predictions = trainer.predict(test_dataset)
y_pred = np.argmax(predictions.predictions, axis=1)
y_true = test_df['label'].values

# Calculate detailed metrics
accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average='weighted', zero_division=0
)

print(f"Test Accuracy:  {accuracy:.4f}")
print(f"Test Precision: {precision:.4f}")
print(f"Test Recall:    {recall:.4f}")
print(f"Test F1-Score:  {f1:.4f}")

print("\n" + "-"*50)
print("DETAILED CLASSIFICATION REPORT")
print("-"*50 + "\n")
print(classification_report(
    y_true,
    y_pred,
    target_names=label_encoder.classes_,
    zero_division=0
))

# ==================== 12. SAVE MODEL ====================
print("\nSaving model and tokenizer...")
output_dir = "fine_tuned_emotion_model"
trainer.save_model(output_dir)
tokenizer.save_pretrained(output_dir)

# Save label encoder
import pickle
with open(f"{output_dir}/label_encoder.pkl", "wb") as f:
    pickle.dump(label_encoder, f)

print(f"✓ Model saved to '{output_dir}/'")


EVALUATING MODEL ON TEST SET



Test Accuracy:  0.7211
Test Precision: 0.7304
Test Recall:    0.7211
Test F1-Score:  0.7236

--------------------------------------------------
DETAILED CLASSIFICATION REPORT
--------------------------------------------------

              precision    recall  f1-score   support

       anger       0.68      0.73      0.71       860
     disgust       0.35      0.47      0.40       171
        fear       0.78      0.74      0.76      1082
         joy       0.82      0.72      0.77      2209
     neutral       0.75      0.89      0.82       451
     sadness       0.70      0.70      0.70      1345
       shame       0.74      1.00      0.85        29
    surprise       0.60      0.68      0.63       812

    accuracy                           0.72      6959
   macro avg       0.68      0.74      0.70      6959
weighted avg       0.73      0.72      0.72      6959


Saving model and tokenizer...
✓ Model saved to 'fine_tuned_emotion_model/'
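
The gap between the macro and weighted averages in the classification report comes from class imbalance. As a rough check, the sketch below recomputes both F1 averages from the per-class values in the report. Because the inputs are already rounded to two decimals, the results can differ from the report in the last digit.

```python
# (f1-score, support) per class, copied from the classification report above.
report = {
    "anger": (0.71, 860),
    "disgust": (0.40, 171),
    "fear": (0.76, 1082),
    "joy": (0.77, 2209),
    "neutral": (0.82, 451),
    "sadness": (0.70, 1345),
    "shame": (0.85, 29),
    "surprise": (0.63, 812),
}

# Macro average: every class counts equally, regardless of size.
macro_f1 = sum(f1 for f1, _ in report.values()) / len(report)

# Weighted average: each class contributes in proportion to its support.
total_support = sum(n for _, n in report.values())
weighted_f1 = sum(f1 * n for f1, n in report.values()) / total_support

print(f"macro F1    ~ {macro_f1:.3f}")
print(f"weighted F1 ~ {weighted_f1:.3f}")
```

The macro average is pulled down by `disgust`, which has a low F1 but few examples, whereas the weighted average is dominated by large classes such as `joy` and `sadness`.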


In [None]:
# ==================== 13. INFERENCE EXAMPLE ====================
print("\n" + "="*50)
print("INFERENCE EXAMPLES")
print("="*50 + "\n")

# Load the pipeline
emotion_classifier = pipeline(
    "text-classification",
    model=output_dir,
    tokenizer=output_dir,
    device=0 if torch.cuda.is_available() else -1
)

# Test examples
test_texts = [
    "I'm feeling nervous but kind of excited.",
    "This is the best day of my life!",
    "I'm so angry right now, this is unacceptable!",
    "I feel really sad and alone.",
    "That's absolutely disgusting.",
    "Wow, I didn't expect that at all!",
    "Everything is just okay, nothing special.",
    "I feel so embarrassed about what happened."
]

for text in test_texts:
    result = emotion_classifier(text)[0]
    print(f"Text: {text}")
    print(f"→ Emotion: {result['label']} (confidence: {result['score']:.4f})\n")

print("="*50)
print("✓ FINE-TUNING COMPLETE!")
print("="*50)


INFERENCE EXAMPLES



Device set to use cuda:0


Text: I'm feeling nervous but kind of excited.
→ Emotion: fear (confidence: 0.8338)

Text: This is the best day of my life!
→ Emotion: joy (confidence: 0.8077)

Text: I'm so angry right now, this is unacceptable!
→ Emotion: anger (confidence: 0.9928)

Text: I feel really sad and alone.
→ Emotion: sadness (confidence: 0.9918)

Text: That's absolutely disgusting.
→ Emotion: disgust (confidence: 0.9585)

Text: Wow, I didn't expect that at all!
→ Emotion: surprise (confidence: 0.9768)

Text: Everything is just okay, nothing special.
→ Emotion: surprise (confidence: 0.6909)

Text: I feel so embarrassed about what happened.
→ Emotion: shame (confidence: 0.9983)

✓ FINE-TUNING COMPLETE!
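
The confidence values above are the top softmax probability over the eight classes. The sketch below shows that final step in plain Python with made-up logits; it does not call the model, and the numbers are purely hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

id2label = {0: "anger", 1: "disgust", 2: "fear", 3: "joy",
            4: "neutral", 5: "sadness", 6: "shame", 7: "surprise"}

# Hypothetical logits for one input sentence (not produced by the model).
logits = [0.1, -1.2, 0.3, 0.0, 1.4, -0.5, -2.0, 2.2]
probs = softmax(logits)

# Rank all classes instead of looking only at the top prediction.
ranked = sorted(zip(id2label.values(), probs), key=lambda pair: pair[1], reverse=True)
for label, prob in ranked[:3]:
    print(f"{label:>8}: {prob:.4f}")
```

Looking at the runner-up classes is useful for borderline cases such as the "Everything is just okay" example, where the top label had a comparatively low confidence.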
