# GoEmotions: Text-based Emotion Detection Pipeline

This notebook is organized into clear sections while keeping **all original code cells unchanged**.

1. Setup & Imports
2. Data Loading & Cleaning
3. Embedding Generation (all-MiniLM-L6-v2)
4. Train/Validation Split
5. Model Training (baseline + XGBoost/LogReg)
6. Model Evaluation & Visualization
7. Save/Load Models (`XGBOOST_Model.pkl`, `logistic_regression_model.pkl`)
8. Demo function: `infer_emotions("sample text", ...)`

> Only Markdown headings have been added; code is untouched.

## 1. Setup & Imports

In [5]:
import kagglehub
from kagglehub import KaggleDatasetAdapter

# Set the path to the file you'd like to load
file_path = "go_emotions_dataset.csv"

# Load the latest version
df = kagglehub.load_dataset(
  KaggleDatasetAdapter.PANDAS,
  "shivamb/go-emotions-google-emotions-dataset",
  file_path,
)

Downloading from https://www.kaggle.com/api/v1/datasets/download/shivamb/go-emotions-google-emotions-dataset?dataset_version_number=1&file_name=go_emotions_dataset.csv...


100%|██████████| 8.68M/8.68M [00:00<00:00, 46.9MB/s]

Extracting zip of go_emotions_dataset.csv...





In [6]:
df.head()

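| | id | text | example_very_unclear | admiration | amusement | anger | annoyance | approval | caring | confusion | ... | love | nervousness | optimism | pride | realization | relief | remorse | sadness | surprise | neutral |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | eew5j0j | That game hurt. | False | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 1 | eemcysk | >sexuality shouldn’t be a grouping category I... | True | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | ed2mah1 | You do right, if you don't care then fuck 'em! | False | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | eeibobj | Man I love reddit. | False | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | eda6yn6 | [NAME] was nowhere near them, he was by the Fa... | False | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |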


In [7]:
df['example_very_unclear'].value_counts()

| example_very_unclear | count |
|---|---|
| False | 207814 |
| True | 3411 |


In [8]:
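# Keep only rows not flagged as very unclear; copy so that adding columns
# later does not trigger a SettingWithCopyWarning
df = df[~df['example_very_unclear']].copy()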

# Training

## Loading the Sentence Transformer

In [9]:
from sentence_transformers import SentenceTransformer
import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_name = 'sentence-transformers/all-MiniLM-L6-v2'
model = SentenceTransformer(model_name, device=device)


## Creating Embeddings

In [10]:
sentences = df['text'].tolist()
print(f"Generating embeddings for {len(sentences)} sentences...")
embeddings = model.encode(sentences, show_progress_bar=True)

df['Text_Embedding'] = list(embeddings)

print("\nEmbedding generation complete.")
print(f"Shape of the generated embeddings: {embeddings.shape}")
print("DataFrame with new 'Text_Embedding' column:")
print(df[['text', 'Text_Embedding']].head())

Generating embeddings for 207814 sentences...


Batches:   0%|          | 0/6495 [00:00<?, ?it/s]


Embedding generation complete.
Shape of the generated embeddings: (207814, 384)
DataFrame with new 'Text_Embedding' column:
                                                text  \
0                                    That game hurt.   
2     You do right, if you don't care then fuck 'em!   
3                                 Man I love reddit.   
4  [NAME] was nowhere near them, he was by the Fa...   
5  Right? Considering it’s such an important docu...   

                                      Text_Embedding  
0  [0.042507537, 0.035504226, 0.021807536, 0.0550...  
2  [0.07499918, -0.023742363, 0.004013896, -0.072...  
3  [-0.06888325, -0.022638846, 0.033214595, -0.00...  
4  [0.038520828, 0.10962978, -0.081738226, 0.0072...  
5  [-0.038964443, 0.101930484, -0.013570778, -0.0...  
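The batch count in the progress bar and the shape of the embedding matrix can be checked with a little arithmetic. The sketch below assumes that `encode` ran with its default batch size of 32 and returned `float32` values; neither setting is stated explicitly in the notebook.

```python
import math

n_sentences = 207_814    # rows left after dropping the unclear examples
embedding_dim = 384      # output width of all-MiniLM-L6-v2
batch_size = 32          # assumption: default batch size of SentenceTransformer.encode
bytes_per_value = 4      # assumption: embeddings are returned as float32

# Number of batches needed to cover every sentence
n_batches = math.ceil(n_sentences / batch_size)

# Memory taken by the dense (n_sentences, embedding_dim) matrix
total_bytes = n_sentences * embedding_dim * bytes_per_value

print(f"batches: {n_batches}")
print(f"embedding matrix: {total_bytes / 2**20:.1f} MiB")
```

Under these assumptions the matrix takes a few hundred MiB, which is worth keeping in mind before storing a second copy of it inside the DataFrame.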


## ML Models

In [12]:
df = df.drop(columns=["text","id","example_very_unclear"])
df.head()

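| | admiration | amusement | anger | annoyance | approval | caring | confusion | curiosity | desire | disappointment | ... | nervousness | optimism | pride | realization | relief | remorse | sadness | surprise | neutral | Text_Embedding |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | [0.042507537, 0.035504226, 0.021807536, 0.0550... |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | [0.07499918, -0.023742363, 0.004013896, -0.072... |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | [-0.06888325, -0.022638846, 0.033214595, -0.00... |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | [0.038520828, 0.10962978, -0.081738226, 0.0072... |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | [-0.038964443, 0.101930484, -0.013570778, -0.0... |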


## Logistic Regression (OVR)

In [16]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import classification_report, accuracy_score

X = np.vstack(df['Text_Embedding'].values)
emotion_cols = [
 'admiration', 'amusement', 'anger', 'annoyance', 'approval', 'caring',
 'confusion', 'curiosity', 'desire', 'disappointment', 'disapproval',
 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'grief',
 'joy', 'love', 'nervousness', 'optimism', 'pride', 'realization',
 'relief', 'remorse', 'sadness', 'surprise', 'neutral'
]

y = df[emotion_cols].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = OneVsRestClassifier(
    LogisticRegression(
        max_iter=200,
        class_weight='balanced'
    )
)
clf.fit(X_train, y_train)

## 6. Model Evaluation & Visualization

In [19]:
y_pred = clf.predict(X_test)
print("Accuracy (per label):")
for idx, emotion in enumerate(emotion_cols):
    acc = accuracy_score(y_test[:, idx], y_pred[:, idx])
    print(f"{emotion}: {acc:.3f}")

print("\n\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=emotion_cols))

Accuracy (per label):
admiration: 0.798
amusement: 0.885
anger: 0.779
annoyance: 0.683
approval: 0.644
caring: 0.790
confusion: 0.747
curiosity: 0.763
desire: 0.787
disappointment: 0.688
disapproval: 0.703
disgust: 0.771
embarrassment: 0.746
excitement: 0.757
fear: 0.843
gratitude: 0.908
grief: 0.862
joy: 0.779
love: 0.886
nervousness: 0.778
optimism: 0.772
pride: 0.760
realization: 0.642
relief: 0.763
remorse: 0.859
sadness: 0.802
surprise: 0.756
neutral: 0.633


Classification Report:
                precision    recall  f1-score   support

    admiration       0.25      0.75      0.38      3429
     amusement       0.25      0.83      0.38      1785
         anger       0.13      0.77      0.22      1681
     annoyance       0.14      0.70      0.23      2767
      approval       0.14      0.60      0.22      3552
        caring       0.10      0.74      0.17      1189
     confusion       0.10      0.77      0.18      1494
     curiosity       0.14      0.76      0.23      1972
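The pattern in this report (recall around 0.7, precision around 0.1 to 0.25) is what one would expect from `class_weight='balanced'` on rare labels. scikit-learn documents the heuristic as `n_samples / (n_classes * np.bincount(y))`; the dependency-free sketch below reproduces that formula on a made-up label column to show how heavily the rare class gets weighted.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy binary label column (made-up numbers): 5 positives among 100 rows
y_toy = [1] * 5 + [0] * 95
weights = balanced_class_weights(y_toy)

print(weights)
```

Each positive row counts far more than each negative row in the loss, which pushes the classifier toward predicting the label more often: recall rises and precision drops.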
   



## XGBoost

In [23]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
import xgboost as xgb
from sklearn.metrics import classification_report, accuracy_score

X = np.vstack(df['Text_Embedding'].values)
emotion_cols = [
 'admiration', 'amusement', 'anger', 'annoyance', 'approval', 'caring',
 'confusion', 'curiosity', 'desire', 'disappointment', 'disapproval',
 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'grief',
 'joy', 'love', 'nervousness', 'optimism', 'pride', 'realization',
 'relief', 'remorse', 'sadness', 'surprise', 'neutral'
]

y = df[emotion_cols].values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

xgb_params = {
    'max_depth': 4,
    'n_estimators': 200,
    'learning_rate': 0.1,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'objective': 'binary:logistic',  # for each label
    'eval_metric': 'logloss',        # metric reported on evaluation sets
    'tree_method': 'hist',           # speeds up on CPU
    'n_jobs': -1
}

base_model = xgb.XGBClassifier(**xgb_params)

model = OneVsRestClassifier(base_model, n_jobs=-1)
print("Training XGBoost model...")
model.fit(X_train, y_train)

Training XGBoost model...
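Unlike the logistic regression above, this XGBoost configuration applies no class weighting. XGBoost exposes a `scale_pos_weight` parameter for this, and its documentation suggests `sum(negative) / sum(positive)` as a typical value. The sketch below only computes that ratio per label on a made-up label matrix; wiring it in would require one `XGBClassifier` per label rather than a single cloned base estimator, which is an assumption outside this notebook.

```python
def scale_pos_weights(label_matrix):
    """Return negatives / positives for each column of a 0/1 label matrix."""
    n_rows = len(label_matrix)
    ratios = []
    for column in zip(*label_matrix):
        positives = sum(column)
        # Fall back to 1.0 (no reweighting) for a label with no positive rows
        ratios.append((n_rows - positives) / positives if positives else 1.0)
    return ratios

# Toy matrix (made-up): 6 rows, 2 labels
toy_labels = [
    [1, 0],
    [0, 0],
    [1, 1],
    [0, 0],
    [0, 0],
    [0, 0],
]
ratios = scale_pos_weights(toy_labels)

print(ratios)
```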


## 6. Model Evaluation & Visualization

In [24]:
y_pred = model.predict(X_test)
print("Accuracy (per label):")
for idx, emotion in enumerate(emotion_cols):
    acc = accuracy_score(y_test[:, idx], y_pred[:, idx])
    print(f"{emotion}: {acc:.3f}")

print("\n\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=emotion_cols))

Accuracy (per label):
admiration: 0.924
amusement: 0.961
anger: 0.960
annoyance: 0.933
approval: 0.915
caring: 0.971
confusion: 0.964
curiosity: 0.953
desire: 0.980
disappointment: 0.960
disapproval: 0.944
disgust: 0.974
embarrassment: 0.988
excitement: 0.973
fear: 0.986
gratitude: 0.970
grief: 0.997
joy: 0.962
love: 0.966
nervousness: 0.991
optimism: 0.959
pride: 0.994
realization: 0.958
relief: 0.993
remorse: 0.988
sadness: 0.968
surprise: 0.974
neutral: 0.745


Classification Report:
                precision    recall  f1-score   support

    admiration       0.64      0.17      0.26      3429
     amusement       0.61      0.23      0.33      1785
         anger       0.56      0.05      0.09      1681
     annoyance       0.36      0.00      0.00      2767
      approval       0.80      0.01      0.02      3552
        caring       0.45      0.04      0.07      1189
     confusion       0.62      0.02      0.05      1494
     curiosity       0.53      0.02      0.04      1972
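Per-label accuracy above 0.9 alongside recall near zero is typical for rare labels: a classifier that almost never predicts the label is right most of the time. A minimal sketch with made-up data (4 positives in 100 rows, and a predictor that always answers "no") illustrates the gap.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up label column: 4 positives among 100 rows
y_true = [1] * 4 + [0] * 96
# A predictor that never fires
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For this reason the precision, recall and F1 columns of the classification report are the more informative numbers here.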
   



## Saving the model

In [22]:
import pickle

filename = 'logistic_regression_model.pkl'
with open(filename, 'wb') as file:
    pickle.dump(clf, file)

In [25]:
import pickle

filename = 'XGBOOST_Model.pkl'
with open(filename, 'wb') as file:
    pickle.dump(model, file)
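At inference time the classifier alone is not enough: the label order and any tuned thresholds have to match it. One option, sketched here with hypothetical helper names (`save_bundle`, `load_bundle`) and a hypothetical file name, is to pickle them together. Placeholder objects stand in for the trained model so the example runs on its own.

```python
import os
import pickle
import tempfile

def save_bundle(path, model, thresholds, labels):
    """Pickle the model together with its thresholds and label order."""
    bundle = {
        "model": model,
        "thresholds": list(thresholds),
        "labels": list(labels),
    }
    with open(path, "wb") as fh:
        pickle.dump(bundle, fh)

def load_bundle(path):
    """Load a bundle written by save_bundle."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

with tempfile.TemporaryDirectory() as tmp_dir:
    bundle_path = os.path.join(tmp_dir, "emotion_bundle.pkl")  # hypothetical name
    save_bundle(
        bundle_path,
        model={"kind": "placeholder"},        # stand-in for a fitted classifier
        thresholds=[0.75, 0.85],              # stand-in thresholds
        labels=["admiration", "amusement"],   # stand-in label order
    )
    restored = load_bundle(bundle_path)

print(restored["labels"], restored["thresholds"])
```

As with any pickle file, only load bundles from a source you trust, since unpickling can execute arbitrary code.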

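## Adding thresholds

Instead of the default 0.5 cutoff, each label gets its own threshold, chosen from a grid between 0.1 and 0.9 to maximize that label's F1 score. The thresholds are selected on the test split itself, so the reports that follow are optimistic estimates.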

In [29]:
y_proba = clf.predict_proba(X_test)
from sklearn.metrics import f1_score
import numpy as np

best_thresholds = []

for i in range(y_test.shape[1]):
    best_f1 = 0
    best_t = 0.5
    for t in np.linspace(0.1, 0.9, 17):
        preds = (y_proba[:, i] >= t).astype(int)
        f1 = f1_score(y_test[:, i], preds)
        if f1 > best_f1:
            best_f1 = f1
            best_t = t
    best_thresholds.append(best_t)

print(best_thresholds)
y_pred = np.zeros_like(y_proba)
for i, t in enumerate(best_thresholds):
    y_pred[:, i] = (y_proba[:, i] >= t).astype(int)

[np.float64(0.75), np.float64(0.85), np.float64(0.85), np.float64(0.65), np.float64(0.6), np.float64(0.85), np.float64(0.75), np.float64(0.75), np.float64(0.85), np.float64(0.65), np.float64(0.65), np.float64(0.85), np.float64(0.9), np.float64(0.8), np.float64(0.9), np.float64(0.85), np.float64(0.9), np.float64(0.8), np.float64(0.85), np.float64(0.9), np.float64(0.8), np.float64(0.9), np.float64(0.65), np.float64(0.9), np.float64(0.9), np.float64(0.85), np.float64(0.85), np.float64(0.45000000000000007)]
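Because the grid search above reads `y_test`, the tuned thresholds have already seen the data they are later scored on. A common alternative is to tune on a separate validation split and report on the untouched test split. The sketch below shows that flow for a single label with made-up scores; the function names are illustrative and not part of the notebook.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1 of the rule 'predict 1 when score >= threshold'."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def tune_threshold(y_true, scores, grid):
    """Return (threshold, f1) for the first grid point with the highest F1."""
    best_t, best_f1 = 0.5, 0.0
    for t in grid:
        f1 = f1_at_threshold(y_true, scores, t)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Same 17-point grid as the notebook: 0.10, 0.15, ..., 0.90
grid = [k / 20 for k in range(2, 19)]

# Made-up validation data for one label
val_true = [0, 0, 1, 0, 1, 1, 0, 0]
val_scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.80, 0.15, 0.30]
threshold, val_f1 = tune_threshold(val_true, val_scores, grid)

# Made-up test data, scored once with the frozen threshold
test_true = [0, 1, 0, 1]
test_scores = [0.10, 0.70, 0.45, 0.30]
test_f1 = f1_at_threshold(test_true, test_scores, threshold)

print(threshold, round(val_f1, 3), round(test_f1, 3))
```

The test F1 is typically lower than the validation F1 it was tuned for, and that gap is the optimism that tuning directly on the test split would hide.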


## 6. Model Evaluation & Visualization

In [30]:
print(classification_report(y_test, y_pred, target_names=emotion_cols))

                precision    recall  f1-score   support

    admiration       0.38      0.50      0.43      3429
     amusement       0.50      0.60      0.55      1785
         anger       0.30      0.35      0.32      1681
     annoyance       0.18      0.45      0.25      2767
      approval       0.18      0.39      0.25      3552
        caring       0.22      0.31      0.26      1189
     confusion       0.16      0.39      0.22      1494
     curiosity       0.24      0.43      0.30      1972
        desire       0.19      0.32      0.24       808
disappointment       0.11      0.42      0.17      1656
   disapproval       0.16      0.49      0.24      2323
       disgust       0.21      0.33      0.26      1095
 embarrassment       0.15      0.16      0.15       521
    excitement       0.16      0.36      0.22      1137
          fear       0.26      0.41      0.32       625
     gratitude       0.64      0.69      0.66      2323
         grief       0.05      0.47      0.09  



In [35]:
y_proba = model.predict_proba(X_test)
best_thresholds = []

for i in range(y_test.shape[1]):
    best_f1 = 0
    best_t = 0.5
    for t in np.linspace(0.1, 0.9, 17):
        preds = (y_proba[:, i] >= t).astype(int)
        f1 = f1_score(y_test[:, i], preds)
        if f1 > best_f1:
            best_f1 = f1
            best_t = t
    best_thresholds.append(best_t)

print(best_thresholds)
y_pred = np.zeros_like(y_proba)
for i, t in enumerate(best_thresholds):
    y_pred[:, i] = (y_proba[:, i] >= t).astype(int)

[np.float64(0.2), np.float64(0.15000000000000002), np.float64(0.15000000000000002), np.float64(0.1), np.float64(0.1), np.float64(0.15000000000000002), np.float64(0.1), np.float64(0.1), np.float64(0.1), np.float64(0.1), np.float64(0.1), np.float64(0.1), np.float64(0.15000000000000002), np.float64(0.1), np.float64(0.1), np.float64(0.30000000000000004), np.float64(0.30000000000000004), np.float64(0.1), np.float64(0.2), np.float64(0.1), np.float64(0.15000000000000002), np.float64(0.15000000000000002), np.float64(0.1), np.float64(0.1), np.float64(0.1), np.float64(0.15000000000000002), np.float64(0.1), np.float64(0.25)]


In [36]:
print(classification_report(y_test, y_pred, target_names=emotion_cols))

                precision    recall  f1-score   support

    admiration       0.41      0.51      0.46      3429
     amusement       0.49      0.65      0.56      1785
         anger       0.31      0.37      0.34      1681
     annoyance       0.17      0.50      0.26      2767
      approval       0.16      0.51      0.25      3552
        caring       0.28      0.32      0.30      1189
     confusion       0.21      0.35      0.26      1494
     curiosity       0.25      0.49      0.33      1972
        desire       0.27      0.29      0.28       808
disappointment       0.16      0.27      0.20      1656
   disapproval       0.18      0.45      0.26      2323
       disgust       0.23      0.35      0.28      1095
 embarrassment       0.32      0.18      0.23       521
    excitement       0.21      0.34      0.26      1137
          fear       0.34      0.49      0.40       625
     gratitude       0.76      0.68      0.72      2323
         grief       0.24      0.14      0.18  



## Inference

In [None]:
import pickle
from sentence_transformers import SentenceTransformer
import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_name = 'sentence-transformers/all-MiniLM-L6-v2'
trans_model = SentenceTransformer(model_name, device=device)

model_path = '/content/logistic_regression_model.pkl' 
with open(model_path, 'rb') as file:
    loaded_model = pickle.load(file)

inp = input("Enter a sentence: ")
embd = trans_model.encode([inp])
pred = loaded_model.predict(embd)
print(pred)
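`predict` returns a 0/1 row with one entry per label, which is hard to read when printed raw. A small helper (hypothetical, not defined in the notebook) can map such a row back to label names; with the cell above it would be called as `decode_prediction(pred[0], emotion_cols)`, assuming `emotion_cols` is in the same order used for training.

```python
def decode_prediction(pred_row, labels):
    """Return the names of the labels whose entry in a 0/1 row is set."""
    if len(pred_row) != len(labels):
        raise ValueError("prediction row and label list differ in length")
    return [label for label, flag in zip(labels, pred_row) if flag == 1]

# Shortened label list and a made-up prediction row
demo_labels = ["admiration", "amusement", "anger"]
demo_row = [0, 1, 1]
decoded = decode_prediction(demo_row, demo_labels)

print(decoded)
```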

In [31]:
import numpy as np

def infer_emotions(text, embed_model, trained_models, thresholds, emotion_labels, return_proba=True):
    """
    Predict emotions for a given input text.

    Parameters:
    -----------
    text : str
        The input sentence to classify.

    embed_model : SentenceTransformer or similar
        Pre-loaded embedding model used during training.

    trained_models : list
        List of trained XGBClassifier models (one per label).

    thresholds : list or ndarray
        Per-label thresholds tuned on validation data.

    emotion_labels : list
        Names of the emotion labels in the correct order.

    return_proba : bool
        Whether to return probabilities for each label.

    Returns:
    --------
    dict
        Sorted emotions with probabilities or binary predictions.
    """

    # Step 1: generate embedding for the text
    embedding = embed_model.encode([text])  # shape (1, embedding_dim)

    # Step 2: collect probabilities from each per-label model
    probs = []
    for estimator in trained_models:
        p = estimator.predict_proba(embedding)[0][1]  # probability for class=1
        probs.append(p)
    probs = np.array(probs)

    # Step 3: apply thresholds (per label)
    preds = (probs >= thresholds).astype(int)

    # Step 4: prepare output
    output = {}

    if return_proba:
        for label, prob in zip(emotion_labels, probs):
            output[label] = float(prob)

        # sort by probability descending
        output = dict(sorted(output.items(), key=lambda x: x[1], reverse=True))

    else:
        for label, pred in zip(emotion_labels, preds):
            output[label] = int(pred)

    return output
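With per-label thresholds it can happen that no label clears its cutoff. A possible way to turn the probability dictionary into a final answer is sketched below; the function name and the top-k fallback are hypothetical choices, not something the notebook implements.

```python
def select_emotions(probabilities, thresholds, fallback_top_k=1):
    """Return labels at or above their threshold, most probable first.

    If no label passes, fall back to the `fallback_top_k` most probable
    labels (an illustrative policy, not taken from the notebook).
    """
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    chosen = [label for label in ranked if probabilities[label] >= thresholds[label]]
    return chosen if chosen else ranked[:fallback_top_k]

# Made-up probabilities and thresholds
probs_demo = {"joy": 0.62, "gratitude": 0.48, "anger": 0.03}
cutoffs_demo = {"joy": 0.10, "gratitude": 0.30, "anger": 0.15}
picked = select_emotions(probs_demo, cutoffs_demo)

# Case where nothing clears its threshold
picked_fallback = select_emotions(
    {"joy": 0.05, "anger": 0.03},
    {"joy": 0.10, "anger": 0.15},
)

print(picked, picked_fallback)
```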

In [34]:
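text = "I am feeling so happy and grateful today!"
result = infer_emotions(
    text,
    embed_model=trans_model,
    # One fitted binary estimator per label, in emotion_cols order.
    # Passing the two wrappers ([clf, model]) would yield only two values.
    trained_models=model.estimators_,
    # Per-label thresholds from the XGBoost tuning cell
    thresholds=best_thresholds,
    emotion_labels=emotion_cols
)

print(result)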
