# SOTA INVESTIGATION FOR SENTIMENT ANALYSIS

This is an investigation of different models for sentiment-analysis text classification.

The models used will be:
- Phi 3.5 mini (3.8B parameters, Q4_K_M quantization), zero-shot and few-shot
- Llama 3.2 1B Instruct (Q6_K_L quantization), zero-shot and few-shot
- RoBERTa

Some state-of-the-art LLMs will also be tested, but only on a subset of the dataset:
- DeepSeek R1
- Gemini 2.5 Pro
- ChatGPT (GPT-4o)

These last models will be tested through their web chat interfaces, because resources for API use are limited.

## Load resources

In [None]:
import os

from emotion_classifier import (DatasetHandler, EmotionClassifier, LlamaCppStrategy,
                                PromptTemplate, TransformersStrategy)

In [2]:
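# Read "service:token" pairs and keep the Hugging Face access token.
access_token = None
with open(os.path.join("secrets", "credentials.txt"), "r") as file:
    for line in file:
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)  # split once, in case the token contains ":"
        if key.strip() == "huggingface":
            access_token = value.strip()  # drop the trailing newline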

## Execute experiments

In [None]:
available_datasets = list(DatasetHandler.DATASET_CONFIGS.keys())
print(f"The available datasets are: {available_datasets}")

In [None]:
print("Available templates:")
PromptTemplate.TEMPLATES

### GO_EMOTIONS

In [None]:
dataset = DatasetHandler("go_emotions", split="test", sample=0.1, seed=28)
print(len(dataset.data))

In [None]:
dataset.label2id

Preparing prompts for LLM models:

In [7]:
system_prompt = "You're a helpful and obedient assistant that responds to questions according to the instructions given."

In [8]:
zeroshot_user_pre_prompt = f"""
Classify text into ALL applicable emotion labels from this list: {", ".join(dataset.labels)}, replying only with the final comma-separated predominant emotion labels detected. The text is: 
"""

In [9]:
fewshot_user_pre_prompt = f"""
Classify text into ALL applicable emotion labels from this list: {", ".join(dataset.labels)}, based on the following examples:

Text: [NAME], loved the story. Personally I don’t think this was petty I think it was quite fair and right. Good job dude.
Emotions: admiration, love

Text: Oh that's right they just came back from the vacation. This is a vacation business trip. What a fuckin' joke.
Emotions: amusement

Text: Players like this makes me rage quit tbh.
Emotions: anger, annoyance

Text: If you had a giant rock on your land, and you couldn't stop people climbing, wouldn't you be pissed?
Emotions: annoyance, confusion, curiosity

Text: This very thing got me diagnosed as ADD. I'm of the drugs now and feeling way much better. 
Emotions: approval, relief

Text: Get rid. This is a toxic relationship that is making you unhappy. You are making all efforts for nothing in return.
Emotions: caring, realization

Text: Isn't it the same thing? Or at least equally bad?
Emotions: confusion

Text: It's great that you're a recovering addict, that's cool. Have you ever tried DMT?
Emotions: curiosity, admiration

Text: I know. And it's very selfish of me to say, but I just wish it was different people that cared.
Emotions: desire

Text: It's really very upsetting to have parents who dismiss this condition.
Emotions: disappointment

Text: We got nowhere with that because you only drop one-liners and refuse to actually engage.
Emotions: disapproval

Text: Apparently, he can't just have sex with anyone because he needed to rape someone.
Emotions: disgust

Text: I was teased for being a virgin when I was a 6th grader- in 2005
Emotions: embarrassment

Text: I’d love to get an update on what happened! (When you are ready)
Emotions: excitement, love

Text: I've also heard that intriguing but also kinda scary
Emotions: fear

Text: I didn't know that, thank you for teaching me something today!
Emotions: gratitude

Text: [NAME] death is just so..... senseless. Why? WHY??? The based gods have forsaken us
Emotions: grief

Text: A surprise turn of events! I'm so glad you heard such great things about yourself!
Emotions: joy, surprise

Text: I need to just eliminate anxiety and not have cravings
Emotions: nervousness

Text: I'm going to hold out hope for something minor even though it looked really bad. Just going to wait for the official news.
Emotions: optimism

Text: Of course I love myself because I'm awesome.
Emotions: pride

Text: Oh yeah, I forgot about the mean one.
Emotions: realization

Text: Resetting a dislocated knee hurts like hell but it feels a lot better immediately after.
Emotions: relief

Text: I sincerely apologize...I’ll delete and please take this upvote
Emotions: remorse

Text: It's hard to make friends. :( I sit alone.
Emotions: sadness

Text: I can’t believe that’s real
Emotions: surprise

Text: I'd say it's pretty uncommon now.
Emotions: neutral

Reply only with the final comma-separated predominant emotion labels detected. The text to classify is: 
"""

RoBERTa Fine-tuned with go_emotions

In [None]:
# Transformers model evaluation
model = TransformersStrategy("SamLowe/roberta-base-go_emotions")
classifier = EmotionClassifier(dataset, model)

classifier.evaluate()
_ = classifier.create_performance_report(print_report=True)
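
What `TransformersStrategy` does internally is not shown here. As a rough illustration of multi-label prediction with this kind of model, the sketch below requests a score for every label through the Hugging Face `pipeline` and keeps those above a threshold. The 0.5 threshold, the fallback to the best label, and the helper names `select_labels` and `classify` are assumptions made for this example.

```python
def select_labels(scores, threshold=0.5):
    """Pick labels from one text's scores: a list of {"label": str, "score": float} dicts."""
    chosen = [item["label"] for item in scores if item["score"] >= threshold]
    if not chosen:
        # Assumption: when nothing clears the threshold, keep the single best label.
        chosen = [max(scores, key=lambda item: item["score"])["label"]]
    return chosen


def classify(texts, model_name="SamLowe/roberta-base-go_emotions", threshold=0.5):
    """Score a list of texts; the model is downloaded on first use."""
    from transformers import pipeline  # imported lazily so select_labels works without it

    classifier = pipeline("text-classification", model=model_name, top_k=None)
    # With top_k=None the pipeline returns one list of label scores per input text.
    return [select_labels(scores, threshold) for scores in classifier(texts)]


example_scores = [
    {"label": "joy", "score": 0.9},
    {"label": "love", "score": 0.6},
    {"label": "anger", "score": 0.1},
]
print(select_labels(example_scores))  # ['joy', 'love']
```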

Phi 3.5 mini quantized - ZERO SHOT

In [None]:
# Llama.cpp model evaluation
prompt_template = PromptTemplate( system_prompt, zeroshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Phi-3.5-mini-instruct-Q4_K_M-zeroshot",
    model_path = "models/Phi-3.5-mini-instruct-Q4_K_M.gguf", 
    n_ctx = 512,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="zeroshot")

Phi 3.5 mini quantized - FEW SHOT

In [None]:
# Llama.cpp model evaluation
prompt_template = PromptTemplate( system_prompt, fewshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Phi-3.5-mini-instruct-Q4_K_M-few_shot",
    model_path = "models/Phi-3.5-mini-instruct-Q4_K_M.gguf", 
    n_ctx = 1500,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="fewshot")
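
The few-shot runs use a larger `n_ctx` than the zero-shot runs because the examples make the prompt much longer. One way to check that a prompt fits is to count its tokens and leave room for the reply. This is a sketch: `required_context` and `fits_context` are hypothetical helpers, the 32-token reply budget is an assumption, and the whitespace tokenizer is only a stand-in that undercounts real subword tokens.

```python
from typing import Callable, Sequence


def required_context(prompt: str, tokenize: Callable[[str], Sequence], max_new_tokens: int = 32) -> int:
    """Tokens needed for the prompt plus the room reserved for the reply."""
    return len(tokenize(prompt)) + max_new_tokens


def fits_context(prompt: str, tokenize: Callable[[str], Sequence], n_ctx: int, max_new_tokens: int = 32) -> bool:
    """True when the prompt and the reserved reply tokens fit in the context window."""
    return required_context(prompt, tokenize, max_new_tokens) <= n_ctx


# Stand-in tokenizer for illustration only. With llama-cpp-python one could pass
# lambda text: llm.tokenize(text.encode("utf-8")), where llm is a loaded Llama object.
approx_tokenize = str.split

short_prompt = "Classify this text: I love it"
long_prompt = "Text: example sentence\nEmotions: joy\n\n" * 200

print(fits_context(short_prompt, approx_tokenize, n_ctx=512))  # True
print(fits_context(long_prompt, approx_tokenize, n_ctx=512))   # False
```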

Llama-3.2-1B-Instruct quantized zero-shot

In [None]:
prompt_template = PromptTemplate( system_prompt, zeroshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Llama-3.2-1B-Instruct-Q6_K_L.gguf",
    model_path = "models/Llama-3.2-1B-Instruct-Q6_K_L.gguf", 
    n_ctx = 512,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="zeroshot")

Llama-3.2-1B-Instruct quantized few-shot

In [None]:
prompt_template = PromptTemplate( system_prompt, fewshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Llama-3.2-1B-Instruct-Q6_K_L.gguf",
    model_path = "models/Llama-3.2-1B-Instruct-Q6_K_L.gguf", 
    n_ctx = 1024,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="fewshot")

DeepSeek R1 Distill Qwen 1.5B quantized zero-shot

In [11]:
# The DeepSeek R1 Distill Qwen 1.5B usage notes advise against a system prompt
# (it would only add noise), so every instruction goes in the user prompt.
system_prompt = ""
user_pre_prompt = f"""
You are an assistant that classifies text into ALL applicable emotions from: {", ".join(dataset.labels)}. 
From this text, select the predominant emotions and respond with the final comma-separated predominant emotion names only:
"""

In [None]:
prompt_template = PromptTemplate( system_prompt, user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "DeepSeek-R1-Distill-Qwen-1.5B-Q8_0-zero_shot",
    model_path = "models/DeepSeek-R1-Distill-Qwen-1.5B-Q8_0.gguf",
    prompt_template = prompt_template
)

Gemma 2 2B quantized zero-shot

In [12]:
system_prompt = f"""
You are an assistant that classifies text into ALL applicable emotions from: {", ".join(dataset.labels)}. 
"""
user_pre_prompt = f"""
From this text, select the predominant emotions and respond with the final comma-separated predominant emotion names only:
"""

In [None]:
prompt_template = PromptTemplate( system_prompt, user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "gemma2-2b-zero_shot",
    model_path = "models/dolphin-2.9.4-gemma2-2b-Q6_K_L.gguf",
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True)

### Dair-ai emotion

In [None]:
# Keep the sample size close to the one used for GO_EMOTIONS
dataset = DatasetHandler("dair_emotion", split="test", sample=0.3, seed=50)
print(len(dataset.data))

In [None]:
dataset.data.to_pandas()["label"].value_counts()
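
The label counts show how evenly the classes are represented in the sample. If they turn out to be uneven, accuracy alone can look good while rare classes are missed, so a macro-averaged F1 is worth reading next to it. The contents of the performance report are not shown in this notebook; the sketch below only illustrates the metric with made-up labels, and `macro_f1` is a hypothetical helper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (single-label classification)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data: a model that always answers "joy" on an imbalanced sample.
y_true = ["joy", "joy", "joy", "sadness"]
y_pred = ["joy", "joy", "joy", "joy"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                            # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.4286
```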

RoBERTa Fine-tuned with dair-ai/emotion

In [None]:
# Transformers model evaluation
model = TransformersStrategy("bhadresh-savani/roberta-base-emotion")
classifier = EmotionClassifier(dataset, model)

classifier.evaluate()
_ = classifier.create_performance_report(print_report=True)

Preparing prompts for LLM models

In [18]:
system_prompt = "You're a helpful and obedient assistant that responds to questions according to the instructions given."

In [19]:
zeroshot_user_pre_prompt = f"""
Classify text into only ONE emotion label from this list: {", ".join(dataset.labels)}, replying only with the single emotion label detected. The text is: 
"""

In [20]:
fewshot_user_pre_prompt = f"""
Classify text into only ONE emotion label from this list: {", ".join(dataset.labels)}, based on the following examples:

Text: i feel like damaged goods no one will want me now 
Emotion: sadness

Text: im feeling extremely fabulous with my jacket and shoes aint no bitches gonna bring me down hahah
Emotion: joy

Text: ive been feeling from my adoring fans that would be teh whole like of you who are my friends here i felt brave and excited and ventrured forth with guitar in hand to a local open mic night
Emotion: love

Text: i feel so frustrated because i had a long weekday and i dont really have plenty of rest and right now he keeps on coming in the room
Emotion: anger

Text: i am feeling uncertain of the merits of posting to this blog with the frequency or earnestness i had been over the previous year
Emotion: fear

Text: i feel as though i am on another adventure and i am more curious about it than anything else
Emotion: surprise

The text to classify is: 
"""

Llama-3.2-1B-Instruct quantized - ZERO SHOT

In [None]:
prompt_template = PromptTemplate( system_prompt, zeroshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Llama-3.2-1B-Instruct-Q6_K_L.gguf",
    model_path = "models/Llama-3.2-1B-Instruct-Q6_K_L.gguf", 
    n_ctx = 256,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="zeroshot")

Llama-3.2-1B-Instruct quantized - FEW SHOT

In [None]:
prompt_template = PromptTemplate( system_prompt, fewshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Llama-3.2-1B-Instruct-Q6_K_L.gguf",
    model_path = "models/Llama-3.2-1B-Instruct-Q6_K_L.gguf", 
    n_ctx = 512,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="fewshot")

Phi 3.5 mini quantized - ZERO SHOT

In [None]:
# Llama.cpp model evaluation
prompt_template = PromptTemplate( system_prompt, zeroshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Phi-3.5-mini-instruct-Q4_K_M-zeroshot",
    model_path = "models/Phi-3.5-mini-instruct-Q4_K_M.gguf", 
    n_ctx = 256,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="zeroshot")

Phi 3.5 mini quantized - FEW SHOT

In [None]:
# Llama.cpp model evaluation
prompt_template = PromptTemplate( system_prompt, fewshot_user_pre_prompt )
model = LlamaCppStrategy(
    model_name = "Phi-3.5-mini-instruct-Q4_K_M-few_shot",
    model_path = "models/Phi-3.5-mini-instruct-Q4_K_M.gguf", 
    n_ctx = 512,
    prompt_template = prompt_template
)
classifier = EmotionClassifier(dataset, model)
classifier.evaluate(verbose=True)
report = classifier.create_performance_report(print_report=True, exp_name="fewshot")

## Print results

In [None]:
# The report is produced by the classifier (as in the cells above), not by the model strategy
model_report = classifier.create_performance_report(print_report=True)

## Inference and model testing

In [None]:
sentiment_template = f"Respond to this message only replying with the single emotion label you can detect from these emotion labels: {model.dataset_labels} for the following piece of text: "
model.run_inference(sentiment_template + "My mom said she wants me to go and never return back again")