# Natural Language Processing - Assignment 2
## Sentiment Analysis (Practical Component)

*   This notebook contains the code for Parts (b)-(d) of the assignment.

*   The model used in this notebook is: '**utahnlp/sst2_t5-small_seed-1**'

* All written explanations and final answers appear in the accompanying PDF submission.

In [7]:
!pip install transformers
!pip install datasets

from collections import defaultdict, Counter
import json

from matplotlib import pyplot as plt
import numpy as np
import torch

def print_encoding(model_inputs, indent=4):
    indent_str = " " * indent
    print("{")
    for k, v in model_inputs.items():
        print(indent_str + k + ":")
        print(indent_str + indent_str + str(v))
    print("}")



### (b) Loading a T5-Small Model Fine-Tuned on SST-2
In this section, we load the Hugging Face model utahnlp/sst2_t5-small_seed-1, a T5-Small model fine-tuned on the SST-2 sentiment analysis dataset.
We initialize the tokenizer and model, and prepare them for use in downstream sentiment prediction tasks.

In [52]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "utahnlp/sst2_t5-small_seed-3"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

generation_config.json:   0%|          | 0.00/142 [00:00<?, ?B/s]

### (c) Sentiment Predictions on Example Sentences

In this section, we use the fine-tuned T5-Small SST-2 model to generate sentiment predictions for the four provided sentences.

In [69]:
import torch

def predict_label(text: str) -> int:
    """
    Returns 1 for positive, 0 for negative.
    """
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=3)

    prediction_text = tokenizer.decode(outputs[0], skip_special_tokens=True).strip().lower()
    # print("Raw prediction:", repr(prediction_text))

    if "positive" in prediction_text:
        return 1
    elif "negative" in prediction_text:
        return 0



Sentence 1:  ”This movie is awesome”

In [60]:
s = "This movie is awesome"
prediction1 = predict_label(s)


Raw prediction: 'positive'


Sentence 2: "I didn't like the movie so much"

In [61]:
s = "I didn't like the movie so much"
prediction1 = predict_label(s)

Raw prediction: 'negative'


Sentence 3: "I'm not sure what I think about this movie."

In [62]:
s = "I'm not sure what I think about this movie."
prediction1 = predict_label(s)

Raw prediction: 'negative'


Sentece 4: "Did you like the movie?"

In [63]:
s = "Did you like the movie?"
prediction1 = predict_label(s)

Raw prediction: 'positive'


### (d) Evaluating Accuracy on the SST-2 Dataset

In this section, we load the SST-2 dataset from the GLUE benchmark and evaluate the accuracy
of the fine-tuned T5-Small model (`utahnlp/sst2_t5-small_seed-1`) on the validation split.


In [19]:
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")
sst2

README.md: 0.00B [00:00, ?B/s]

sst2/train-00000-of-00001.parquet:   0%|          | 0.00/3.11M [00:00<?, ?B/s]

sst2/validation-00000-of-00001.parquet:   0%|          | 0.00/72.8k [00:00<?, ?B/s]

sst2/test-00000-of-00001.parquet:   0%|          | 0.00/148k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/67349 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/872 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/1821 [00:00<?, ? examples/s]

DatasetDict({
    train: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 67349
    })
    validation: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 872
    })
    test: Dataset({
        features: ['sentence', 'label', 'idx'],
        num_rows: 1821
    })
})

In [70]:
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")

correct = 0
total = 0

for example in sst2["validation"]:
    text = example["sentence"]
    true_label = example["label"]
    pred_label = predict_label(text)

    if pred_label == true_label:
        correct += 1
    total += 1

accuracy = correct / total
accuracy


0.9048165137614679

### Checking Class Balance in SST-2

We compute how many positive and negative samples exist in the SST-2 validation set.


In [72]:

# Count labels: 0 = negative, 1 = positive
label_counts = Counter(example["label"] for example in sst2["validation"])

label_counts, {k: v / sum(label_counts.values()) for k, v in label_counts.items()}


(Counter({1: 444, 0: 428}), {1: 0.5091743119266054, 0: 0.4908256880733945})