# Question 5: Sentiment Analysis with T5

This notebook implements sentiment analysis using a T5-Small model fine-tuned on SST-2.

## Part (b): Model Selection

We use the following model from the Hugging Face Hub:

**Model:** `lightsout19/t5-sst2`

**Description:**
- Base Architecture: T5 (Text-to-Text Transfer Transformer)
- Fine-tuning Dataset: SST-2 (Stanford Sentiment Treebank-2)
- Task: Binary sentiment classification
- Model Page: https://huggingface.co/lightsout19/t5-sst2

### Install and Import Dependencies

In [1]:
!pip install -U transformers torch

Defaulting to user installation because normal site-packages is not writeable
Collecting transformers
  Downloading transformers-4.57.3-py3-none-any.whl.metadata (43 kB)
Collecting torch
  Downloading torch-2.8.0-cp39-none-macosx_11_0_arm64.whl.metadata (30 kB)
Collecting filelock (from transformers)
  Downloading filelock-3.19.1-py3-none-any.whl.metadata (2.1 kB)
Collecting huggingface-hub<1.0,>=0.34.0 (from transformers)
  Downloading huggingface_hub-0.36.0-py3-none-any.whl.metadata (14 kB)
Collecting regex!=2019.12.17 (from transformers)
  Downloading regex-2025.11.3-cp39-cp39-macosx_11_0_arm64.whl.metadata (40 kB)
Collecting tokenizers<=0.23.0,>=0.22.0 (from transformers)
  Downloading tokenizers-0.22.1-cp39-abi3-macosx_11_0_arm64.whl.metadata (6.8 kB)
Collecting safetensors>=0.4.3 (from transformers)
  Downloading safetensors-0.7.0-cp38-abi3-macosx_11_0_arm64.whl.metadata (4.1 kB)
Collecting tqdm>=4.27 (from transformers)
  Downloading tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)

In [2]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch



### Load Model and Tokenizer

In [3]:
# Load the T5-SST2 model and tokenizer
model_name = "lightsout19/t5-sst2"
print(f"Loading model: {model_name}")

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

print("Model loaded successfully!")
print(f"Number of labels: {model.config.num_labels}")

Loading model: lightsout19/t5-sst2


Model loaded successfully!
Number of labels: 2
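
With two labels reported, it is worth checking how class indices map to names before hard-coding a mapping. The sketch below is a minimal illustration under stated assumptions: `resolve_label` is a hypothetical helper, and the dictionaries passed to it stand in for `model.config.id2label` (assumed to have integer keys), whose actual contents for this checkpoint are not shown in this notebook. It falls back to the SST-2 convention (0 = negative, 1 = positive) when the configured names are generic placeholders such as `LABEL_0`.

```python
# Hypothetical helper: choose a readable label name for a class index.
# `id2label` stands in for model.config.id2label (contents assumed, not verified).
SST2_FALLBACK = {0: "negative", 1: "positive"}  # GLUE SST-2 convention


def resolve_label(id2label, index):
    """Return a readable label, falling back to the SST-2 convention."""
    name = id2label.get(index)
    # Generic placeholders like "LABEL_0" carry no meaning, so use the fallback.
    if name is None or name.upper().startswith("LABEL_"):
        return SST2_FALLBACK[index]
    return name.lower()


# Illustrative inputs only; these are not read from the real checkpoint.
generic = {0: "LABEL_0", 1: "LABEL_1"}
descriptive = {0: "NEGATIVE", 1: "POSITIVE"}
print(resolve_label(generic, 1))
print(resolve_label(descriptive, 0))
```

In the notebook itself, `model.config.id2label` could be passed as the first argument once the model is loaded.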


### Define Prediction Function

In [10]:
def predict_sentiment(text):
    """
    Predict sentiment for a given text.
    
    Args:
        text (str): Input text to analyze
        
    Returns:
        dict: Dictionary containing predicted label and confidence scores
    """
    # Tokenize input
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    
    # Get model prediction
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        
    # Get probabilities
    probabilities = torch.softmax(logits, dim=-1)[0]
    predicted_class = torch.argmax(probabilities).item()
    
    # Map class indices to names using the SST-2 (GLUE) convention: 0 = negative, 1 = positive.
    # This assumes the checkpoint kept that label order.
    label_map = {0: "negative", 1: "positive"}
    
    return {
        "text": text,
        "predicted_label": label_map[predicted_class],
        "confidence": probabilities[predicted_class].item(),
        "negative_score": probabilities[0].item(),
        "positive_score": probabilities[1].item()
    }
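
The softmax step in `predict_sentiment` can be illustrated without loading the model. The sketch below is a pure-Python illustration under stated assumptions: the logit values are made up for demonstration, and `softmax` is a local helper rather than the `torch.softmax` call used above.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    # Subtract the maximum for numerical stability; the result is unchanged.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for [negative, positive]; not taken from the model.
example_logits = [-2.0, 3.0]
probs = softmax(example_logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)  # argmax

label_map = {0: "negative", 1: "positive"}
print(label_map[predicted_class], round(probs[predicted_class], 4))
```

Because the two probabilities always sum to 1, a binary classifier of this kind has no neutral class: every input receives one of the two labels.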

## Part (c): Predict Sentiment for 4 Sentences

Now we'll predict sentiment for the four required sentences.

### Sentence 1: "This movie is awesome"

In [11]:
sentence1 = "This movie is awesome"
result1 = predict_sentiment(sentence1)

print("=" * 60)
print(f"Input: {result1['text']}")
print(f"Predicted Sentiment: {result1['predicted_label'].upper()}")
print(f"Confidence: {result1['confidence']:.4f}")
print("\nDetailed Scores:")
print(f"  Negative: {result1['negative_score']:.4f}")
print(f"  Positive: {result1['positive_score']:.4f}")
print("=" * 60)

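============================================================
Input: This movie is awesome
Predicted Sentiment: POSITIVE
Confidence: 0.9992

Detailed Scores:
  Negative: 0.0008
  Positive: 0.9992
============================================================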
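
The print block above is repeated for every sentence. One way to avoid the repetition is a small formatting helper. The sketch below assumes the dictionary layout returned by `predict_sentiment`; `format_result` is a hypothetical name, and the sample dictionary reuses the values printed above for Sentence 1.

```python
def format_result(result, width=60):
    """Build the per-sentence report as a single string."""
    rule = "=" * width
    lines = [
        rule,
        f"Input: {result['text']}",
        f"Predicted Sentiment: {result['predicted_label'].upper()}",
        f"Confidence: {result['confidence']:.4f}",
        "",
        "Detailed Scores:",
        f"  Negative: {result['negative_score']:.4f}",
        f"  Positive: {result['positive_score']:.4f}",
        rule,
    ]
    return "\n".join(lines)


# Sample values copied from the Sentence 1 output shown above.
sample = {
    "text": "This movie is awesome",
    "predicted_label": "positive",
    "confidence": 0.9992,
    "negative_score": 0.0008,
    "positive_score": 0.9992,
}
report = format_result(sample)
print(report)
```

Each later cell could then reduce to `print(format_result(predict_sentiment(sentence)))`.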


### Sentence 2: "I didn't like the movie so much"

In [12]:
sentence2 = "I didn't like the movie so much"
result2 = predict_sentiment(sentence2)

print("=" * 60)
print(f"Input: {result2['text']}")
print(f"Predicted Sentiment: {result2['predicted_label'].upper()}")
print(f"Confidence: {result2['confidence']:.4f}")
print("\nDetailed Scores:")
print(f"  Negative: {result2['negative_score']:.4f}")
print(f"  Positive: {result2['positive_score']:.4f}")
print("=" * 60)

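============================================================
Input: I didn't like the movie so much
Predicted Sentiment: NEGATIVE
Confidence: 0.9897

Detailed Scores:
  Negative: 0.9897
  Positive: 0.0103
============================================================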


### Sentence 3: "I'm not sure what I think about this movie."

In [13]:
sentence3 = "I'm not sure what I think about this movie."
result3 = predict_sentiment(sentence3)

print("=" * 60)
print(f"Input: {result3['text']}")
print(f"Predicted Sentiment: {result3['predicted_label'].upper()}")
print(f"Confidence: {result3['confidence']:.4f}")
print("\nDetailed Scores:")
print(f"  Negative: {result3['negative_score']:.4f}")
print(f"  Positive: {result3['positive_score']:.4f}")
print("=" * 60)

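============================================================
Input: I'm not sure what I think about this movie.
Predicted Sentiment: NEGATIVE
Confidence: 0.9766

Detailed Scores:
  Negative: 0.9766
  Positive: 0.0234
============================================================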


### Sentence 4: "Did you like the movie?"

In [14]:
sentence4 = "Did you like the movie?"
result4 = predict_sentiment(sentence4)

print("=" * 60)
print(f"Input: {result4['text']}")
print(f"Predicted Sentiment: {result4['predicted_label'].upper()}")
print(f"Confidence: {result4['confidence']:.4f}")
print("\nDetailed Scores:")
print(f"  Negative: {result4['negative_score']:.4f}")
print(f"  Positive: {result4['positive_score']:.4f}")
print("=" * 60)

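============================================================
Input: Did you like the movie?
Predicted Sentiment: POSITIVE
Confidence: 0.9577

Detailed Scores:
  Negative: 0.0423
  Positive: 0.9577
============================================================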


### Summary of All Predictions

In [15]:
# Create summary table
print("\n" + "=" * 80)
print("SUMMARY OF SENTIMENT PREDICTIONS")
print("=" * 80)
print(f"{'Sentence':<50} {'Prediction':<12} {'Confidence':>10}")
print("-" * 80)

for result in [result1, result2, result3, result4]:
    # Truncate sentences longer than 50 characters so the columns stay aligned
    text = (result['text'][:47] + "...") if len(result['text']) > 50 else result['text']
    print(f"{text:<50} {result['predicted_label']:<12} {result['confidence']:>10.4f}")

print("=" * 80)


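================================================================================
SUMMARY OF SENTIMENT PREDICTIONS
================================================================================
Sentence                                           Prediction   Confidence
--------------------------------------------------------------------------------
This movie is awesome                              positive         0.9992
I didn't like the movie so much                    negative         0.9897
I'm not sure what I think about this movie.        negative         0.9766
Did you like the movie?                            positive         0.9577
================================================================================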
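
The four per-sentence cells and the summary loop could also be driven from a single list. The sketch below is one possible arrangement, not the notebook's method: `summarize` is a hypothetical helper that accepts any predictor returning the dictionary layout of `predict_sentiment`, and `stub_predict` is a stand-in with fixed, made-up scores so the example runs without the model.

```python
def summarize(sentences, predict_fn, text_width=50):
    """Run predict_fn on each sentence and return table rows as strings."""
    rows = [f"{'Sentence':<{text_width}} {'Prediction':<12} {'Confidence':>10}"]
    rows.append("-" * (text_width + 24))
    for sentence in sentences:
        result = predict_fn(sentence)
        text = result["text"]
        # Truncate long sentences so the columns stay aligned.
        if len(text) > text_width:
            text = text[: text_width - 3] + "..."
        rows.append(
            f"{text:<{text_width}} {result['predicted_label']:<12} "
            f"{result['confidence']:>10.4f}"
        )
    return rows


def stub_predict(text):
    """Stand-in predictor with fixed, made-up scores (not the real model)."""
    return {"text": text, "predicted_label": "positive", "confidence": 0.75}


sentences = [
    "This movie is awesome",
    "I didn't like the movie so much",
    "I'm not sure what I think about this movie.",
    "Did you like the movie?",
]
table = summarize(sentences, stub_predict)
print("\n".join(table))
```

In the notebook, passing `predict_sentiment` in place of `stub_predict` would run the real model on each sentence.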
