# Assignment 4.2: Prompt Tuning Experiment

In this experiment, we compare three prompt styles for **sentiment analysis** using the `Flan-T5` model (`google/flan-t5-base`). The model weights stay frozen; only the wording of the prompt changes. We:
- Start with a basic direct prompt
- Apply contextual enhancement
- Use a pattern-based rephrased prompt

We evaluate results using the exact match (EM) and F1 metrics.
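
As a rough illustration of the two metrics, here is a minimal sketch in plain Python. It rests on assumptions the notebook does not state: EM is taken to be the fraction of predictions that equal the reference label after lowercasing and trimming whitespace, and F1 is taken to be the macro-averaged F1 over the sentiment classes, skipping classes that appear in neither predictions nor references. The helper names `exact_match` and `macro_f1` are hypothetical.

```python
LABELS = ("positive", "neutral", "negative")


def _clean(values):
    """Lowercase and trim each label so comparisons ignore case and padding."""
    return [v.strip().lower() for v in values]


def exact_match(predictions, references):
    """Fraction of predictions identical to their reference (assumed EM definition)."""
    preds, refs = _clean(predictions), _clean(references)
    if not refs:
        return 0.0
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


def macro_f1(predictions, references, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed F1 definition)."""
    preds, refs = _clean(predictions), _clean(references)
    scores = []
    for label in labels:
        tp = sum(p == label and r == label for p, r in zip(preds, refs))
        fp = sum(p == label and r != label for p, r in zip(preds, refs))
        fn = sum(p != label and r == label for p, r in zip(preds, refs))
        if tp + fp + fn == 0:
            continue  # class absent from both lists; leave it out of the average
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

Both functions take parallel lists of predicted and reference labels, so they can be applied per prompt style once predictions have been collected for more than one sentence.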

In [None]:
!pip install transformers datasets --quiet

In [None]:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)


## Task Definition

Given a movie review, we classify its sentiment into one of three classes: `positive`, `neutral`, or `negative`.

**Input sentence:**
> "The movie was surprisingly good and kept me engaged till the end."

**Expected label:** `positive`
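
Because the model generates free text rather than a class index, its output may need to be mapped back onto the three classes before it is compared with the expected label. The sketch below is one possible way to do that; the helper `normalize_label` and the `"unknown"` fallback are hypothetical and not part of the original notebook.

```python
import re

LABELS = ("positive", "neutral", "negative")


def normalize_label(generated, labels=LABELS, fallback="unknown"):
    """Return the label that occurs earliest in the generated text, else the fallback."""
    text = generated.strip().lower()
    best_pos, best_label = None, fallback
    for label in labels:
        # Whole-word match, so a label embedded in a longer word is not counted.
        match = re.search(rf"\b{re.escape(label)}\b", text)
        if match and (best_pos is None or match.start() < best_pos):
            best_pos, best_label = match.start(), label
    return best_label
```

This is a simple keyword match, so it does not handle negation: a generation such as "not positive" would still map to `positive`.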

### 1. Direct Prompt

In [1]:
# Direct style: a bare instruction followed by the sentence.
prompt_direct = "Sentiment of the following sentence: The movie was surprisingly good and kept me engaged till the end."
inputs = tokenizer(prompt_direct, return_tensors="pt")
# A sentiment label is short, so a small generation budget is enough.
outputs = model.generate(**inputs, max_new_tokens=10)
direct_pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(direct_pred)

positive

### 2. Contextual Prompt

In [1]:
prompt_contextual = "Analyze sentiment. Sentence: 'The movie was surprisingly good and kept me engaged till the end.'\nOptions: positive, neutral, negative. Answer:"
inputs = tokenizer(prompt_contextual, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
contextual_pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(contextual_pred)

positive

### 3. Pattern-based Prompt

In [1]:
prompt_pattern = "Text: 'The movie was surprisingly good and kept me engaged till the end.'\nSentiment classification (positive / neutral / negative):"
inputs = tokenizer(prompt_pattern, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
pattern_pred = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(pattern_pred)

positive
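
The three cells above differ only in the prompt wording, so the styles can be expressed as templates and compared over several labeled sentences. The following is a sketch under stated assumptions: `PROMPT_TEMPLATES`, `build_prompts`, and `accuracy_by_style` are hypothetical names, and `predict` is any callable that maps a prompt string to the generated text.

```python
# Templates copied from the three prompts used above; {sentence} marks the input slot.
PROMPT_TEMPLATES = {
    "direct": "Sentiment of the following sentence: {sentence}",
    "contextual": (
        "Analyze sentiment. Sentence: '{sentence}'\n"
        "Options: positive, neutral, negative. Answer:"
    ),
    "pattern": (
        "Text: '{sentence}'\n"
        "Sentiment classification (positive / neutral / negative):"
    ),
}


def build_prompts(sentence):
    """Fill every template with the same sentence."""
    return {name: tpl.format(sentence=sentence) for name, tpl in PROMPT_TEMPLATES.items()}


def accuracy_by_style(examples, predict):
    """Share of correct labels per prompt style.

    examples: list of (sentence, expected_label) pairs.
    predict:  callable taking a prompt string and returning the generated text.
    """
    if not examples:
        return {}
    correct = {name: 0 for name in PROMPT_TEMPLATES}
    for sentence, label in examples:
        for name, prompt in build_prompts(sentence).items():
            if predict(prompt).strip().lower() == label.strip().lower():
                correct[name] += 1
    return {name: hits / len(examples) for name, hits in correct.items()}


# With the notebook's model, `predict` could wrap the same calls used above:
# def predict(prompt):
#     inputs = tokenizer(prompt, return_tensors="pt")
#     outputs = model.generate(**inputs, max_new_tokens=10)
#     return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Passing the predictor in as an argument keeps the comparison logic independent of the model, so the same loop could be reused with a different checkpoint.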

## Conclusion

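On this single, unambiguous example, all three prompt styles yielded the correct prediction, `positive`, so the experiment does not separate them on accuracy. Structured and contextual phrasing states the allowed labels explicitly, which makes the output easier to interpret and is expected to make predictions more reliable. These advanced prompts are also expected to generalize better to harder or more ambiguous inputs, although confirming that would require evaluation on a larger labeled set.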