## Task 4: GPT-2 Classification via Few‐Shot Prompting

In this section we use the preprocessed reviews saved on Google Drive to construct our few-shot prompts. Rather than picking examples by hand, we'll:

1. **Load** the processed dataset from Drive.  
2. **Sample** three representative review–sentiment pairs (with a fixed random seed for reproducibility).  
3. **Display** the samples.  
4. **Use** them as 1-shot, 2-shot, and 3-shot prompts for GPT-2 to classify new reviews.



---


In [None]:
# Mount Google Drive to access the processed data
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


This block:
- Loads the processed reviews DataFrame from Google Drive and selects three review–sentiment pairs using a fixed seed to ensure reproducible sampling.
- Prints each example with its review text and sentiment label, then converts them into a list of tuples for use as few‐shot prompt examples.

In [None]:
import pandas as pd

# Load processed reviews
reviews_df = pd.read_csv('/content/drive/MyDrive/preprocessed_reviews.csv')

# Sample three examples reproducibly
sample_examples = reviews_df.sample(n=3, random_state=42)[['review', 'sentiment']]

# Print each example clearly, one after another
print("Few-Shot Examples:")
for i, row in enumerate(sample_examples.itertuples(index=False), start=1):
    print(f"\nExample {i}:")
    print(f"Review   : {row.review}")
    print(f"Sentiment: {row.sentiment}")

# Convert to list of tuples for prompting
few_shot_examples = list(sample_examples.itertuples(index=False, name=None))


Few-Shot Examples:

Example 1:
Review   : really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810
Sentiment: positive

Example 2:
Review   : television show appeal quite different kind fan like farscape doesi know youngster 3040 year oldfan male female different country think just adore tv miniserie element tv character drive drama australian soap opera epis

In [None]:
# Define the 3-shot examples for GPT-2 prompting.
# The sampled reviews and their sentiment labels were printed above so that the
# chosen examples could be inspected and verified before being used in the prompts.
few_shot_examples = [
    (
        "really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810",
        "positive"
    ),
    (
        "television show appeal quite different kind fan like farscape doesi know youngster 3040 year oldfan male female different country think just adore tv miniserie element tv character drive drama australian soap opera episode science fact fiction hardy trekkie run money brainbender stake wormhole theory time travel true equational formmagnificent embrace culture map possibility endless have multiple star thousand planet choose fromwith broad scope expect able illusion long farscape really come elementit succeed fail especially like star trek universe practically zero kaos element run idea pretty quickly keep rehash course 4 season manage audience attention use good continuity constant character evolution multiple thread episode unique personal touch camera specific certain character group structure allow extremely large area subject matter loyalty forge broken way issue happen pilot premiere pass just tune crichton girl see television delight available dvd admit thing keep sane whilst 12 hour night shift develop chronic insomniafarscape thing extremely long nightsdo favour watch pilot meanfarscape comet",
        "positive"
    ),
    (
        "film quickly get major chase scene increase destruction really bad thing guy hijacking steven seagal beat pulp seagal drive probably end premise movieit like decide make kind change movie plot just plan enjoy action expect coherent plot turn sense logic reduce chance get headachei do hope steven seagal try type character portray popular movie",
        "negative"
    )
]



In [None]:
# Install the libraries needed to load and run the GPT-2 model from Hugging Face
!pip install transformers torch --quiet


In [None]:
import re
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer


In [None]:
# Define the model
MODEL_ID = "openai-community/gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(MODEL_ID)
model     = GPT2LMHeadModel.from_pretrained(MODEL_ID)
model.eval()
if torch.cuda.is_available():
    model.to("cuda")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:

def build_prompt(review: str, k: int) -> str:
    """Build a k-shot prompt from the first k entries of few_shot_examples."""
    prompt = ""
    for ex_review, ex_label in few_shot_examples[:k]:
        prompt += f"Review: \"{ex_review}\"\nSentiment: {ex_label}\n\n"
    # Leave the final label empty so that the model completes it
    prompt += f"Review: \"{review}\"\nSentiment:"
    return prompt

def classify_review(review: str, shots: int = 1) -> str:
    prompt = build_prompt(review, shots)
    # Calling the tokenizer directly returns input_ids and attention_mask,
    # so generate() no longer has to guess the mask
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=3,
        do_sample=False,  # greedy decoding, so repeated runs give the same result
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens, not the prompt itself
    prompt_len = inputs["input_ids"].shape[-1]
    gen = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
    m = re.search(r"(positive|negative)", gen, re.IGNORECASE)
    if m:
        return m.group(1).capitalize()
    # Fallback: no label word was generated, so default to "Positive"
    return "Positive"
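
The label-parsing step in `classify_review` can be tried out without loading GPT-2. The sketch below is only an illustration: `extract_label` is a hypothetical helper name (it is not part of the notebook) that applies the same regular expression to a few hand-written strings standing in for model output.

```python
import re

def extract_label(generated: str) -> str:
    """Return 'Positive' or 'Negative' based on the first label word found.

    Hypothetical helper mirroring the parsing in classify_review; like the
    notebook's function, it falls back to 'Positive' when no label word appears.
    """
    match = re.search(r"(positive|negative)", generated, re.IGNORECASE)
    if match:
        return match.group(1).capitalize()
    return "Positive"  # same default as the notebook's fallback

# Hand-written strings standing in for GPT-2 output (illustrative only)
for text in [" negative\n\n", " Positive review", " the movie"]:
    print(repr(text), "->", extract_label(text))
```

The last string contains no label word and is still mapped to `Positive`, so any generation that cannot be parsed counts as a positive prediction.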

In [None]:
# Print additional samples for evaluation
# Sample and print 12 random preprocessed reviews with their sentiment for manual selection
sample_six = reviews_df.sample(n=12, random_state=42)[['review', 'sentiment']].reset_index(drop=True)
for i, row in sample_six.iterrows():
    print(f"Example {i+1}:")
    print(f"Review   : {row['review']}")
    print(f"Sentiment: {row['sentiment']}\n")


Example 1:
Review   : really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810
Sentiment: positive

Example 2:
Review   : television show appeal quite different kind fan like farscape doesi know youngster 3040 year oldfan male female different country think just adore tv miniserie element tv character drive drama australian soap opera episode science fact fic



```
<Examples 1, 2, and 3 were previously used as shots for the model, while the remaining examples are new samples>
```



Definitions 💎

One-shot prompting provided GPT-2 with one example review–label pair before asking it to classify a new review.

Two-shot prompting included two example pairs in the prompt.

Three-shot prompting supplied three example pairs.
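
To make the difference between the three settings concrete, the sketch below builds prompts from toy data. The `toy_examples` list and the `build_toy_prompt` function are illustrative stand-ins (assumptions, not the notebook's data or code); only the `Review: ... / Sentiment: ...` layout is taken from this notebook.

```python
# Toy review-label pairs, invented for illustration only
toy_examples = [
    ("great acting and a moving story", "positive"),
    ("dull plot and wooden dialogue", "negative"),
    ("beautiful soundtrack and visuals", "positive"),
]

def build_toy_prompt(review: str, k: int) -> str:
    """Prepend the first k example pairs, then the unlabeled review."""
    blocks = [f'Review: "{r}"\nSentiment: {s}' for r, s in toy_examples[:k]]
    # The final block has an empty label for the model to complete
    blocks.append(f'Review: "{review}"\nSentiment:')
    return "\n\n".join(blocks)

for k in (1, 2, 3):
    prompt = build_toy_prompt("slow start but a satisfying ending", k)
    # Each labeled example plus the final query contributes one "Review:" line
    print(f"{k}-shot prompt contains {prompt.count('Review:')} reviews")
```

Each additional shot adds one labeled pair in front of the review to be classified, so the prompt grows with `k` while the final `Sentiment:` line always stays empty.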

In [None]:
# 1-Shot Prompting
test_review =''' jane austen definitely approve onegwyneth paltrow do awesome job capture attitude emma funny excessively silly elegant put convince british
 accent british maybe I m good judge fool meshe excellent slide doorsi forget she s american brilliant jeremy northam sophie thompson phyllida law emma thompson sister mother bate woman nearly steal showand ms law do not lineshighly recommend
'''
# actual was positive

prompt_1 = build_prompt(test_review, k=1)
print("1-Shot Prompt:\n", prompt_1)
print("Prediction:", classify_review(test_review, shots=1))


The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


1-Shot Prompt:
 Review: "really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810"
Sentiment: positive

Review: " jane austen definitely approve onegwyneth paltrow do awesome job capture attitude emma funny excessively silly elegant put convince british
 accent british maybe I m good judge fool meshe excellent slide doorsi forget she s american brilliant jere

In [None]:
# 2-Shot Prompting
test_review = '''story hope highlight tragic reality youth face favela rise draw scary unsafe unfair world show beautiful color move music man dedicated friend choose accept world
 change action art entertain interesting emotional aesthetically beautiful film show film numerous high school student live neighborhood poverty gun violence enamor anderson protagonist
 recommend film age 13 subtitle image death background
'''   # example 7 from the previous block actual was positive

prompt_2 = build_prompt(test_review, k=2)
print("2-Shot Prompt:\n", prompt_2)
print("Prediction:", classify_review(test_review, shots=2))


2-Shot Prompt:
 Review: "really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810"
Sentiment: positive

Review: "television show appeal quite different kind fan like farscape doesi know youngster 3040 year oldfan male female different country think just adore tv miniserie element tv character drive drama australian soap opera episode science fact fiction hard

In [None]:
# 3-Shot Prompting
test_review = ''' jeez immensely boring lead man christian schoyen get bad actor see thing character movie move america live 20 year speak lot well english pull say language skikkelig gebrokkent cool norwegian
 dude movie hollywood just damn shame talentless hack storyline mediocre suspicion christian schoyen do movie just live dream clearly do film hump beautiful babe
'''  # actual was negative

prompt_3 = build_prompt(test_review, k=3)
print("3-Shot Prompt:\n", prompt_3)
print("Prediction:", classify_review(test_review, shots=3))


3-Shot Prompt:
 Review: "really like summerslam look arena curtain just look overall interesting reason anyways good summerslam wwf do not lex luger main event yokozuna time ok huge fat man vs strong man I m glad time change terrible main event just like match luger terrible match card razor ramon vs ted dibiase steiner brother vs heavenly body shawn michael vs curt hene event shawn name big monster body guard diesel irs vs 123 kid bret hart take doink take jerry lawler stuff hart lawler interesting ludvig borga destroy marty jannetty undertaker take giant gonzalez terrible match smoking gunn tatanka take bam bam bigelow headshrinker yokozuna defend world title lex luger match boring terrible ending deserve 810"
Sentiment: positive

Review: "television show appeal quite different kind fan like farscape doesi know youngster 3040 year oldfan male female different country think just adore tv miniserie element tv character drive drama australian soap opera episode science fact fiction hard

# Observation



```
Giving the model more examples should, in principle, improve its performance.
However, the model's predictions seem to be biased toward the classes that appear in the given shots.
```
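
One way to look into this is to count the labels that appear in the first k shots. In the `few_shot_examples` list defined earlier, the labels are positive, positive, negative, in that order. The sketch below reproduces that count; `shot_labels` simply restates those labels, and `label_counts` is a hypothetical helper introduced here for illustration.

```python
from collections import Counter

# Labels of the three few-shot examples, in the order they are used
shot_labels = ["positive", "positive", "negative"]

def label_counts(labels, k):
    """Count how often each label appears among the first k shots."""
    return Counter(labels[:k])

for k in (1, 2, 3):
    print(f"{k}-shot:", dict(label_counts(shot_labels, k)))
```

Only the 3-shot prompt shows the model a negative example, which is worth keeping in mind when comparing predictions across the three settings.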

