# LLM Experimentation — Text Generation with GPT-2 (Hugging Face)

**Student:** Yisakor Mirany  
**Model:** `openai-community/gpt2`  
**Task:** Text Generation

This notebook explores how the **temperature** parameter affects GPT-2’s output.
We hold other parameters fixed and compare **coherence**, **creativity**, and **style**.

**Trials (Temperature):** 0.2, 0.7, 1.0  
**Fixed settings:** `max_new_tokens=120`, `top_p=0.9`, `do_sample=True`
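
To make the role of temperature concrete, here is a minimal sketch of how it reshapes a next-token distribution: the logits are divided by the temperature before the softmax. The function name `softmax_with_temperature` and the toy logits are illustrative assumptions, not part of the notebook or of the Transformers API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens (illustrative values only).
toy_logits = [2.0, 1.0, 0.5]
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, which is why low-temperature samples tend to look nearly greedy.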


In [1]:
# If you're on Colab or a fresh environment, uncomment the next line.
# %pip install -q transformers torch

import sys, platform
print("Python:", sys.version.split()[0], "| Platform:", platform.platform())


Python: 3.12.12 | Platform: Linux-6.6.105+-x86_64-with-glibc2.35


## Setup: Imports & Device
We’ll use PyTorch with Transformers. If CUDA is available, inference runs on the GPU; otherwise it falls back to the CPU.


In [2]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, set_seed

device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

# For reproducibility across runs
GLOBAL_SEED = 42
set_seed(GLOBAL_SEED)


Using device: cpu


## Load Pretrained Model & Tokenizer
We’ll use the small and fast **openai-community/gpt2** (same weights as GPT-2 small).


In [3]:
MODEL_NAME = "openai-community/gpt2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# GPT-2 has no pad token; set one to avoid warnings for generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model = model.to(device)
model.eval()

print("Loaded:", MODEL_NAME)

Loaded: openai-community/gpt2


## Helper: Generation Function
Convenience wrapper to generate text with chosen parameters and return decoded output.


In [4]:
def generate_text(
    prompt: str,
    max_new_tokens: int = 120,
    temperature: float = 0.7,
    top_p: float = 0.9,
    do_sample: bool = True,
    seed: int = GLOBAL_SEED
) -> str:
    set_seed(seed)  # keep runs comparable
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=temperature,
            top_p=top_p,
            do_sample=do_sample,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


## Baseline Prompt
A single creative prompt that works well across temperatures.


In [5]:
prompt = (
    "Once upon a time in a futuristic city, a student discovered a forgotten terminal "
    "that could speak back. They typed a single question: "
    "\"What is the price of creativity?\" The terminal replied"
)
print(prompt)


Once upon a time in a futuristic city, a student discovered a forgotten terminal that could speak back. They typed a single question: "What is the price of creativity?" The terminal replied


## Experiments: Vary Temperature
We keep `max_new_tokens=120`, `top_p=0.9`, `do_sample=True` constant  
and vary **temperature**: **0.2**, **0.7**, **1.0**.
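
Because `top_p=0.9` is held fixed, it helps to see what nucleus (top-p) filtering does to a distribution before sampling. The sketch below shows the general technique on a toy distribution; `top_p_filter` is a hypothetical helper written for illustration and is not the library's internal implementation.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose total mass reaches
    top_p, then renormalize. Returns {token_index: probability}."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # the token that crosses the threshold is still kept
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Hypothetical next-token distribution over four tokens (illustrative values).
toy_probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(toy_probs, top_p=0.9))
```

With these toy values the least likely token is dropped, so even at temperature 1.0 the sampler never draws from the extreme tail of the distribution.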


In [6]:
trial_settings = [
    {"name": "Temp 0.2", "temperature": 0.2},
    {"name": "Temp 0.7", "temperature": 0.7},
    {"name": "Temp 1.0", "temperature": 1.0},
]

results = []
for ts in trial_settings:
    out = generate_text(
        prompt=prompt,
        temperature=ts["temperature"],
        top_p=0.9,
        max_new_tokens=120,
        do_sample=True,
        seed=GLOBAL_SEED,   # same seed for comparability
    )
    results.append({"trial": ts["name"], "temperature": ts["temperature"], "output": out})

print("Generated", len(results), "outputs.")


Generated 3 outputs.


## Outputs
Below are the full outputs for each temperature.


In [7]:
from textwrap import shorten

for r in results:
    print("="*80)
    print(f"[{r['trial']}]  Temperature={r['temperature']}")
    print("-"*80)
    print(r["output"])
    print()


[Temp 0.2]  Temperature=0.2
--------------------------------------------------------------------------------
Once upon a time in a futuristic city, a student discovered a forgotten terminal that could speak back. They typed a single question: "What is the price of creativity?" The terminal replied: "A thousand dollars."

The student was able to solve the question by typing a single word: "I want to be a scientist." The student then typed a question: "What is the price of creativity?" The terminal replied: "A thousand dollars."

The student then typed a question: "What is the price of creativity?" The terminal replied: "A thousand dollars."

The student then typed a question: "What is the price of creativity?" The terminal replied: "A thousand dollars."

The student then typed a question: "What is the price

[Temp 0.7]  Temperature=0.7
--------------------------------------------------------------------------------
Once upon a time in a futuristic city, a student discovered a forgotten 

## Summary Table (Markdown)
We’ll build a small table with **Parameter Value → Output Snippet → Brief Observation**.


In [8]:
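def first_sentence(text: str, max_chars: int = 140) -> str:
    # Take the first line of the output and shorten it for the table.
    # Note: the decoded output begins with the prompt, so this line starts with the prompt text.
    snippet = text.split("\n")[0]
    return shorten(snippet, width=max_chars, placeholder="...")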

table_lines = []
table_lines.append("| Test | Temperature | Output Snippet | Brief Observation |")
table_lines.append("|------|-------------|----------------|-------------------|")

# Simple, heuristic observations you can refine by reading outputs:
for idx, r in enumerate(results, start=1):
    t = r["temperature"]
    snippet = first_sentence(r["output"])
    if t <= 0.25:
        obs = "Very deterministic; focused but can be bland or repetitive."
    elif t < 0.9:
        obs = "Balanced creativity and coherence; generally engaging."
    else:
        obs = "High diversity; more surprising but sometimes less coherent."
    row = f"| {idx} | {t} | {snippet} | {obs} |"
    table_lines.append(row)

md_table = "\n".join(table_lines)
print(md_table)


| Test | Temperature | Output Snippet | Brief Observation |
|------|-------------|----------------|-------------------|
| 1 | 0.2 | Once upon a time in a futuristic city, a student discovered a forgotten terminal that could speak back. They typed a single question:... | Very deterministic; focused but can be bland or repetitive. |
| 2 | 0.7 | Once upon a time in a futuristic city, a student discovered a forgotten terminal that could speak back. They typed a single question:... | Balanced creativity and coherence; generally engaging. |
| 3 | 1.0 | Once upon a time in a futuristic city, a student discovered a forgotten terminal that could speak back. They typed a single question:... | High diversity; more surprising but sometimes less coherent. |
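
All three snippets in the table are identical because the decoded output starts with the prompt, and the snippet is cut off before the generated text begins. One possible fix, sketched below, is to drop the prompt prefix before shortening. `continuation_snippet` is a hypothetical helper, and it assumes the decoded output reproduces the prompt verbatim; if it does not, the full text is used instead.

```python
from textwrap import shorten

def continuation_snippet(output: str, prompt: str, max_chars: int = 140) -> str:
    """Return a shortened snippet of the generated text with the prompt removed."""
    # Strip the prompt only when the output really starts with it.
    continuation = output[len(prompt):] if output.startswith(prompt) else output
    # shorten() collapses whitespace (including newlines), so the snippet
    # stays on one line inside a Markdown table cell.
    return shorten(continuation, width=max_chars, placeholder="...")

# Toy strings standing in for a real prompt and decoded output.
toy_prompt = "The terminal replied"
toy_output = 'The terminal replied: "A thousand dollars."\n\nThe student'
print(continuation_snippet(toy_output, toy_prompt))
```

In the table-building loop, this would be called as `continuation_snippet(r["output"], prompt)` in place of `first_sentence(r["output"])`.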


## Analysis (Write-up Guide)

Use the outputs above to comment on:
- **Coherence:** Does the story stay on-topic with logical sentences?
- **Creativity/Novelty:** Are there surprising metaphors or plot turns?
- **Repetition:** Does the model repeat phrases at low temperatures?
- **Style/Tone:** How does the “voice” change as temperature increases?

**Example Observations (customize to your outputs):**
- **Temp 0.2:** The output sticks closely to the prompt but quickly falls into a loop: the question "What is the price of creativity?" and the reply "A thousand dollars." repeat until the token limit. Local coherence is high, but novelty is minimal.
- **Temp 0.7:** Typically a good balance: the flow remains coherent while fresh details appear. This setting often yields the most “human-like” narrative.
- **Temp 1.0:** The text tends to become more adventurous and metaphorical. It can produce striking lines but risks contradictions or abrupt topic shifts.

Conclude with which setting you prefer for your specific use-case (e.g., creative writing vs. factual continuity).
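
To back up the repetition comment with a number, one simple option is the fraction of word n-grams that occur more than once in an output. This is a rough heuristic sketched for illustration, not a standard metric from the notebook; `repeated_ngram_fraction` and the choice of `n=4` are assumptions.

```python
from collections import Counter

def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Fraction of word n-grams that appear more than once in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n words
    counts = Counter(ngrams)
    # Count every occurrence of an n-gram that shows up at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)

# Toy strings: a looping text versus one with no repeated phrases.
looping = "the terminal replied a thousand dollars " * 3
varied = "the terminal hummed softly and then printed a riddle about time"
print(round(repeated_ngram_fraction(looping), 2))
print(round(repeated_ngram_fraction(varied), 2))
```

Applying it to each `r["output"]` in `results` gives one score per temperature; a markedly higher value at 0.2 would support the looping behavior seen in that output.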


## Optional Extension
If you want, repeat the experiment by varying **max_new_tokens** or **top_p** instead of temperature.
Keep the analysis structure the same and compare the differences.


In [9]:
def run_sweep(
    prompt: str,
    variable: str = "max_new_tokens",
    values=(60, 120, 240),
    fixed=None,  # optional dict of overrides; None avoids a mutable default argument
):
    # Cast each swept value to the type generate_text expects.
    casts = {"max_new_tokens": int, "top_p": float, "temperature": float}
    if variable not in casts:
        raise ValueError("variable must be one of: max_new_tokens, top_p, temperature")

    sweep_results = []
    for v in values:
        # Start from the defaults, apply caller overrides, then set the swept variable.
        kwargs = dict(max_new_tokens=120, temperature=0.7, top_p=0.9, do_sample=True)
        kwargs.update(fixed or {})
        kwargs[variable] = casts[variable](v)

        text = generate_text(prompt, **kwargs)
        sweep_results.append({"variable": variable, "value": v, "output": text})
    return sweep_results

# Example usage (commented out):
# alt_results = run_sweep(prompt, variable="top_p", values=(0.5, 0.9, 1.0))
# len(alt_results)


## Final Notes
- Keep this notebook in your repo (`llm_experiment.ipynb`).
- Copy the **Summary Table (Markdown)** into your `README.md`.
- Record a short video (screen + face) running the three trials and explaining your analysis.

**Environment**
- `pip install torch transformers`
- GPU (if available) speeds up generation but is not required.
