# Session 3, Notebook 02: LLM Parameters

### Objective: 
Learn the **syntax** for controlling LLM outputs using generation parameters.

We will experiment with:
- `max_new_tokens` → response length
- `do_sample` → deterministic vs. varied outputs
- `temperature` → randomness (works with sampling)
- `top_p` → how broad the word-choice pool is (works with sampling)


In [2]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)  # optional for cleaner class output

from transformers import pipeline

# Load an instruction-following model (CPU friendly)
# google/flan-t5-base → higher quality than "small" (but slower/heavier on CPU)
gen = pipeline("text2text-generation", model="google/flan-t5-base")


In [3]:
def safe_prompt(task: str) -> str:
    return (
        "You are a helpful assistant for students. "
        "Keep responses polite, non-explicit, and suitable for a classroom.\n"
        "If the task is unclear, ask a short clarifying question.\n"
        f"Task: {task}"
    )

# We'll reuse one prompt across experiments to compare outputs
prompt = safe_prompt("Write a short story about a dog who becomes a hero.")


----------------------------------------------------------------

### 1) `max_new_tokens` -> response length

**Syntax:**
```python
response = gen(prompt, max_new_tokens=30)
```

- `max_new_tokens` sets an upper limit on how many new tokens the model can generate; the model may stop earlier if it decides the answer is complete.
- Smaller value → shorter response; larger value → room for a longer response.
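
To make the "upper limit" idea concrete, here is a minimal sketch that uses a toy stand-in for a model. The `SCRIPT` list, the `EOS` marker, and `toy_generate` are made up for illustration and are not part of `transformers`; the sketch only shows that generation ends at whichever comes first, the cap or the end token.

```python
# Toy stand-in for a model: it "generates" tokens from a fixed script.
# SCRIPT and EOS are hypothetical values used only for this illustration.
EOS = "<eos>"
SCRIPT = ["The", "dog", "saved", "the", "day", ".", EOS]


def toy_generate(max_new_tokens: int) -> list:
    """Emit tokens until the cap is reached or the end token appears."""
    output = []
    for token in SCRIPT:
        if len(output) >= max_new_tokens:
            break  # hit the length cap: the text is cut off here
        if token == EOS:
            break  # the "model" decided it was finished
        output.append(token)
    return output


print(toy_generate(3))   # the cap is reached first, so the text is cut short
print(toy_generate(80))  # the end token is reached first, far below the cap
```

Raising the limit does not by itself guarantee a longer answer; it only gives the model more room before it is cut off.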

In [8]:
# MAX NEW TOKENS -> controls how long the response can be
# Smaller values = short answers

print("Max tokens = 20")
print(gen(prompt, max_new_tokens=20)[0]["generated_text"])

Max tokens = 20
The dog is a dog. He is a dog. He is a hero.


In [9]:
# MAX NEW TOKENS -> controls how long the response can be
# Larger values = detailed answers

print("\nMax tokens = 80")
print(gen(prompt, max_new_tokens=80)[0]["generated_text"])


Max tokens = 80
The dog is a dog. He is a dog. He is a hero. He is a dog. He is a hero.


-------------------------------------------------------------

### 2) `do_sample` -> deterministic vs varied outputs

By default, most pipelines behave like:
- `do_sample=False` → **deterministic** (same prompt → same output)

**Syntax:**
```python
gen(prompt, do_sample=False)
gen(prompt, do_sample=True)

```

------------------------------------------------------------

When a model generates text, it chooses tokens (roughly, words or word pieces) one at a time.
Each candidate next token has a probability, for example:

Token | Probability
------|-------------
dog   | 0.65
cat   | 0.25
wolf  | 0.10

`do_sample` tells the model *how* to pick from these probabilities.

-------------------------------------------------------------
🧩 `do_sample = False`   →  DETERMINISTIC MODE

-------------------------------------------------------------
- Always picks the token with the highest probability.
- No randomness.
- Same prompt → same output every time.
- ✅ Called “deterministic” because the result is fixed and repeatable.

-------------------------------------------------------------
🎲 `do_sample = True`   →  STOCHASTIC MODE

-------------------------------------------------------------
- Randomly picks from the probability list, so likelier tokens are chosen more often.
- Sometimes “cat,” sometimes “dog,” depending on luck and temperature.
- Same prompt → outputs can differ from run to run.
- 🎨 Called “stochastic” because randomness is part of the process.

-------------------------------------------------------------

When sampling is ON (`do_sample=True`), we can control *how* it samples using `temperature` and `top_p`.
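
The two modes can be imitated with the standard library alone. The sketch below is not how `transformers` is implemented internally; `pick_greedy` and `pick_sampled` are hypothetical helper names, and the probabilities are the dog/cat/wolf values from the table above.

```python
import random

# Next-token probabilities from the table above.
probs = {"dog": 0.65, "cat": 0.25, "wolf": 0.10}


def pick_greedy(probs: dict) -> str:
    """do_sample=False analogue: always take the most likely token."""
    return max(probs, key=probs.get)


def pick_sampled(probs: dict, rng: random.Random) -> str:
    """do_sample=True analogue: draw a token in proportion to its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so this sketch is repeatable
print([pick_greedy(probs) for _ in range(5)])        # identical picks every time
print([pick_sampled(probs, rng) for _ in range(5)])  # picks can differ
```

Greedy picking returns "dog" on every call, while sampled picking returns "dog" most often but not always.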


In [13]:
# do_sample = False

print("do_sample = False")
print(gen(prompt, do_sample=False, max_new_tokens=40)[0]["generated_text"])


do_sample = False
The dog is a dog. He is a dog. He is a hero. He is a dog. He is a hero.


In [14]:
# do_sample = True

print("\ndo_sample = True")
print(gen(prompt, do_sample=True, max_new_tokens=40)[0]["generated_text"])


do_sample = True
The Little Puppy is a German Shepherd dog with a nose. The owner teaches him how to recognize and care for the dog by examining the thigh. Soon the puppy begins


-------------------------------------------------------------

### 3) `temperature` (randomness / creativity)

**Important:** `temperature` works only when `do_sample=True`.

**How it works (intuition):**
- At each step, the model has multiple possible next tokens with probabilities.
- `temperature` reshapes those probabilities:
  - Lower temperature (e.g., 0.2) → makes the top choices **more dominant**
    → safer/more predictable wording
  - Higher temperature (e.g., 1.2+) → flattens probabilities
    → more variety, but higher risk of odd/off-topic outputs
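
The reshaping can be sketched in a few lines, assuming the common formulation in which each raw score (logit) is divided by the temperature before a softmax turns the scores into probabilities. The logits below are hypothetical values chosen for illustration, and `softmax_with_temperature` is a made-up helper name.

```python
import math


def softmax_with_temperature(logits: dict, temperature: float) -> dict:
    """Divide each score by the temperature, then normalise with softmax."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical raw scores (logits) for three candidate next tokens.
logits = {"dog": 2.0, "cat": 1.0, "wolf": 0.0}

for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, {tok: round(p, 3) for tok, p in probs.items()})
```

At 0.2 almost all of the probability lands on "dog"; at 1.5 the three tokens are much closer together, so sampling picks the less likely ones more often.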


**Syntax:**
```python
gen(prompt, do_sample=True, temperature=0.2)
```

In [10]:
# TEMPERATURE -> controls creativity / randomness
# Lower = predictable, Higher = more creative/varied

print("Temperature = 0.2")
print(gen(prompt, temperature=0.2, do_sample=True, max_new_tokens=60)[0]["generated_text"])

Temperature = 0.2
The dog is a dog. He is a dog. He is a hero. He is a hero.


In [11]:
# TEMPERATURE -> controls creativity / randomness
# Lower = predictable, Higher = more creative/varied

print("\nTemperature = 1.5")
print(gen(prompt, temperature=1.5, do_sample=True, max_new_tokens=60)[0]["generated_text"])


Temperature = 1.5
A small black dog runs inside under a bed of blue smoke drifting away out of air. A little green light comes over the head of the dog, pointing for where the fire belongs, hoping that someone will rescue it; but quickly she stumbles on a cat's head and


In [12]:
# TEMPERATURE with do_sample=False -> temperature has no effect
# Without sampling, the model always picks the most likely token, whatever the temperature

print("\nTemperature = 1.5")
print(gen(prompt, temperature=1.5, do_sample=False, max_new_tokens=60)[0]["generated_text"])


Temperature = 1.5
The dog is a dog. He is a dog. He is a hero. He is a dog. He is a hero.


**What have we learnt?**

Temperature doesn’t make the model “smarter”; it changes how much risk the model takes when picking the next word.

- Low temp: strongly favors the most likely word → outputs are very similar from run to run.
- High temp: takes more risks → more variety, but more errors.
- Works only if sampling is on (`do_sample=True`); with `do_sample=False` the setting is ignored, as the last cell above shows.

-------------------------------------------------------------

### 4) `top_p` (nucleus sampling: how wide the choice pool is)

**Important:** `top_p` is used when `do_sample=True`.

**How it works (intuition):**
Instead of considering every possible next token, `top_p` keeps the smallest *set* of most likely tokens whose
probabilities add up to at least `p`, then samples from that set.

- `top_p=0.1` → very narrow pool (only the most likely tokens)
  → more focused / less diverse wording
- `top_p=0.9` → wider pool
  → more diverse wording
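
The pool selection can be sketched as follows. This is a simplified illustration of nucleus filtering, not the library's internal code; `top_p_filter` is a hypothetical helper name and the five-token distribution is invented for the example.

```python
def top_p_filter(probs: dict, top_p: float) -> dict:
    """Keep the smallest set of most likely tokens whose total probability
    reaches top_p, then renormalise so the kept probabilities sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break  # the pool is now large enough
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Hypothetical next-token probabilities (made up for this sketch).
probs = {"dog": 0.50, "cat": 0.20, "wolf": 0.15, "fox": 0.10, "bear": 0.05}

print(top_p_filter(probs, 0.1))  # narrow pool: only "dog" remains
print(top_p_filter(probs, 0.9))  # wider pool: everything except "bear"
```

Sampling then happens only among the tokens that remain, which is why a small `top_p` behaves almost like deterministic decoding.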

**Syntax:**
```python
gen(prompt, do_sample=True, top_p=0.9)
```

In [15]:
# TOP-P -> controls diversity of word choices
# Lower = focused on the most likely words
# Higher = allows more diverse / unexpected words

print("Top-p = 0.1")
print(gen(prompt, top_p=0.1, do_sample=True, max_new_tokens=60)[0]["generated_text"])

Top-p = 0.1
The dog is a dog. He is a dog. He is a hero. He is a dog. He is a hero.


In [16]:
# TOP-P -> controls diversity of word choices
# Lower = focused on the most likely words
# Higher = allows more diverse / unexpected words

print("Top-p = 0.9")
print(gen(prompt, top_p=0.9, do_sample=True, max_new_tokens=60)[0]["generated_text"])


Top-p = 0.9
The dog who was injured a few years ago lost his owner. Seeing the dog's owner in the street, he began to take a liking to him. His owner became more than a dog.


### Reflection

1) If you want the **same output every time** for the same prompt, which setting matters most?
   - **Hint:** Look at `do_sample`.

2) If your responses are **too repetitive**, what would you change first?
   - **Hint:** Turn ON sampling, then adjust randomness.

3) If outputs become **too weird/off-topic**, what two parameters could you lower to make it safer?
   - **Hint:** Both are used when sampling is ON.

4) Suppose you need a **very short answer** (1 sentence). Which parameter directly controls that?
   - **Hint:** This parameter limits length.

5) Two students used the same prompt and got different outputs. What is the most likely reason?
   - **Hint:** Sampling changes whether outputs can vary.


-------------------------------------------------

## Full demo: two classroom-safe use cases (tuned differently)

We will use the **same parameters** but choose different values depending on the goal:
- Use Case A: “Reliable and safe” (more deterministic / focused)
- Use Case B: “More creative” (more variety, but still classroom-safe)
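
One possible way to keep the two configurations side by side is a small table of presets. The names `PRESETS` and `generation_kwargs` are made up for this sketch; the values are the ones used in the two demo cells that follow.

```python
# Generation settings for the two goals, collected in one place.
# The preset names are hypothetical; the values match the demo cells.
PRESETS = {
    "reliable": {"max_new_tokens": 120, "do_sample": True, "temperature": 0.3, "top_p": 0.3},
    "creative": {"max_new_tokens": 180, "do_sample": True, "temperature": 0.9, "top_p": 0.9},
}


def generation_kwargs(goal: str) -> dict:
    """Return a copy of the settings for a goal so callers can tweak them safely."""
    if goal not in PRESETS:
        raise ValueError(f"Unknown goal {goal!r}; expected one of {sorted(PRESETS)}")
    return dict(PRESETS[goal])


print(generation_kwargs("reliable"))
print(generation_kwargs("creative"))
```

With the pipeline loaded earlier, a preset could be applied as `gen(some_prompt, **generation_kwargs("reliable"))`, where `some_prompt` is any prompt string.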


--------------------------------------------

In [40]:
task_a = "Summarize machine learning in one line:"
prompt_a = safe_prompt(task_a)

response_a = gen(
    prompt_a,
    max_new_tokens=120,  # generous upper limit; a one-line summary should stop well before it
    do_sample=True,      # allow some flexibility in phrasing
    temperature=0.3,     # low randomness (more reliable)
    top_p=0.3            # narrow choice pool (more focused)
)

print("Use Case A (Reliable/Safe)")
print("Task:", task_a)
print("Output:\n", response_a[0]["generated_text"])


Use Case A (Reliable/Safe)
Task: Summarize machine learning in one line:
Output:
 What are the benefits of machine learning?


-----------------------------------

In [22]:
task_b = "Write a short, encouraging story (4 sentences) about teamwork during a college project."
prompt_b = safe_prompt(task_b)

response_b = gen(
    prompt_b,
    max_new_tokens=180,  # more room for storytelling
    do_sample=True,      # variation required for creativity
    temperature=0.9,     # higher randomness (more creative)
    top_p=0.9            # wider pool (more diverse word choice)
)

print("\nUse Case B (Creative but Classroom-Safe)")
print("Task:", task_b)
print("Output:\n", response_b[0]["generated_text"])



Use Case B (Creative but Classroom-Safe)
Task: Write a short, encouraging story (4 sentences) about teamwork during a college project.
Output:
 All of the students had a very successful project. My wife, who was a biology major, gave us a lot of help. I really enjoyed the project. But as the project progressed, the team struggled to stay on top. The teacher, who was a freshman, recommended that we use the project as a team and work together. My wife thought it was a great project. My wife, who was a freshman, was a little nervous about it. She also wanted to help out.


-----------------------------------