# Session 2 – Notebook 3: LLM Parameters

**Objectives:**
- Learn about key parameters that control LLM behavior.
- Experiment with:
  - `temperature` → randomness / creativity of sampled text.
  - `max_new_tokens` → upper limit on response length.
  - `top_p` → diversity of word choices (how much of the probability mass the model may sample from).
- Understand how parameter tuning changes chatbot output.

In [None]:
from transformers import pipeline

# Load a small instruction-following model
# flan-t5-small is light and good for Q&A or simple tasks
gen = pipeline("text2text-generation", model="google/flan-t5-small")

prompt = "Write a short story about a dog who becomes a hero."


In [None]:
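# TEMPERATURE -> controls creativity / randomness
# Lower = predictable, Higher = more creative/varied
# Note: temperature only takes effect when sampling is on (do_sample=True);
# with the default greedy decoding it is ignored.

print("Temperature = 0.2")
print(gen(prompt, do_sample=True, temperature=0.2, max_new_tokens=60)[0]["generated_text"])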

In [None]:
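# TEMPERATURE -> controls creativity / randomness
# Lower = predictable, Higher = more creative/varied
# Note: temperature only takes effect when sampling is on (do_sample=True);
# with the default greedy decoding it is ignored.

print("\nTemperature = 0.9")
print(gen(prompt, do_sample=True, temperature=0.9, max_new_tokens=60)[0]["generated_text"])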
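
To see why a lower temperature makes output more predictable, it can help to look at the arithmetic on a toy example. The sketch below is not the model's actual code: `softmax_with_temperature` and `toy_logits` are hypothetical names, and the three scores are made up for illustration. It divides each score by the temperature before converting the scores to probabilities.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words (made-up values)
toy_logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(toy_logits, 0.2)
high = softmax_with_temperature(toy_logits, 0.9)

print("T=0.2:", [round(p, 3) for p in low])
print("T=0.9:", [round(p, 3) for p in high])
```

At the lower temperature almost all of the probability lands on the top-scoring word, so sampling rarely picks anything else. At the higher temperature the other words keep a real chance of being chosen.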

In [None]:
# MAX NEW TOKENS -> controls how long the response can be
# This is an upper limit: smaller values cut answers short, larger values leave
# room for more detail (the model may still stop early on its own)

print("Max tokens = 20")
print(gen(prompt, max_new_tokens=20)[0]["generated_text"])

In [None]:
# MAX NEW TOKENS -> controls how long the response can be
# This is an upper limit: smaller values cut answers short, larger values leave
# room for more detail (the model may still stop early on its own)

print("\nMax tokens = 80")
print(gen(prompt, max_new_tokens=80)[0]["generated_text"])
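
The two runs above can be pictured with a toy generation loop. This is a simplified sketch, not how `transformers` is implemented: `toy_generate`, the `<eos>` marker, and the `story` token list are all hypothetical. It shows that generation stops at whichever comes first, the token limit or the model's own end-of-text signal.

```python
def toy_generate(planned_tokens, max_new_tokens, eos="<eos>"):
    """Emit tokens one at a time until the end marker or the token limit."""
    output = []
    for token in planned_tokens:
        if len(output) >= max_new_tokens:
            break  # limit reached: the text is cut off
        if token == eos:
            break  # the model signalled that it was finished
        output.append(token)
    return output

# Hypothetical token stream that a model "wants" to produce
story = ["Rex", "saved", "the", "boy", "from", "the", "river", ".", "<eos>"]

short = toy_generate(story, max_new_tokens=4)
long = toy_generate(story, max_new_tokens=80)

print(len(short), short)
print(len(long), long)
```

Raising the limit from 4 to 80 lets the whole sentence through, but it does not make the output 80 tokens long: the loop still ends at the end marker.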

In [None]:
# TOP-P -> controls diversity of word choices
# Lower = focused on the most likely words
# Higher = allows more diverse / unexpected words
# Note: top_p only takes effect when sampling is on (do_sample=True)

print("Top-p = 0.5")
print(gen(prompt, do_sample=True, top_p=0.5, max_new_tokens=60)[0]["generated_text"])

In [None]:
# TOP-P -> controls diversity of word choices
# Lower = focused on the most likely words
# Higher = allows more diverse / unexpected words
# Note: top_p only takes effect when sampling is on (do_sample=True)

print("\nTop-p = 0.9")
print(gen(prompt, do_sample=True, top_p=0.9, max_new_tokens=60)[0]["generated_text"])
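
A rough way to picture top-p is as a filter over the candidate words: keep the most likely ones until their combined probability reaches `top_p`, then sample only from that shortlist. The sketch below is a simplified illustration under that assumption. `top_p_filter` and the probabilities in `toy_probs` are hypothetical and chosen by hand.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely words until their combined probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for word, p in ranked:
        kept[word] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1
    return {word: p / cumulative for word, p in kept.items()}

# Hypothetical next-word probabilities (made-up values)
toy_probs = {"dog": 0.5, "puppy": 0.3, "hound": 0.15, "dragon": 0.05}

narrow = top_p_filter(toy_probs, 0.5)
wide = top_p_filter(toy_probs, 0.9)

print("top_p=0.5 keeps:", list(narrow))
print("top_p=0.9 keeps:", list(wide))
```

With the lower value only the single most likely word survives. The higher value admits less likely words, which is where the more surprising output comes from.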


### Reflection

- Which response sounded more creative or surprising?  
- Which response was the shortest or most factual?  
- If you wanted to build a chatbot for **homework help**, what settings would you use?  
- If you wanted to build a **storytelling bot**, what settings would you use?  

In [None]:
# FULL DEMO: Adjust all parameters at once

prompt = "Explain photosynthesis like I am 10 years old."

response = gen(
    prompt,
    do_sample=True,      # turn sampling on so temperature and top_p apply
    temperature=0.7,     # creativity
    max_new_tokens=80,   # length
    top_p=0.9            # diversity
)

print("User:", prompt)
print("Bot:", response[0]["generated_text"])
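
When experimenting, it can be convenient to run one prompt under several named settings and read the results side by side. The helper below is one possible sketch, not part of the original notebook: `compare_settings`, `fake_gen`, and the two presets are hypothetical. `fake_gen` is a stand-in that only echoes its arguments so the example runs without downloading a model. In the notebook you would pass `gen` in its place.

```python
def compare_settings(generate, prompt, settings):
    """Run one prompt under several named parameter sets and collect the outputs."""
    results = {}
    for name, params in settings.items():
        results[name] = generate(prompt, **params)[0]["generated_text"]
    return results

# Stand-in generator (hypothetical): returns the same shape as the pipeline,
# but simply echoes the prompt and the parameters it received.
def fake_gen(prompt, **params):
    return [{"generated_text": f"{prompt} | {sorted(params.items())}"}]

# Example presets (illustrative values, not recommendations)
settings = {
    "focused": {"do_sample": True, "temperature": 0.2, "top_p": 0.5, "max_new_tokens": 60},
    "creative": {"do_sample": True, "temperature": 0.9, "top_p": 0.9, "max_new_tokens": 60},
}

for name, text in compare_settings(fake_gen, "Explain photosynthesis.", settings).items():
    print(f"{name}: {text}")
```

Because sampling is random, real runs with `gen` will differ from one execution to the next, so it is worth repeating each setting a few times before drawing conclusions.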
