## 🧭 Wrap-Up & Look-Ahead Reflection

### 🎓 What You Learned Today
- How to use the **Hugging Face pipeline** to connect prompts → models  
- How **parameters** like temperature, top-p, and max new tokens change model behavior  
- How to pick the **right model** for a given task (FLAN-T5 vs DistilGPT-2 vs DialoGPT)  
- How a simple **Streamlit UI** turns code into an interactive chatbot  

---

### 💬 Think About…
1. Our chatbot only knows what’s inside its model, so it can’t answer questions about *your* documents or notes.  
   - How could we make it read PDFs or data files and respond using that knowledge?  
     > *(Hint: this challenge leads to **Retrieval-Augmented Generation (RAG)** — next session!)*  

2. Today’s bot handles one message at a time.  
   - What if you wanted several “mini-bots” — one to search, one to plan, one to answer — all working together?  
     > *(That’s the world of **Multi-Agent AI**, coming soon.)*  

3. Our model always starts fresh — it forgets previous questions.  
   - How could a chatbot remember your last conversation or build on context?  
     > *(You’ll discover memory and state management when we combine RAG + agents.)*  

4. Curious minds only 🚀  
   - Ever wondered how these models can be **fine-tuned** on your own data, or how voice assistants use them in real time?  
     > *(That’s where advanced GenAI topics like fine-tuning and multi-modal inputs come in — optional reading!)*  

---

🎯 **Challenge for the Curious:**  
Write down one “pain point” you noticed while testing your chatbot today.  


# 🧠 In-Class Exercise – Solution
### Building Your First LLM Chatbot

This notebook contains **completed examples and explanations** for each step.  
Use it to review what we did in class, experiment with the code, and explore how different models behave.


## 🧩 Concept 1: The Hugging Face Pipeline

**Theory Recap:**  
A pipeline is like a “ready-made tool” that connects your text input to an AI model.  
Instead of manually loading weights and tokenizers, we use a *pipeline* for common tasks such as summarization, translation, and text generation.

In [None]:
# ✅ Example: Create a simple pipeline and use it

from transformers import pipeline

# Step 1: Choose your task and model
task = "text2text-generation"
model_name = "google/flan-t5-small"

# Step 2: Create the pipeline
generator = pipeline(task, model=model_name)

# Step 3: Try it out
response = generator("Summarize: Artificial intelligence helps automate tasks.")
print("Output:", response[0]['generated_text'])

✅ This is the foundation of any chatbot.
- `pipeline()` connects your text prompt to a model.
- `"text2text-generation"` is the task for encoder-decoder (sequence-to-sequence) models such as FLAN-T5, which read an instruction (e.g., "Summarize", "Translate") and produce new text in response.
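
The examples in this notebook all read the result with `response[0]['generated_text']`. That works because, for a single prompt, these pipelines return a list with one dictionary per generated sequence. The sketch below shows that shape without loading a model. It uses a hand-written stand-in for a pipeline result (the text inside it is made up, not real model output) and a hypothetical helper, `first_generated_text`.

```python
# Hand-written stand-in for what a generation pipeline returns for one prompt:
# a list with one dict per generated sequence. The text is illustrative only.
sample_response = [{"generated_text": "AI helps automate tasks."}]

def first_generated_text(response):
    """Return the text of the first generated sequence, or '' if there is none."""
    if not response:
        return ""
    return response[0].get("generated_text", "")

print("Output:", first_generated_text(sample_response))
```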


In [None]:
# 🧠 TASK 1 – Solution
# Use a different model - distilgpt2
from transformers import pipeline

task = "text-generation"
model_name = "distilgpt2"

generator = pipeline(task, model=model_name)
response = generator("Once upon a time, there was a student who", max_new_tokens=40)
print(response[0]['generated_text'])


📝 Here, `distilgpt2` continues text instead of following explicit commands.  
Try changing the prompt — notice how it just “keeps writing” instead of summarizing or translating.
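
By default, the `text-generation` pipeline returns the prompt followed by the continuation in `generated_text`. If you only want the new part, you can pass `return_full_text=False` to the pipeline call, or strip the prompt yourself. Here is a minimal sketch of the second option, using a hypothetical helper name and a made-up continuation in place of real model output:

```python
def continuation_only(prompt, generated_text):
    """Drop the prompt from the front of the generated text, if it is there."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text

prompt = "Once upon a time, there was a student who"
# Made-up example of full output (prompt + continuation), not real model output
full_text = prompt + " loved to read books about robots."
print(continuation_only(prompt, full_text))
```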


In [None]:
# 💡 CHALLENGE 1 – Solution
# Trying an incorrect task-model combo
from transformers import pipeline

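try:
    # Create the pipeline inside the try block: depending on the transformers
    # version, the mismatch may surface at load time or at generation time.
    gen_wrong = pipeline("text2text-generation", model="distilgpt2")
    response = gen_wrong("Translate: Hello world to French.")
    print(response[0]['generated_text'])
except Exception as e:
    print("Error observed:", e)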


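🧩 Using the wrong task type can cause errors or strange outputs.  
`distilgpt2` is a decoder-only model trained to predict the next word, while `"text2text-generation"` expects an encoder-decoder model such as FLAN-T5.  
Each model is trained differently: some follow instructions, others just continue the text.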
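
One way to avoid mismatches is to write down which task each model is used with and check before creating a pipeline. The lookup table and the `check_combo` helper below are illustrative and not part of the `transformers` library. The pairings are the ones used in this notebook.

```python
# Task each model is paired with in this notebook (illustrative lookup table)
TASK_FOR_MODEL = {
    "google/flan-t5-small": "text2text-generation",   # encoder-decoder, follows instructions
    "distilgpt2": "text-generation",                  # decoder-only, continues text
    "microsoft/DialoGPT-small": "text-generation",    # decoder-only, dialogue-style
}

def check_combo(task, model_name):
    """Return (ok, message) describing whether the task matches the model."""
    expected = TASK_FOR_MODEL.get(model_name)
    if expected is None:
        return False, f"Unknown model: {model_name}"
    if expected != task:
        return False, f"{model_name} is used with '{expected}', not '{task}'"
    return True, "OK"

print(check_combo("text2text-generation", "distilgpt2"))
print(check_combo("text-generation", "distilgpt2"))
```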

## 🧩 Concept 2: Controlling Model Creativity

**Recap:**  
You can control how “creative” or “focused” your chatbot is using parameters:
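- **temperature** → randomness of sampling (values near 0 = predictable, around 1 = creative); it only takes effect when sampling is enabled with `do_sample=True`  
- **top_p** → diversity of words considered: the model samples only from the most likely words whose combined probability reaches `top_p`  
- **max_new_tokens** → upper limit on output length, counted in tokens  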
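
To see what temperature and top-p do to the model's word choices, here is a small pure-Python sketch on a toy example. The four scores are made up, and the two helper functions are simplified illustrations of the idea, not the `transformers` implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their total probability reaches top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Hypothetical scores for four candidate next tokens
toy_logits = [2.0, 1.0, 0.5, 0.1]
low = softmax_with_temperature(toy_logits, 0.2)
high = softmax_with_temperature(toy_logits, 0.9)
print("T=0.2:", [round(p, 3) for p in low])
print("T=0.9:", [round(p, 3) for p in high])
print("top_p=0.8 keeps token indices:", sorted(top_p_filter(high, 0.8)))
```

At the lower temperature almost all of the probability lands on the top-scoring token, which is why the output becomes more predictable.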


In [None]:
# ✅ Example: Comparing low vs high temperature
prompt = "Write a one-line quote about teamwork."

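# do_sample=True enables sampling; with greedy decoding, temperature has no effect
response_low = generator(prompt, do_sample=True, temperature=0.2, max_new_tokens=30)
response_high = generator(prompt, do_sample=True, temperature=0.9, max_new_tokens=30)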

print("Low temperature:", response_low[0]['generated_text'])
print("High temperature:", response_high[0]['generated_text'])


🎯 Lower temperature = more focused and consistent.  
Higher temperature = more random and imaginative.  
Both can be useful — it depends on your goal.


In [None]:
# 🧠 TASK 2 – Solution
prompt = "Describe a sunset."

short = generator(prompt, max_new_tokens=20)
long = generator(prompt, max_new_tokens=80)

print("Short version:", short[0]['generated_text'])
print("\nLong version:", long[0]['generated_text'])


📝 Increasing `max_new_tokens` lets the answer run longer, which often makes it more descriptive. It sets an upper limit in tokens, so the model can still stop earlier.  
You can control the response size to fit your app’s purpose (short summaries vs long paragraphs).
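
The toy loop below illustrates the two ways generation can stop: hitting the `max_new_tokens` limit, or producing an end-of-sequence token. It is not how `transformers` is implemented. It simply walks through a made-up list of tokens with a hypothetical `<eos>` marker.

```python
def take_tokens(token_stream, max_new_tokens, eos="<eos>"):
    """Collect tokens until the end marker appears or the limit is reached."""
    output = []
    for token in token_stream:
        if token == eos or len(output) >= max_new_tokens:
            break
        output.append(token)
    return output

# Made-up token sequence that "ends" after five tokens
stream = ["The", "sun", "sets", "slowly", ".", "<eos>", "ignored"]

print(take_tokens(stream, max_new_tokens=3))   # limit reached first
print(take_tokens(stream, max_new_tokens=20))  # end marker reached first
```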


In [None]:
# 💡 CHALLENGE 2 – Solution
prompt = "Create a catchy headline about teamwork."
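# do_sample=True enables sampling; with greedy decoding, temperature has no effect
creative = generator(prompt, do_sample=True, temperature=0.9, max_new_tokens=20)
focused = generator(prompt, do_sample=True, temperature=0.3, max_new_tokens=20)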

print("Creative:", creative[0]['generated_text'])
print("Focused:", focused[0]['generated_text'])


💬 The “creative” setting gives varied headlines; the “focused” one sticks to predictable phrasing.  
This balance is key for designing personality in chatbots.
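
One way to make this balance reusable is to keep named presets of generation settings and unpack them into the pipeline call. The preset names, the `top_p` values, and the helper below are assumptions chosen to match the examples in this notebook, not recommended defaults.

```python
# Hypothetical presets; the temperatures mirror the ones used in this notebook
PRESETS = {
    "creative": {"do_sample": True, "temperature": 0.9, "top_p": 0.95, "max_new_tokens": 20},
    "focused": {"do_sample": True, "temperature": 0.3, "top_p": 0.9, "max_new_tokens": 20},
}

def generation_kwargs(style, **overrides):
    """Return a copy of the preset for `style`, with optional per-call overrides."""
    if style not in PRESETS:
        raise ValueError(f"Unknown style: {style!r}")
    settings = dict(PRESETS[style])  # copy so the preset itself is not modified
    settings.update(overrides)
    return settings

print(generation_kwargs("creative"))
print(generation_kwargs("focused", max_new_tokens=40))
```

With a pipeline loaded as `generator`, the call would look like `generator(prompt, **generation_kwargs("creative"))`.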


## 🧩 Concept 3: Choosing the Right Model

**Recap:**  
Different models have different personalities:
- `flan-t5-small` → follows instructions clearly  
- `distilgpt2` → continues text freely  
- `DialoGPT-small` → mimics human dialogue  


In [None]:
# ✅ Example: Compare FLAN vs DialoGPT
from transformers import pipeline

flan = pipeline("text2text-generation", model="google/flan-t5-small")
dialogpt = pipeline("text-generation", model="microsoft/DialoGPT-small")

prompt = "How do I make a good first impression?"

print("FLAN says:", flan(prompt)[0]['generated_text'])
print("\nDialoGPT says:", dialogpt(prompt)[0]['generated_text'])


In [None]:
# 🧠 TASK 3 – Solution
prompt = "How can students stay motivated while studying?"

flan_output = flan(prompt)
gpt_output = dialogpt(prompt, max_new_tokens=60)

print("FLAN:", flan_output[0]['generated_text'])
print("\nDialoGPT:", gpt_output[0]['generated_text'])


✨ The difference is usually clear:
- **FLAN** tends to give a short, direct answer to the instruction.  
- **DialoGPT** tends to respond casually, like a conversation.  

Pick the model that matches your chatbot’s role.
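
DialoGPT was trained on multi-turn conversations, and its model card builds the input by ending each turn with the tokenizer's end-of-sequence token (`<|endoftext|>` for GPT-2 style tokenizers). The sketch below formats a conversation that way as a plain string. The helper name is hypothetical, and whether this improves replies in the `text-generation` pipeline is something to test rather than assume.

```python
EOS = "<|endoftext|>"  # GPT-2 end-of-sequence token, used by DialoGPT as a turn separator

def format_dialogue(turns):
    """Join conversation turns, ending each one with the end-of-sequence token."""
    return "".join(turn.strip() + EOS for turn in turns)

history = [
    "How can students stay motivated while studying?",
    "Set small goals and take breaks.",   # made-up earlier reply
    "What if I still get distracted?",
]
print(format_dialogue(history))
```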


In [None]:
# 💡 CHALLENGE 3 – Solution
# Choosing the best model for a "Study Helper" bot
study_bot = pipeline("text2text-generation", model="google/flan-t5-small")
response = study_bot("Explain gravity like I'm 10 years old.")
print(response[0]['generated_text'])


🧠 Instruction-following models like FLAN are ideal for teaching or explaining concepts.  
This is how “study bots” or “assistant bots” are usually built.


## 🧩 Concept 4: Connecting to a Streamlit App

**Recap:**  
Streamlit turns your chatbot logic into an interactive web app —  
letting users type prompts and see real-time answers.


In [None]:
# ✅ Example Streamlit Snippet (for reference; save to a file such as app.py and run `streamlit run app.py`)
"""
import streamlit as st
from transformers import pipeline

st.title("Mini Chatbot Demo")

@st.cache_resource  # load the model once instead of on every rerun
def load_generator():
    return pipeline("text2text-generation", model="google/flan-t5-small")

generator = load_generator()

user_input = st.text_input("Ask me something:")
if user_input:
    response = generator(user_input, max_new_tokens=60)
    st.write("Bot:", response[0]['generated_text'])
"""


💡 Streamlit is not just for chatbots — it’s used widely for quick AI demos, dashboards, and prototypes.  
You’ll build your own app version in the next exercises.
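
A pattern that keeps Streamlit apps easy to test is to put the chatbot logic in a plain function and let the UI code only call it. The sketch below assumes that structure. `answer` and `fake_generator` are hypothetical names, and the fake generator stands in for a real pipeline so the function can be tried without loading a model.

```python
def answer(generator, user_input, max_new_tokens=60):
    """Run the generator on cleaned-up input and return just the reply text."""
    text = user_input.strip()
    if not text:
        return "Please type a question."
    response = generator(text, max_new_tokens=max_new_tokens)
    return response[0]["generated_text"]

def fake_generator(prompt, max_new_tokens=60):
    """Stand-in that mimics the pipeline's return shape (a list of dicts)."""
    return [{"generated_text": f"(echo) {prompt}"}]

print(answer(fake_generator, "  What is gravity?  "))
print(answer(fake_generator, "   "))
```

In the app, `st.write("Bot:", answer(generator, user_input))` could then replace the two lines that call the generator directly.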


In [None]:
# 🧠 TASK 4 – Example Solution (builds on the Streamlit snippet above)

"""
# Add a sidebar slider to control response length
# (arguments: label, min, max, default, step)
max_tokens = st.sidebar.slider("Max new tokens", 20, 200, 80, 10)

if user_input:
    response = generator(user_input, max_new_tokens=max_tokens)
    st.write("Bot:", response[0]["generated_text"])
"""
