# Session 9: From Single Models to Systems
## Chaining Models Together

**Session Length:** 2 hours

**Today's Mission:** Move beyond single models to build multi-model systems. Learn how to chain the output of one model into the input of another, understand error cascades, and start thinking like a system designer.

### Session Outline
| Time | Activity |
|------|----------|
| 0:00-0:05 | Review: What bias patterns did you find? |
| 0:05-0:30 | Part 1: What Is a Pipeline? |
| 0:30-1:00 | Part 2: Building a Multi-Model Pipeline |
| 1:00-1:40 | Part 3: Design Your Own Pipeline |
| 1:40-2:00 | On Your Own: Build or modify a pipeline |

### Key Vocabulary
| Term | Definition |
|------|-----------|
| Pipeline | A chain of models in which the output of one feeds into the next |
| Error Cascade | When one model's mistake makes every downstream model worse |
| System Design | Planning how multiple components work together |
| Multi-Model System | A tool that uses more than one AI model |
| Dependencies | When one part depends on another part working correctly |

---

## Review: What Bias Patterns Did You Find? (0:00-0:05)

Last session we investigated bias in AI models -- how training data creates blind spots, how paired-sentence tests reveal unequal treatment, and why this matters for real people. We also explored uncertainty and the difference between confidence and correctness.

Today we shift gears. So far you have used **single models**: one input, one model, one output. Many real AI systems do not work like that. They chain multiple models together, so the output of one becomes the input of the next. This is powerful, but it introduces a critical new problem: a mistake early in the chain can spread to everything that comes after it.

---

## Setup

Run this cell to install the libraries we need.

In [None]:
!pip install transformers==4.47.1 pillow requests gradio -q


### Important: Restart Your Runtime

After installing packages, you need to restart the runtime so Python can find them.

**Go to: Runtime > Restart runtime**

After restarting, come back here and continue running the cells below. You do NOT need to re-run the install cell -- the packages are already installed. Just start from the next code cell.

---

In [None]:
from transformers import pipeline
from PIL import Image
import requests
from io import BytesIO

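def load_image(url):
    # Fail loudly on a bad URL instead of handing an error page to PIL
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return Image.open(BytesIO(response.content)).convert("RGB")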

print("Ready!")

---

## Part 1: What Is a Pipeline? (0:05-0:30)

So far you have used single models: one input goes in, one output comes out.

```
Text --> [Sentiment Model] --> "POSITIVE (92%)"
```

Real AI systems combine multiple models. The simplest way is to run several models on the same input and collect their answers in one function:

```
Text --> [Emotion Model] --> emotions
     --> [Topic Classifier] --> topic
     --> [NER Model] --> key people and places
     --> [Summarizer] --> summary
     --> [Sentiment Model] --> overall sentiment
```

Five models, one function. Each answers a different question about the same text. This is a **multi-model system** -- and it is how many real AI applications work.

### Building a Text Analysis Pipeline

Let's build a comprehensive text analysis function that uses five different models on the same input.

In [None]:
# Load all the models we need
print("Loading models (this may take a minute)...")

emotions = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
    top_k=None
)
classifier = pipeline("zero-shot-classification")
ner = pipeline("ner", aggregation_strategy="simple")  # replaces the deprecated grouped_entities=True
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
sentiment = pipeline("sentiment-analysis")

print("All 5 models loaded!")

In [None]:
def comprehensive_text_analysis(text):
    """
    Perform multiple analyses on a piece of text using 5 models.
    """
    print("COMPREHENSIVE TEXT ANALYSIS")
    print("=" * 60)
    print(f"Analyzing text ({len(text)} characters)...\n")

    # 1. Emotional Tone
    print("1. EMOTIONAL TONE")
    emotion_result = emotions(text[:500])[0]
    top_emotions = sorted(emotion_result, key=lambda x: x['score'], reverse=True)[:3]
    for e in top_emotions:
        bar = "*" * int(e['score'] * 20)
        print(f"   {e['label']:10} {bar} {e['score']:.1%}")

    # 2. Topic Classification
    print("\n2. LIKELY TOPIC")
    topics = ["technology", "politics", "sports", "entertainment", "science", "business", "personal"]
    topic_result = classifier(text[:500], topics)
    print(f"   Primary topic: {topic_result['labels'][0]} ({topic_result['scores'][0]:.1%})")

    # 3. Key Entities
    print("\n3. KEY ENTITIES")
    entity_result = ner(text)
    if entity_result:
        for e in entity_result[:5]:
            print(f"   {e['word']}: {e['entity_group']}")
    else:
        print("   No named entities found")

    # 4. Summary
    print("\n4. SUMMARY")
    if len(text.split()) > 30:
        summary = summarizer(text, max_length=60, min_length=20)[0]['summary_text']
        print(f"   {summary}")
    else:
        print(f"   (Text too short to summarize)")

    # 5. Overall Sentiment
    print("\n5. OVERALL SENTIMENT")
    sent_result = sentiment(text[:500])[0]
    print(f"   {sent_result['label']} ({sent_result['score']:.1%})")

    print("\n" + "=" * 60)
    print("Five models, one function. Each answered a different question.")

> **INSTRUCTOR NOTE:** Before running the next cell, explain: "This function runs 5 different AI models on the same text. Watch how each model extracts different information from the same input."

In [None]:
test_text = """
The breakthrough discovery in quantum computing announced yesterday by researchers
at MIT has sent shockwaves through the tech industry. Dr. Sarah Chen and her team
demonstrated a new method for maintaining quantum coherence at room temperature,
potentially solving one of the biggest obstacles to practical quantum computers.
Google and IBM stocks rose sharply on the news, while investors scrambled to
understand the implications. Critics caution that scaling the technology remains
a challenge, but optimists believe commercial quantum computers could arrive
within five years.
"""

comprehensive_text_analysis(test_text)

Notice how each model contributes something different:
- The **emotion model** detects the emotional tone
- The **topic classifier** identifies what the text is about
- The **NER model** finds specific people, organizations, and places
- The **summarizer** compresses the key information
- The **sentiment model** gives an overall positive/negative reading

No single model could do all of this. The power comes from **combining** them.

### Student Test

> **INSTRUCTOR NOTE:** Ask students for a text to analyze -- a news paragraph, a social media post, or something from a class they are taking. Type it in and run the analysis.

In [None]:
student_text = "REPLACE WITH STUDENT SUGGESTION"

comprehensive_text_analysis(student_text)

---

## Part 2: Building a Multi-Model Pipeline (0:30-1:00)

The text analysis above runs five models **independently** on the same text. That is useful but it is not a true pipeline. In a true pipeline, the **output of one model becomes the input of the next**.
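The difference is easiest to see with toy functions. The sketch below loads no models at all: `count_words`, `shout`, and `first_word` are hypothetical stand-ins that only mimic the shape of a model call (text in, result out).

```python
# Toy stand-ins for models (hypothetical; no real model is loaded).
def count_words(text):
    return len(text.split())

def shout(text):
    return text.upper()

def first_word(text):
    return text.split()[0]

text = "pipelines chain models together"

# Independent: every function sees the ORIGINAL text.
independent = {
    "count_words": count_words(text),
    "shout": shout(text),
    "first_word": first_word(text),
}

# Chained (a true pipeline): each function sees the PREVIOUS output.
chained = first_word(shout(text))

print("Independent:", independent)
print("Chained:", chained)
```

In the independent version, `first_word` returns the lowercase word because it reads the original text. In the chained version it returns the uppercase word, because it only ever sees what `shout` produced. If `shout` had made a mistake, `first_word` would inherit it.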

Let's build one: Image --> Caption --> Mood --> Story.

### The Image-to-Story Pipeline

### Restart Your Runtime First

Before loading image models, restart to free memory from the text models.

**Go to: Runtime > Restart runtime**

Then re-run the setup cell (imports and `load_image` helper) and continue here.

---

In [None]:
from transformers import pipeline
from PIL import Image
import requests
from io import BytesIO

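def load_image(url):
    # Fail loudly on a bad URL instead of handing an error page to PIL
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return Image.open(BytesIO(response.content)).convert("RGB")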

# Load image + text models for the pipeline
print("Loading pipeline models...")
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
image_classifier = pipeline("zero-shot-image-classification", model="openai/clip-vit-base-patch32")
generator = pipeline("text-generation", model="distilgpt2")
print("Pipeline models loaded!")

In [None]:
def image_to_story(image_url, genre='fantasy'):
    """
    A true pipeline: Image --> Caption --> Mood --> Story.
    The captioner and the mood classifier both read the image; the
    generator then builds the story from both of their outputs.
    """
    print("IMAGE-TO-STORY PIPELINE")
    print("=" * 55)

    # Step 1: Load and show the image
    print("\nStep 1: Loading image...")
    image = load_image(image_url)
    display(image.resize((300, 300)))

    # Step 2: CAPTIONER generates a text description
    print("\nStep 2: Captioning image...")
    caption = captioner(image)[0]['generated_text']
    print(f"   Caption: {caption}")

    # Step 3: IMAGE CLASSIFIER detects the mood (uses the image)
    print("\nStep 3: Detecting mood...")
    moods = ['mysterious', 'cheerful', 'dramatic', 'peaceful', 'adventurous']
    mood_result = image_classifier(image, candidate_labels=moods)
    image_mood = mood_result[0]['label']
    print(f"   Mood: {image_mood}")

    # Step 4: GENERATOR creates a story from caption + mood + genre
    print("\nStep 4: Generating story...")
    story_prompts = {
        'fantasy': f"In a magical world, there was {caption}. The mood was {image_mood}. One day,",
        'mystery': f"The detective studied the scene: {caption}. Something {image_mood} hung in the air. Then,",
        'scifi': f"In the year 2150, {caption}. The atmosphere was {image_mood}. The AI calculated that",
    }
    prompt = story_prompts.get(genre, story_prompts['fantasy'])

    result = generator(prompt, max_length=120, do_sample=True, temperature=0.9)
    story = result[0]['generated_text']

    print(f"\n--- {genre.upper()} STORY ---")
    print(story)

    print("\n" + "=" * 55)
    print("Pipeline: Image --> Caption --> Mood --> Story")
    print("The story depended on both the caption and the mood.")

    return caption, image_mood, story

> **INSTRUCTOR NOTE:** Ask students for an image URL (or use the ones below). Run the pipeline. Then ask: "What would happen if the captioner got the image wrong?"

In [None]:
# Try the pipeline with a sample image
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Cat_November_2010-1a.jpg/1200px-Cat_November_2010-1a.jpg"

caption, mood, story = image_to_story(image_url, genre='mystery')

### CLIP Label-Set Comparison

Zero-shot image classification is powerful because **you define the labels**.

The same image can produce different outputs depending on the categories you provide.


In [None]:
# Compare the same image with two different label sets
comparison_image = load_image(image_url)

label_sets = {
    "Set A (object types)": ["cat", "dog", "bird", "rabbit"],
    "Set B (scene mood)": ["peaceful", "chaotic", "playful", "mysterious"],
}

print("CLIP LABEL-SET COMPARISON")
print("=" * 60)
for name, labels in label_sets.items():
    predictions = image_classifier(comparison_image, candidate_labels=labels)
    print(f"\n{name}: {labels}")
    for pred in predictions[:3]:
        print(f"  - {pred['label']}: {pred['score']:.1%}")
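
Why do the labels matter so much? For CLIP, the zero-shot pipeline turns its similarity scores into percentages with a softmax over only the labels you supplied. The scores in a set therefore always add up to 100%, even when none of the labels fit the image. The sketch below imitates that step in plain Python. The similarity numbers are made up for illustration; they are not real CLIP outputs.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into probabilities that sum to 1."""
    biggest = max(scores.values())
    exps = {label: math.exp(s - biggest) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Made-up similarity scores for ONE cat photo (illustrative only).
good_fit = softmax({"cat": 25.0, "dog": 21.0})
poor_fit = softmax({"rocket": 12.0, "submarine": 10.0})

for name, result in [("Labels that fit", good_fit), ("Labels that do not fit", poor_fit)]:
    print(name)
    for label, prob in result.items():
        print(f"  - {label}: {prob:.1%}")
```

Notice that "rocket" still gets a high percentage in the second set. The model is not saying the photo shows a rocket; it is saying "rocket" is a better match than "submarine". A high score only means "best of the options you gave me".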


> **INSTRUCTOR NOTE:** Ask students for two custom label sets for the same image, then compare how the model behavior changes.


In [None]:
# Student-defined CLIP label sets
student_labels_a = ["REPLACE", "WITH", "LABELS"]
student_labels_b = ["REPLACE", "WITH", "LABELS"]

if "REPLACE" not in student_labels_a and "REPLACE" not in student_labels_b:
    print("Set A results:")
    for pred in image_classifier(comparison_image, candidate_labels=student_labels_a)[:3]:
        print(f"  - {pred['label']}: {pred['score']:.1%}")

    print("\nSet B results:")
    for pred in image_classifier(comparison_image, candidate_labels=student_labels_b)[:3]:
        print(f"  - {pred['label']}: {pred['score']:.1%}")


### Error Cascades: When One Mistake Ruins Everything

In the pipeline above, the caption feeds into the story. What happens if the caption is wrong? Let's find out by deliberately overriding the caption with a wrong description.

In [None]:
def image_to_story_with_override(image_url, fake_caption, genre='fantasy'):
    """
    Same pipeline, but we override the caption to simulate an error.
    This shows how one mistake cascades through the system.
    """
    print("ERROR CASCADE DEMO")
    print("=" * 55)

    # Step 1: Load image
    image = load_image(image_url)
    display(image.resize((300, 300)))

    # Step 2: Use the WRONG caption (simulating a captioner error)
    print(f"\nReal image: (see above)")
    print(f"WRONG caption: {fake_caption}")

    # Step 3: Mood still comes from the image (correct)
    moods = ['mysterious', 'cheerful', 'dramatic', 'peaceful', 'adventurous']
    mood_result = image_classifier(image, candidate_labels=moods)
    image_mood = mood_result[0]['label']
    print(f"Mood: {image_mood}")

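    # Step 4: Story is built from the WRONG caption, using the chosen genre
    story_prompts = {
        'fantasy': f"In a magical world, there was {fake_caption}. The mood was {image_mood}. One day,",
        'mystery': f"The detective studied the scene: {fake_caption}. Something {image_mood} hung in the air. Then,",
        'scifi': f"In the year 2150, {fake_caption}. The atmosphere was {image_mood}. The AI calculated that",
    }
    prompt = story_prompts.get(genre, story_prompts['fantasy'])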
    result = generator(prompt, max_length=120, do_sample=True, temperature=0.9)
    story = result[0]['generated_text']

    print(f"\n--- STORY (built from wrong caption) ---")
    print(story)

    print("\n" + "=" * 55)
    print("The story is about something completely different from the image!")
    print("This is an ERROR CASCADE -- one mistake affected everything downstream.")

In [None]:
# The image is a cat, but we tell the pipeline it's something else
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Cat_November_2010-1a.jpg/1200px-Cat_November_2010-1a.jpg"

image_to_story_with_override(
    image_url,
    fake_caption="a rocket launching into space from a desert launchpad"
)

The image shows a cat, but because we fed a wrong caption ("a rocket launching into space"), the entire story is about rockets and space -- nothing to do with the actual image.

**In a real system** -- say, an accessibility tool that describes images for blind users -- a wrong caption means a wrong description. One mistake affects everything downstream. This is why **system design** matters, not just individual model accuracy.
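
A quick back-of-the-envelope calculation shows why a chain can be weaker than any single model in it. The sketch below makes two simplifying assumptions that will not hold exactly in practice: each step succeeds or fails independently of the others, and any single mistake ruins the final result. The 90% figure is a hypothetical number, not a measurement of any model used today.

```python
def chain_success_rate(step_accuracies):
    """Chance that EVERY step is right, assuming independent steps."""
    rate = 1.0
    for accuracy in step_accuracies:
        rate *= accuracy
    return rate

# Hypothetical: every step in the chain is right 90% of the time.
for number_of_steps in [1, 3, 5]:
    rate = chain_success_rate([0.9] * number_of_steps)
    print(f"{number_of_steps} step(s): {rate:.1%} chance the whole chain is right")
```

Under these assumptions, three 90% steps give roughly a 73% chain and five give roughly 59%. Adding steps adds capability, but it also adds places to fail.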

### Student Pipeline Test

> **INSTRUCTOR NOTE:** Have students suggest an image URL and a genre. Run the pipeline. Then ask them to suggest a deliberately wrong caption and see how the story changes.

In [None]:
student_image_url = "REPLACE WITH IMAGE URL"
student_genre = "fantasy"  # Try: fantasy, mystery, or scifi

if "REPLACE" not in student_image_url:
    caption, mood, story = image_to_story(student_image_url, genre=student_genre)

> **ASK AI ABOUT THIS**
>
> Copy the `image_to_story` function into Claude or ChatGPT and ask:
>
> *"If the captioner said 'a dog playing fetch' but the image actually showed a cat sleeping, how would that affect each step of this pipeline? Walk me through the error cascade."*
>
> This is how real programmers learn -- by asking questions about code they encounter.

---

## Part 3: Design Your Own Pipeline (1:00-1:40)

Now that you understand how pipelines work -- and how they can fail -- let's think about designing new ones.

### Pipeline Design Exercise

For each scenario below, think about:
1. What models would you need?
2. What order would they run in?
3. Where could errors cascade?
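
One way to answer the third question is to write the design down as a map of "who feeds whom" and then trace what sits downstream of each step. This is only a sketch: the step names in `feeds_into` are hypothetical labels for the image-to-story pipeline, and `downstream_of` is a helper written for this example, not part of any library.

```python
# Hypothetical design map: each step lists the steps that use its output.
feeds_into = {
    "image": ["captioner", "mood_detector"],
    "captioner": ["generator"],
    "mood_detector": ["generator"],
    "generator": [],
}

def downstream_of(step, graph):
    """Return every step that could be affected if `step` goes wrong."""
    affected = []
    to_visit = list(graph[step])
    while to_visit:
        current = to_visit.pop(0)
        if current not in affected:
            affected.append(current)
            to_visit.extend(graph[current])
    return affected

for step in feeds_into:
    print(f"If {step} is wrong, it can affect: {downstream_of(step, feeds_into)}")
```

Steps with the longest "can affect" list are the ones where a mistake does the most damage, so they are usually the best places to add a check.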

### Scenario A: News Digest Tool
**Goal:** Take a news article, find the key people, analyze sentiment about each person, and summarize.

```
Article --> [NER] --> people mentioned
        --> [Sentiment] --> sentiment per sentence
        --> [Summarizer] --> summary
Combine: Who was mentioned and how the article feels about them
```
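
The "Combine" step is the part that no single model does for you. Here is a minimal sketch of it, using hypothetical stand-ins (`fake_ner`, `fake_sentiment`, and the small word lists) in place of real models so it runs instantly. The names and the article text are invented for illustration.

```python
# Hypothetical stand-ins so the sketch runs without downloading models.
KNOWN_PEOPLE = ["Sarah Chen", "Alex Rivera"]
POSITIVE_WORDS = {"praised", "breakthrough", "won"}
NEGATIVE_WORDS = {"criticized", "failed", "delayed"}

def fake_ner(sentence):
    """Pretend NER: return the known names that appear in the sentence."""
    return [name for name in KNOWN_PEOPLE if name in sentence]

def fake_sentiment(sentence):
    """Pretend sentiment model: compare positive and negative keywords."""
    words = set(sentence.lower().replace(".", "").split())
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"

def sentiment_per_person(article):
    """Combine step: attach each sentence's sentiment to the people in it."""
    results = {}
    for sentence in article.split(". "):
        label = fake_sentiment(sentence)
        for person in fake_ner(sentence):
            results.setdefault(person, []).append(label)
    return results

article = (
    "Sarah Chen was praised for the breakthrough. "
    "Alex Rivera criticized the delayed rollout. "
    "Sarah Chen failed to comment."
)
print(sentiment_per_person(article))
```

Look at where an error could cascade: if the NER step misses a name, the sentiment for that sentence is never attached to anyone, and the final digest silently leaves that person out.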

### Scenario B: Social Media Post Generator
**Goal:** Take an image, caption it, detect the mood, and generate a social media post.

```
Image --> [Captioner] --> description
      --> [Mood Detector] --> mood
      --> [Generator] --> social media post matching the mood
```

### Scenario C: Study Assistant
**Goal:** Take student homework, classify the subject, estimate difficulty, and suggest study approach.

```
Homework text --> [Topic Classifier] --> subject
              --> [Difficulty Classifier] --> easy/medium/hard
              --> [Summarizer] --> key concepts
Combine: "This is a [difficulty] [subject] assignment about [concepts]"
```

### Your Pipeline Design

> **INSTRUCTOR NOTE:** Have each student (or the group together) pick one scenario and describe the pipeline verbally. Diagram it on screen. They do NOT need to code it -- the goal is system design thinking. Ask: "Where could an error in step 1 ruin step 3?"

In [None]:
# If students want to try building Scenario A, here's a starter:
# (This is optional -- the main goal is the design exercise above)

# Uncomment and modify to try:
# from transformers import pipeline
#
# ner = pipeline("ner", aggregation_strategy="simple")
# sentiment = pipeline("sentiment-analysis")
# summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
#
# def news_digest(article):
#     print("NEWS DIGEST PIPELINE")
#     print("=" * 50)
#
#     # Step 1: Find key entities
#     entities = ner(article)
#     people = [e['word'] for e in entities if e['entity_group'] == 'PER']
#     print(f"People mentioned: {people}")
#
#     # Step 2: Overall sentiment
#     sent = sentiment(article[:500])[0]
#     print(f"Overall tone: {sent['label']} ({sent['score']:.1%})")
#
#     # Step 3: Summary
#     if len(article.split()) > 30:
#         summary = summarizer(article, max_length=60, min_length=20)[0]['summary_text']
#         print(f"Summary: {summary}")
#
# # Test it:
# news_digest("Your article text here...")

print("(Uncomment the code above to try the news digest pipeline)")

> **ASK AI ABOUT THIS**
>
> Describe one of the pipeline scenarios to Claude or ChatGPT and ask:
>
> *"I want to build an AI pipeline that [describe your scenario]. What models would I need, what order should they run in, and where could errors cascade?"*
>
> This is how real programmers learn -- by asking questions about code they encounter.

---

## On Your Own (1:40-2:00)

### Experiment 1: Modify the Image-to-Story Pipeline

Add a step to the pipeline. Ideas:
- Add sentiment analysis of the generated story
- Add a "title generator" step
- Try different genres and compare the stories
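
If you plan to add several steps, one possible approach is to store the pipeline as a list of steps, so that adding a step means adding one line. The sketch below is an assumption about how you might organize it, not how `image_to_story` is written. `run_pipeline` and the `fake_*` functions are hypothetical stand-ins; you would swap in calls to the real models.

```python
def run_pipeline(data, steps):
    """Run each (name, function) step in order.

    Every step receives a dict of everything produced so far
    and returns the new value to store under its name.
    """
    results = dict(data)
    for name, step in steps:
        results[name] = step(results)
    return results

# Hypothetical stand-ins for the real models.
def fake_caption(results):
    return "a cat sitting on a wall"

def fake_mood(results):
    return "peaceful"

def fake_story(results):
    return f"Once, {results['caption']} felt {results['mood']}."

def fake_title(results):
    return "The Tale of " + results["caption"]

steps = [
    ("caption", fake_caption),
    ("mood", fake_mood),
    ("story", fake_story),
    ("title", fake_title),  # the new step: one extra line
]

results = run_pipeline({"image": "cat.jpg"}, steps)
for name, value in results.items():
    print(f"{name}: {value}")
```

Because every step reads from the shared `results` dict, it is also easy to see which earlier outputs a new step depends on.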

In [None]:
# Your modified pipeline
# Start from the image_to_story function and add your own steps

# Example: Run the pipeline on a new image with all three genres
test_url = "REPLACE WITH AN IMAGE URL"

if "REPLACE" not in test_url:
    for genre in ['fantasy', 'mystery', 'scifi']:
        print("\n" + "#" * 60 + "\n")
        image_to_story(test_url, genre=genre)

### Experiment 2: Design a Pipeline on Paper

Sketch a pipeline for something YOU care about. Write it out:

**My Pipeline Idea:**

**Step 1:** Input is ___________

**Step 2:** Model _________ produces _________

**Step 3:** Model _________ takes that and produces _________

**Step 4:** Model _________ takes that and produces _________

**Final Output:** _________

**Where could errors cascade?** _________

### Experiment 3: Error Cascade Exploration

Try the error cascade demo with different wrong captions. How wrong does the caption need to be before the story becomes completely unrelated to the image?
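
To put a rough number on "how wrong", you could compare each wrong caption with the real one by counting shared words. This is a crude measure (it ignores meaning and word order), and `real_caption` below is a hypothetical example rather than actual captioner output. Replace it with the caption your pipeline printed.

```python
def word_overlap(a, b):
    """Shared words divided by all distinct words (Jaccard overlap)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)

real_caption = "a cat sitting on a couch"  # hypothetical; use your real caption

candidate_captions = [
    "a small dog sleeping on a couch",
    "a person reading a book in a library",
    "an explosion in a factory",
]

for candidate in candidate_captions:
    score = word_overlap(real_caption, candidate)
    print(f"{score:.2f}  {candidate}")
```

A score near 1 means the captions share most of their words; a score near 0 means they share almost none. Compare these scores with how unrelated each generated story feels.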

In [None]:
# Try different wrong captions
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Cat_November_2010-1a.jpg/1200px-Cat_November_2010-1a.jpg"

wrong_captions = [
    "a small dog sleeping on a couch",      # Close but wrong animal
    "a person reading a book in a library",  # Completely different scene
    "an explosion in a factory",             # Wildly different
]

for caption in wrong_captions:
    print("\n" + "#" * 60)
    image_to_story_with_override(image_url, caption)

### Bonus: Pipeline App

In Session 3 you wrapped one model in Gradio. In Session 6 you compared three models side by side. Now let's wrap an entire **multi-model pipeline** into an app.

Upload any image, pick a genre, and the pipeline runs all three steps (caption, mood, story) behind the scenes.


In [None]:
import gradio as gr

def gradio_image_to_story(image, genre):
    if image is None:
        return "Upload an image to get started."

    # Step 1: Caption
    caption = captioner(image)[0]['generated_text']

    # Step 2: Mood
    moods = ['mysterious', 'cheerful', 'dramatic', 'peaceful', 'adventurous']
    mood_result = image_classifier(image, candidate_labels=moods)
    image_mood = mood_result[0]['label']

    # Step 3: Story
    story_prompts = {
        'fantasy': f"In a magical world, there was {caption}. The mood was {image_mood}. One day,",
        'mystery': f"The detective studied the scene: {caption}. Something {image_mood} hung in the air. Then,",
        'scifi': f"In the year 2150, {caption}. The atmosphere was {image_mood}. The AI calculated that",
    }
    prompt = story_prompts.get(genre, story_prompts['fantasy'])
    result = generator(prompt, max_length=120, do_sample=True, temperature=0.9)
    story = result[0]['generated_text']

    return f"Caption: {caption}\nMood: {image_mood}\n\n--- {genre.upper()} STORY ---\n{story}"

demo = gr.Interface(
    fn=gradio_image_to_story,
    inputs=[
        gr.Image(label="Upload an image", type="pil"),
        gr.Dropdown(choices=["fantasy", "mystery", "scifi"], value="fantasy", label="Genre"),
    ],
    outputs=gr.Textbox(label="Generated Story", lines=8),
    title="Image-to-Story Generator",
    description="Upload any image. AI will caption it, detect the mood, and write a story.",
    allow_flagging="never",
)

demo.launch(share=True)


> **INSTRUCTOR NOTE:** This is the payoff for the Gradio progression: Session 3 (one model), Session 6 (three models side by side), now a full pipeline with image upload and a dropdown. Students can drag-and-drop photos from their phone. The shareable link previews what their Session 11-12 project could look like. Stop the demo by clicking the stop button or restarting the runtime when done.

---



### Checklist: Before You Leave

- [ ] Ran the comprehensive text analysis (5 models on one input)
- [ ] Built and ran the image-to-story pipeline
- [ ] Saw how a wrong caption cascades into a wrong story
- [ ] Designed a pipeline on paper (identified models, order, and error points)
- [ ] Discussed where errors cascade in multi-model systems
- [ ] Saved your work (File > Save a copy in Drive)

---

## Looking Ahead

Next session, we bring everything together. You know what models can do (Sessions 2-3). You know their limits (Sessions 6-8). You know how to chain them (today). The two topics are **prompt engineering** -- the art of controlling models through careful input design -- and **project planning**. By the end of Session 10, you will have a project idea and a plan to build it.

See you next session.

---

*Youth Horizons AI Researcher Program - Level 2*