# Week 7: Image Models
## Teaching AI to See

**Today's Goals:**
1. Understand how AI "sees" images
2. Run image classification models
3. Upload and classify your own images
4. Understand what makes vision models work

---

## Part 1: How Do AI Models "See"?

Images are just numbers!

- A pixel has RGB values (Red, Green, Blue): 0-255 each
- A 224×224 image = 224 × 224 × 3 = **150,528 numbers**
- The model learns patterns in these numbers

```
Image → Pixels (numbers) → Model → Prediction
```
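
To make the arithmetic concrete, here is a minimal sketch in plain Python. The tiny 2×2 image and the helper name `count_values` are made up for illustration; real images are loaded with a library such as Pillow.

```python
# A tiny, made-up 2x2 image: each pixel is an (R, G, B) tuple with values 0-255.
tiny_image = [
    [(255, 0, 0), (0, 255, 0)],      # row 0: red, green
    [(0, 0, 255), (255, 255, 255)],  # row 1: blue, white
]

def count_values(width, height, channels=3):
    """Return how many numbers describe an image of the given size."""
    return width * height * channels

# Flatten the nested rows into one long list of numbers.
flat = [value for row in tiny_image for pixel in row for value in pixel]

print(len(flat))               # 12 numbers for the 2x2 image
print(count_values(224, 224))  # 150528 numbers for a 224x224 RGB image
```

That long flat list of numbers is essentially what the model receives as input.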

## Setup

In [None]:
# Install required libraries
!pip install transformers -q
!pip install torch torchvision -q
!pip install Pillow -q
!pip install requests -q

print("Libraries installed!")

In [None]:
# Import libraries
from transformers import pipeline
from PIL import Image
import requests
from io import BytesIO

# Helper function to load images from URL
def load_image_from_url(url):
    """Load an image from a URL."""
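    # Some hosts reject requests without a User-Agent (the name used here is arbitrary);
    # the timeout stops the cell from hanging forever on a slow connection.
    response = requests.get(url, headers={"User-Agent": "image-models-notebook"}, timeout=30)
    response.raise_for_status()  # Stop early on errors such as 404
    image = Image.open(BytesIO(response.content))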
    return image

print("Ready to classify images!")

---
## Part 2: Your First Image Classification

Let's classify an image using the pipeline (easy way):

In [None]:
# Create an image classification pipeline (no model is named, so a default one is downloaded)
classifier = pipeline("image-classification")

print("Image classifier loaded!")

In [None]:
# Load a sample image (a cat!)
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
image = load_image_from_url(image_url)

# Display the image
display(image.resize((300, 300)))  # Resize for display

In [None]:
# Classify the image!
results = classifier(image)

print("What the AI sees:\n")
for result in results:
    print(f"  {result['label']}: {result['score']:.1%}")

### Understanding the Output:

The model gives its top 5 guesses:
- `label`: What it thinks the image shows
- `score`: How confident it is (a number from 0 to 1, printed here as a percentage)

Notice the model knows specific cat breeds!
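
The pipeline returns a plain Python list of dictionaries, so the usual list tools apply. The sketch below uses hand-written example values (not real model output) and a hypothetical helper, `confident_labels`, to keep only the guesses above a chosen threshold.

```python
# Hand-written example in the same shape the pipeline returns.
# The labels and scores here are illustrative, not real model output.
example_results = [
    {"label": "tabby cat", "score": 0.62},
    {"label": "tiger cat", "score": 0.25},
    {"label": "Egyptian cat", "score": 0.08},
]

def confident_labels(results, threshold=0.2):
    """Return the labels whose score is at least `threshold`."""
    return [r["label"] for r in results if r["score"] >= threshold]

# Pick the single best guess by score.
best = max(example_results, key=lambda r: r["score"])
print(best["label"])                      # tabby cat
print(confident_labels(example_results))  # ['tabby cat', 'tiger cat']
```

In the notebook you could pass the real `results` list to `confident_labels` instead of the example values.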

---
## Part 3: Try Different Images

Let's test with various images:

In [None]:
# A collection of test images
test_images = {
    "Dog": "https://upload.wikimedia.org/wikipedia/commons/thumb/2/26/YellowLabradorLooking_new.jpg/1200px-YellowLabradorLooking_new.jpg",
    "Car": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1b/2019_Honda_Civic_sedan_%28facelift%29%2C_front_11.29.19.jpg/1200px-2019_Honda_Civic_sedan_%28facelift%29%2C_front_11.29.19.jpg",
    "Pizza": "https://upload.wikimedia.org/wikipedia/commons/thumb/a/a3/Eq_it-na_pizza-margherita_sep2005_sml.jpg/800px-Eq_it-na_pizza-margherita_sep2005_sml.jpg",
    "Laptop": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/d9/Laptop-and-hands.jpg/1200px-Laptop-and-hands.jpg"
}

for name, url in test_images.items():
    print(f"\n{'='*40}")
    print(f"Testing: {name}")
    print(f"{'='*40}")
    
    try:
        image = load_image_from_url(url)
        display(image.resize((200, 200)))
        
        results = classifier(image)
        print("\nTop predictions:")
        for r in results[:3]:  # Top 3
            print(f"  {r['label']}: {r['score']:.1%}")
    except Exception as e:
        print(f"Error loading image: {e}")

---
## Part 4: Upload Your Own Image!

In Google Colab, you can upload images from your computer:

In [None]:
# Upload an image (works in Google Colab)
from google.colab import files

print("Click 'Choose Files' to upload an image...")
uploaded = files.upload()

# Get the filename
filename = list(uploaded.keys())[0]
print(f"\nUploaded: {filename}")

In [None]:
# Classify your uploaded image
my_image = Image.open(filename)

# Display it
display(my_image.resize((300, 300)))

# Classify it
results = classifier(my_image)

print("\nThe AI thinks this is:")
for r in results:
    print(f"  {r['label']}: {r['score']:.1%}")

---
## Part 5: Using a Different Model

Let's try a smaller, faster model that still works great:

In [None]:
# Load MobileViT - a small, mobile-friendly model
mobile_classifier = pipeline(
    "image-classification", 
    model="apple/mobilevit-small"
)

print("MobileViT loaded! (Apple's mobile-friendly vision model)")

In [None]:
# Compare the two models on the same image
test_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg"
image = load_image_from_url(test_url)

print("Comparing models on the same image:\n")
display(image.resize((200, 200)))

print("\n--- Default Model ---")
for r in classifier(image)[:3]:
    print(f"  {r['label']}: {r['score']:.1%}")

print("\n--- MobileViT (smaller, faster) ---")
for r in mobile_classifier(image)[:3]:
    print(f"  {r['label']}: {r['score']:.1%}")
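
To check the "smaller, faster" claim yourself, you could time each classifier. This is a rough sketch: `time_call` is a hypothetical helper, and the stand-in function below replaces the real model so the example runs anywhere. In the notebook you would pass `classifier` or `mobile_classifier` together with `image` instead.

```python
import time

def time_call(fn, *args, repeats=3):
    """Call fn(*args) `repeats` times and return the average seconds per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats

# Stand-in for a model so this sketch runs without downloading anything.
def fake_classifier(image):
    return [{"label": "example", "score": 1.0}]

seconds = time_call(fake_classifier, "not a real image")
print(f"Average time per call: {seconds:.6f} s")
```

Timings vary from run to run, so averaging over a few repeats gives a fairer comparison than a single call.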

---
## Part 6: Understanding What Models Know

These models were trained on **ImageNet**, a dataset with 1000 categories.

They know things like:
- Animals (dogs, cats, birds, etc.)
- Vehicles (cars, planes, boats)
- Food (pizza, banana, ice cream)
- Objects (laptop, phone, furniture)

They DON'T know:
- Specific people's faces
- Custom categories you might want
- Things not in ImageNet

**This is why fine-tuning exists!** (We'll learn this later)
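
One reason a model cannot answer "none of the above": classifiers like these typically turn their raw scores into probabilities with a softmax, which always adds up to 100% across the labels the model knows. Here is a minimal sketch with made-up raw scores and only three labels:

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "dog", "pizza"]  # the only categories this toy model knows
raw_scores = [0.1, 0.2, 0.15]     # made-up scores for an image of, say, a giraffe

probs = softmax(raw_scores)
for label, p in zip(labels, probs):
    print(f"  {label}: {p:.1%}")
print(f"Total: {sum(probs):.1%}")  # the probabilities always add up to 100%
```

Even for an image of something the model never saw, the probability is shared among the known labels, so one of them still comes out on top.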

---
## Part 7: Challenge - Trick the Model!

Can you find images that confuse the model?

Ideas to try:
- Unusual angles
- Drawings vs photos
- Multiple objects
- Optical illusions
- Things that look like other things
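
One way to spot a confused model is a small gap between its top two scores. The helper `top_two_margin` below is a hypothetical sketch, and the scores are made up for illustration.

```python
def top_two_margin(results):
    """Return the score gap between the best and second-best guesses."""
    scores = sorted((r["score"] for r in results), reverse=True)
    if len(scores) < 2:
        # With fewer than two guesses there is nothing to compare against.
        return scores[0] if scores else 0.0
    return scores[0] - scores[1]

# Made-up examples: one clear prediction, one confused prediction.
clear = [{"label": "pizza", "score": 0.95}, {"label": "plate", "score": 0.02}]
confused = [{"label": "Chihuahua", "score": 0.31}, {"label": "tabby cat", "score": 0.29}]

print(f"Clear margin:    {top_two_margin(clear):.2f}")
print(f"Confused margin: {top_two_margin(confused):.2f}")
```

A margin close to zero suggests the model is torn between two labels, which makes that image a good candidate for this challenge.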

In [None]:
# Try an ambiguous image - a cloud that looks like something?
# Or find your own tricky image!

tricky_url = "YOUR_IMAGE_URL_HERE"  # Replace with a URL!

# Uncomment when you have a URL:
# tricky_image = load_image_from_url(tricky_url)
# display(tricky_image.resize((300, 300)))
# results = classifier(tricky_image)
# for r in results:
#     print(f"  {r['label']}: {r['score']:.1%}")

---
## Part 8: Build an Image Analysis Tool

In [None]:
def analyze_image(image_source, show_image=True):
    """
    Analyze an image from URL or file path.
    
    Args:
        image_source: URL string or file path
        show_image: Whether to display the image
    """
    # Load image
    if image_source.startswith('http'):
        image = load_image_from_url(image_source)
    else:
        image = Image.open(image_source)
    
    # Display if requested
    if show_image:
        display(image.resize((300, 300)))
    
    # Classify
    results = classifier(image)
    
    # Print results nicely
    print("\n🔍 Image Analysis Results:")
    print("=" * 40)
    
    top_result = results[0]
    print(f"\nBest guess: {top_result['label']}")
    print(f"Confidence: {top_result['score']:.1%}")
    
    print("\nOther possibilities:")
    for r in results[1:]:
        print(f"  • {r['label']}: {r['score']:.1%}")
    
    return results

# Test it
analyze_image("https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg")
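
`analyze_image` decides between a URL and a file path with `startswith('http')`. A slightly stricter check, sketched here with the standard library's `urllib.parse`, is one alternative; `is_web_url` is a hypothetical helper, not part of the notebook.

```python
from urllib.parse import urlparse

def is_web_url(source):
    """Return True if `source` looks like an http(s) URL with a host name."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

print(is_web_url("https://example.com/cat.jpg"))  # True
print(is_web_url("httpcat.jpg"))                  # False: a file name that merely starts with "http"
print(is_web_url("photos/cat.jpg"))               # False: a local path
```

The simple `startswith` test would treat a local file named `httpcat.jpg` as a URL, which is the case this check guards against.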

---
## Discussion: How Did It Know?

Think about these questions:

1. **What features help identify a cat?**
   - Ears, whiskers, fur patterns, eye shape?

2. **Why might the model confuse similar things?**
   - A chihuahua might look like a cat to the model!

3. **What are the limitations?**
   - Only knows what it was trained on
   - Can be fooled by unusual images
   - Doesn't truly "understand" - just pattern matching

---

## Quick Reference

### Image Classification Pipeline:
```python
from transformers import pipeline
from PIL import Image

classifier = pipeline("image-classification")
image = Image.open("your_image.jpg")
results = classifier(image)
```

### Good Models for Free Colab:
- `google/vit-base-patch16-224` - Accurate, medium size
- `apple/mobilevit-small` - Small, fast, mobile-friendly
- `microsoft/resnet-50` - Classic, reliable

---
## Checklist: What You Learned Today

- [ ] How AI models "see" images (pixels → numbers)
- [ ] How to use image classification pipelines
- [ ] How to upload and classify your own images
- [ ] Different vision models and their trade-offs
- [ ] Limitations of pre-trained models

---

## Looking Ahead: Next Week

Next week we'll explore **text generation**:
- How GPT-style models work
- Generate text with small models
- Control generation with parameters

**Homework (optional):**
- Test the model on 10 different images
- Find images that confuse it
- Save your experiments to GitHub!

---

*Youth Horizons AI Researcher Program - Level 2*