# Session 8 - Stable Diffusion
## Text-to-Image Generation with Diffusion Models

### Learning Objectives:
- Understand Stable Diffusion architecture
- Generate images from text prompts
- Experiment with parameters
- Apply prompting best practices

In [None]:
!pip install -q diffusers transformers torch accelerate scipy ftfy pillow

## 1. Load Stable Diffusion

In [None]:
import torch
from diffusers import StableDiffusionPipeline
import matplotlib.pyplot as plt

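device = 'cuda' if torch.cuda.is_available() else 'cpu'
# float16 halves GPU memory use, but half precision is poorly supported on CPU,
# so fall back to float32 when no GPU is available
dtype = torch.float16 if device == 'cuda' else torch.float32
model_id = 'runwayml/stable-diffusion-v1-5'
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)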
pipe = pipe.to(device)
print(f'✓ Model loaded on {device}')

## 2. Architecture Inspection

In [None]:
print('=== TEXT ENCODER (CLIP) ===')
print(f'Model: {pipe.text_encoder.__class__.__name__}')
print(f'Parameters: {sum(p.numel() for p in pipe.text_encoder.parameters()):,}')
print()
print('=== UNET DENOISER ===')
print(f'Model: {pipe.unet.__class__.__name__}')
print(f'Parameters: {sum(p.numel() for p in pipe.unet.parameters()):,}')
print()
print('=== VAE ===')
print(f'Model: {pipe.vae.__class__.__name__}')
print(f'Parameters: {sum(p.numel() for p in pipe.vae.parameters()):,}')
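
The UNet does not denoise full-resolution pixels. It works on a smaller latent tensor that the VAE decodes into the final image. The sketch below works through that arithmetic. It assumes a spatial downsampling factor of 8 and 4 latent channels, the values commonly cited for Stable Diffusion v1.x. The helper names `latent_shape` and `compression_ratio` are illustrative and are not part of `diffusers`.

```python
def latent_shape(height, width, vae_scale_factor=8, latent_channels=4):
    """Return the (channels, height, width) of the latent tensor.

    Assumes the VAE downsamples each spatial dimension by `vae_scale_factor`
    (8 is the value usually quoted for Stable Diffusion v1.x).
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError('height and width must be multiples of the scale factor')
    return (latent_channels, height // vae_scale_factor, width // vae_scale_factor)


def compression_ratio(height, width, vae_scale_factor=8, latent_channels=4):
    """Ratio of RGB pixel values to latent values for one image."""
    c, h, w = latent_shape(height, width, vae_scale_factor, latent_channels)
    return (3 * height * width) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions, each denoising step processes 48 times fewer values than it would in pixel space, which is a large part of why latent diffusion can run on a single consumer GPU.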

## 3. Basic Generation

In [None]:
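prompt = 'a beautiful sunset over mountains, oil painting, 4k'
# A seeded generator makes the run reproducible on the same hardware
generator = torch.Generator(device=device).manual_seed(42)
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5, generator=generator).images[0]
print('✓ Image generated successfully!')
# Display the result (a PIL image) inline
plt.imshow(image)
plt.axis('off')
plt.show()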

## 4. Prompt Complexity Experiment

In [None]:
prompts = [
    'a cat',
    'a fluffy cat sitting',
    'a majestic orange cat with striking eyes, sitting on antique chair',
    'a professional portrait of orange tabby cat, highly detailed, 4k, oil painting'
]
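for idx, p in enumerate(prompts, 1):
    # Truncate long prompts for display; add an ellipsis only when text was cut
    preview = p if len(p) <= 50 else p[:50] + '...'
    print(f'{idx}. {preview}')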
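
To compare the prompts fairly, each one should start from the same initial noise, so that differences in the output come from the wording rather than from the random seed. The following is a minimal sketch of such a loop, not a definitive implementation. `run_prompt_experiment` and its `make_generator` argument are hypothetical names introduced here. The pipeline call mirrors the one used in the basic generation cell.

```python
def run_prompt_experiment(pipe, prompts, make_generator, seed=42,
                          num_inference_steps=50, guidance_scale=7.5):
    """Generate one image per prompt, reusing the same seed for every prompt.

    `make_generator` is a hypothetical callable that maps a seed to a fresh
    random generator accepted by the pipeline.
    """
    images = {}
    for prompt in prompts:
        # A fresh generator per prompt restarts the random stream, so every
        # prompt is denoised from the same initial noise.
        generator = make_generator(seed)
        result = pipe(prompt,
                      num_inference_steps=num_inference_steps,
                      guidance_scale=guidance_scale,
                      generator=generator)
        images[prompt] = result.images[0]
    return images
```

With the objects defined earlier in this notebook, a call could look like `images = run_prompt_experiment(pipe, prompts, lambda s: torch.Generator(device=device).manual_seed(s))`. Passing the generator factory in as an argument keeps the helper independent of `torch`, which also makes it easy to dry-run with a stub pipeline.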

## 5. Key Parameters

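| Parameter | Typical range | Sweet spot | Effect |
|-----------|---------------|------------|--------|
| guidance_scale | 3-12 | 7-8 | Prompt adherence: higher values follow the prompt more closely but reduce variety |
| num_inference_steps | 10-100 | 40-50 | Quality vs. speed: runtime grows roughly linearly with the step count |
| seed | Any integer, e.g. 0 to 2^32 - 1 | Any | Reproducibility: the same seed and settings give the same image on the same hardware and library versions |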
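
`guidance_scale` controls classifier-free guidance: at each denoising step the model predicts noise twice, once with the prompt and once with an empty prompt, and the final prediction is pushed away from the unconditional one toward the text-conditioned one. The toy sketch below shows that combination on plain Python lists standing in for tensors. `apply_guidance` is an illustrative name, not a `diffusers` function.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and text-conditioned noise predictions.

    Implements uncond + scale * (text - uncond) element-wise on plain lists,
    which stand in for the noise tensors of a real pipeline.
    """
    return [u + guidance_scale * (t - u)
            for u, t in zip(noise_uncond, noise_text)]


uncond = [0.0, 1.0]
text = [1.0, 3.0]

print(apply_guidance(uncond, text, 0.0))  # [0.0, 1.0]  (prompt ignored)
print(apply_guidance(uncond, text, 1.0))  # [1.0, 3.0]  (plain conditional prediction)
print(apply_guidance(uncond, text, 7.5))  # [7.5, 16.0] (extrapolated past the prompt)
```

A scale above 1 extrapolates beyond the text-conditioned prediction, which is why very high values can produce oversaturated or distorted images.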

## 6. Conclusions

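- ✓ Detailed prompts (subject, style, quality modifiers) tend to produce more specific results than short ones
- ✓ A guidance scale of 7-8 is a good default for this model, not a universal optimum
- ✓ 40-50 steps usually balances quality and speed
- ✓ Fixing the seed makes a generation reproducible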