# Generate with Stable Diffusion

**Module 6.4, Lesson 3** | CourseAI

For 16 lessons, you built every concept from scratch. You learned to compress images (VAE), to destroy and reconstruct (diffusion), to condition on time and text (U-Net, CLIP, cross-attention, CFG), to work in latent space, to assemble the pipeline, and to choose a sampler. This is the payoff.

**What you will do:**
- Load Stable Diffusion with the diffusers library and generate your first image with full understanding of what happens inside
- Sweep `guidance_scale` and `num_inference_steps` to see the tradeoffs you predicted from the CFG and sampler lessons
- Compare three schedulers on the same prompt/seed and measure the speed-quality tradeoff
- Design your own controlled experiment to test a hypothesis about parameter interactions

**For each exercise, PREDICT the output before running the cell.**

This is a CONSOLIDATE notebook. There are zero new concepts. Every parameter maps to something you already know. The goal is not challenge; it is satisfaction: "I know what every parameter does."

**Estimated time:** 30–45 minutes.

---

## Setup

Run this cell to install dependencies and import everything.

In [None]:
!pip install -q diffusers transformers accelerate

import torch
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import time

# Reproducible results
torch.manual_seed(42)
np.random.seed(42)

# Nice plots
plt.style.use('dark_background')
plt.rcParams['figure.figsize'] = [10, 4]

# Device setup
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
dtype = torch.float16 if device.type == 'cuda' else torch.float32
print(f'Using device: {device}')
if device.type == 'cuda':
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB')

print('\nSetup complete.')

## Shared Helpers

Load the Stable Diffusion pipeline once and define helper functions. All exercises share the same model weights.

In [None]:
from diffusers import (
    StableDiffusionPipeline,
    DPMSolverMultistepScheduler,
    DDIMScheduler,
    EulerDiscreteScheduler,
)

model_id = 'stable-diffusion-v1-5/stable-diffusion-v1-5'

# Load the pipeline once.
print('Loading Stable Diffusion pipeline...')
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=dtype,
    safety_checker=None,
    requires_safety_checker=False,
)
pipe = pipe.to(device)
print(f'Pipeline loaded on {device}.')

# Save the scheduler config so we can create fresh schedulers from it.
scheduler_config = pipe.scheduler.config


def show_images(images, titles, figsize=None):
    """Display a list of PIL images side by side."""
    n = len(images)
    if figsize is None:
        figsize = (5 * n, 5)
    fig, axes = plt.subplots(1, n, figsize=figsize)
    if n == 1:
        axes = [axes]
    for ax, img, title in zip(axes, images, titles):
        ax.imshow(np.array(img))
        ax.set_title(title, fontsize=10)
        ax.axis('off')
    plt.tight_layout()
    plt.show()


def show_image_grid(images, titles, nrows, ncols, figsize=None, suptitle=None):
    """Display images in a grid with the given number of rows and columns."""
    if figsize is None:
        figsize = (4 * ncols, 4 * nrows)
    fig, axes = plt.subplots(nrows, ncols, figsize=figsize)
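    # Materialize the axes as a list. A bare `axes.flat` iterator would be partly
    # consumed by the zip() below, so the "hide extra axes" loop would skip some.
    axes_flat = list(axes.flat) if nrows > 1 or ncols > 1 else [axes]
    for ax, img, title in zip(axes_flat, images, titles):
        ax.imshow(np.array(img))
        ax.set_title(title, fontsize=10)
        ax.axis('off')
    # Hide any extra axes
    for ax in axes_flat[len(images):]:
        ax.axis('off')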
    if suptitle:
        plt.suptitle(suptitle, fontsize=13)
    plt.tight_layout()
    plt.show()


print('Helpers defined.')

---

## Exercise 1: Your First Generation [Guided]

This is the moment. You will call `pipe()` on the pipeline loaded above and get an image. Every parameter in this call maps to something you built:

| Parameter | What You Know It As | Source |
|-----------|-------------------|--------|
| `prompt` | CLIP tokenization → [1, 77, 768] embeddings → cross-attention K/V | CLIP, Text Conditioning & Guidance |
| `guidance_scale` | The *w* in ε_cfg = ε_uncond + w · (ε_cond − ε_uncond) | Text Conditioning & Guidance |
| `num_inference_steps` | Sampler step count along the ODE trajectory | Samplers and Efficiency |
| `generator` | Seed → z_T [4, 64, 64], the initial random latent | The SD Pipeline |
| `height` / `width` | Pixel dimensions → latent dimensions via 8× VAE downsampling | From Pixels to Latents |

**Before running, predict:**
- The cell below sets the scheduler to DPM-Solver++ (the checkpoint's built-in default scheduler may differ). With 25 steps, how many U-Net forward passes will it make? (Remember: CFG means two passes per step.)
- What is the latent tensor shape for a 512×512 image? (Think: 512/8 = ?)
- With `guidance_scale=7.5`, how strongly is the text conditioning amplified relative to the unconditional prediction?

In [None]:
# Set the scheduler to DPM-Solver++ (the recommended default from Samplers and Efficiency)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(scheduler_config)

# Generate your first image.
# Every parameter here maps to a concept you built:
#   prompt         -> CLIP encoding -> cross-attention (Lessons 12-13)
#   guidance_scale -> CFG w parameter (Lesson 13)
#   num_inference_steps -> sampler step count (Lesson 16)
#   generator      -> seed -> z_T (Lesson 15)
#   height/width   -> latent dimensions via VAE 8x downsampling (Lesson 14)

generator = torch.Generator(device=device).manual_seed(42)

result = pipe(
    prompt="a cat sitting on a beach at sunset",
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=generator,
    height=512,
    width=512,
)

image = result.images[0]
show_images([image], ['"a cat sitting on a beach at sunset"'], figsize=(6, 6))

# Verify your predictions:
print(f'Scheduler: {pipe.scheduler.__class__.__name__}')
print(f'Steps: 25 -> U-Net forward passes: {25 * 2} (2 per step for CFG)')
print(f'Image size: {image.size[0]}x{image.size[1]} pixels')
print(f'Latent size: {image.size[0]//8}x{image.size[1]//8}x4 = [{4}, {image.size[1]//8}, {image.size[0]//8}]')
print(f'guidance_scale=7.5: the text direction is amplified 7.5x relative to unconditional')

In [None]:
# Now change the prompt. Same seed, same everything else.
# The seed determines z_T (the structural "skeleton"). The prompt determines
# what concept the cross-attention steers toward.
#
# Before running, predict: will the composition (layout, shapes) be similar
# to the cat image, or completely different?

generator = torch.Generator(device=device).manual_seed(42)

result_2 = pipe(
    prompt="a dog playing in a field of flowers",
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=generator,
    height=512,
    width=512,
)

show_images(
    [image, result_2.images[0]],
    ['Same seed: "a cat on a beach at sunset"', 'Same seed: "a dog in a field of flowers"'],
    figsize=(12, 6),
)

print('Same seed = same z_T = same structural starting point.')
print('Different prompt = different cross-attention steering = different content.')
print('The seed controls the "skeleton." The prompt controls the "concept."')

### What Just Happened

You generated your first image with **full comprehension** of what happened inside:

1. **Prompt** → tokenized, wrapped in SOT/EOT tokens, and padded or truncated to 77 tokens → CLIP text encoder produced [1, 77, 768] contextual embeddings
2. **Seed (42)** → `torch.Generator` sampled z_T with shape [4, 64, 64] (the initial random latent in the VAE's compressed space)
3. **25 steps with DPM-Solver++** → at each step, the U-Net ran twice (once conditioned on the prompt, once unconditional for CFG), the pipeline combined the two predictions using ε_cfg = ε_uncond + 7.5 · (ε_cond − ε_uncond), and the scheduler used the result to update z_t
4. **VAE decode** → the final z_0 [4, 64, 64] was decoded to a [3, 512, 512] pixel image
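
The shape bookkeeping in the steps above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of diffusers: `latent_shape` and `unet_evaluations` are hypothetical helper names, and the 4 latent channels and 8× downsampling factor are the values this notebook uses for Stable Diffusion v1.5.

```python
def latent_shape(height, width, channels=4, vae_factor=8):
    """Return the [C, H, W] latent shape for a pixel size (hypothetical helper)."""
    if height % vae_factor or width % vae_factor:
        raise ValueError("height and width must be multiples of the VAE factor")
    return (channels, height // vae_factor, width // vae_factor)


def unet_evaluations(num_inference_steps, use_cfg=True):
    """Count noise predictions: with CFG, one conditional and one unconditional per step."""
    return num_inference_steps * (2 if use_cfg else 1)


print(latent_shape(512, 512))  # (4, 64, 64)
print(unet_evaluations(25))    # 50
```

The count is conceptual: diffusers batches the conditional and unconditional inputs into a single U-Net call with batch size 2, so the compute is doubled even though there is one call per step.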

When you changed the prompt but kept the same seed, the z_T starting point was identical. The structural decisions (composition, layout) are driven by early denoising steps where the noise pattern from z_T dominates. The prompt changes *what* appears in that structure via cross-attention, but the overall composition shares similarities because z_T is the same.

This is not a tutorial. You know what the machine does inside.

---

## Exercise 2: Parameter Sweeps [Guided]

Now you will see the parameters you studied come to life. Two systematic sweeps, both using the predict-then-verify pattern:

**Part A: Guidance Scale Sweep**

You know `guidance_scale` is the *w* in the CFG formula: ε_cfg = ε_uncond + w · (ε_cond − ε_uncond). It is the "contrast slider" from Text Conditioning & Guidance.

**Before running, predict:**
- At w=1, there is no amplification. The conditional prediction is used directly. Will the image follow the prompt closely?
- At w=7.5 (the typical default), how balanced will the prompt fidelity vs image naturalness be?
- At w=25, the conditional-minus-unconditional direction is amplified 25×. What happens when you extrapolate that far? (Think: oversaturated colors? Unnatural contrast? Distortion?)

**Part B: Step Count Sweep**

You know from Samplers and Efficiency that DPM-Solver++ plateaus around 20–30 steps.

**Before running, predict:**
- At 5 steps, how much of the ODE trajectory can DPM-Solver accurately traverse?
- At 20 steps, will quality be acceptable?
- Between 50 and 100 steps, will you see any visible improvement? (Remember: the quality curve plateaus.)

In [None]:
# Part A: Guidance Scale Sweep
# Fix everything except guidance_scale. One variable at a time.

prompt = "a cat sitting on a beach at sunset"
seed = 42
guidance_values = [1, 3, 7.5, 15, 25]

pipe.scheduler = DPMSolverMultistepScheduler.from_config(scheduler_config)

guidance_images = []
for w in guidance_values:
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        prompt,
        guidance_scale=w,
        num_inference_steps=25,
        generator=generator,
        height=512,
        width=512,
    )
    guidance_images.append(result.images[0])
    print(f'  w={w} done')

titles = [f'w = {w}' for w in guidance_values]
show_images(guidance_images, titles, figsize=(25, 5))

print('\n=== Guidance Scale as the "Contrast Slider" ===')
print('w=1:   No amplification. Conditional prediction used directly. Soft, may not follow prompt well.')
print('w=3:   Mild amplification. Partially follows prompt. Dreamlike.')
print('w=7.5: Balanced. Prompt-faithful with natural quality. The typical default.')
print('w=15:  Strong amplification. Colors pushed. Oversaturated.')
print('w=25:  Extreme extrapolation. Distorted, unnatural. The CFG formula overshoots.')
print()
print('guidance_scale is NOT a quality dial. It is a tradeoff between')
print('prompt fidelity and image naturalness. Higher = "follow the text harder,"')
print('not "make the image better."')

In [None]:
# Part B: Step Count Sweep
# Fix everything except num_inference_steps. Same prompt, seed, guidance.

step_counts = [5, 10, 20, 50, 100]

pipe.scheduler = DPMSolverMultistepScheduler.from_config(scheduler_config)

step_images = []
step_timings = []

for n in step_counts:
    generator = torch.Generator(device=device).manual_seed(seed)
    start = time.time()
    result = pipe(
        prompt,
        guidance_scale=7.5,
        num_inference_steps=n,
        generator=generator,
        height=512,
        width=512,
    )
    elapsed = time.time() - start
    step_images.append(result.images[0])
    step_timings.append(elapsed)
    print(f'  {n:>3d} steps: {elapsed:.1f}s')

titles = [f'{n} steps ({t:.1f}s)' for n, t in zip(step_counts, step_timings)]
show_images(step_images, titles, figsize=(25, 5))

# Time comparison
print('\n=== Step Count vs Generation Time ===')
for n, t in zip(step_counts, step_timings):
    bar = '█' * int(t / max(step_timings) * 30)
    print(f'  {n:>3d} steps: {t:>5.1f}s  {bar}')

print(f'\nSpeed ratio (100 steps / 20 steps): {step_timings[step_counts.index(100)] / step_timings[step_counts.index(20)]:.1f}x')
print(f'Quality difference at 20 vs 100 steps: look at the images above.')
print(f'The quality plateau is real. More steps past 20-30 with DPM-Solver++ is wasted compute.')

### What Just Happened

**Guidance scale sweep:** You verified the "contrast slider" analogy from Text Conditioning & Guidance. At low *w*, the unconditional prediction has too much influence, so the image is soft and may not match the prompt. At high *w*, the conditional-minus-unconditional direction is amplified so aggressively that the image overshoots into oversaturation and distortion. The sweet spot (w=7–8) balances prompt fidelity with image naturalness. This is the CFG formula you implemented, made visible.
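
The extrapolation behaviour is easy to see on toy numbers. The following is a minimal sketch of the CFG combination using plain Python lists; `cfg_combine` is a hypothetical name and the three-element "predictions" are made up for illustration, not taken from the model.

```python
def cfg_combine(eps_uncond, eps_cond, w):
    """Apply eps_cfg = eps_uncond + w * (eps_cond - eps_uncond) element-wise."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy "noise predictions" (made-up numbers, for illustration only).
toy_uncond = [0.0, 1.0, -1.0]
toy_cond = [1.0, 1.0, 0.0]

for w in (0, 1, 7.5):
    print(f"w={w}: {cfg_combine(toy_uncond, toy_cond, w)}")
```

At w=0 the result is the unconditional prediction, at w=1 it is exactly the conditional prediction, and at w=7.5 every element where the two predictions disagree is pushed well past the conditional value. Elements where they agree are left unchanged.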

**Step count sweep:** You verified the sweet spot from Samplers and Efficiency. DPM-Solver++ achieves good quality at ~20 steps because its higher-order method accurately follows the ODE trajectory with large steps. At 5 steps, the trajectory is traversed too coarsely (missing fine detail). At 50–100 steps, quality is indistinguishable from 20, but generation time scales linearly because each step requires 2 U-Net forward passes for CFG.
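
One way to check the "scales linearly" claim is to fit a straight line to the timings and separate the fixed overhead (text encoding, VAE decode) from the per-step cost. This is a rough sketch under assumptions: `fit_line` is a hypothetical helper, and the timings below are synthetic values constructed from a known line, not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic timings built from 0.5 s overhead + 0.1 s per step (NOT measurements).
demo_steps = [5, 10, 20, 50, 100]
demo_timings = [0.5 + 0.1 * n for n in demo_steps]

overhead, per_step = fit_line(demo_steps, demo_timings)
print(f"fixed overhead ~ {overhead:.2f}s, per-step cost ~ {per_step:.3f}s")
```

In the notebook you could pass `step_counts` and `step_timings` from Part B instead of the synthetic values. The first run may include one-time warm-up cost, so treat the fitted overhead as approximate.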

The systematic workflow: fix everything except one parameter, vary it, observe the effect. This is controlled experimentation applied to generative AI.

---

## Exercise 3: Sampler Comparison [Supported]

The lesson's central insight: **the model defines where to go, the sampler defines how to get there.** In the previous lesson's notebook, you compared DDPM, DDIM, and DPM-Solver. Now you will compare three practical schedulers at their recommended step counts using the high-level `pipe()` API.

Your task: generate the **same image** (same prompt, same seed) with three different schedulers:
1. **DPM-Solver++** at 25 steps (the current standard)
2. **DDIM** at 50 steps (deterministic, for reproducibility)
3. **Euler** at 30 steps (simple, good for debugging)

Display all three images with their generation times. Then answer:
- Are the three images identical? Why or why not? (Think: same model, same z_T, but different ODE solvers follow different numerical paths.)
- Which is fastest? Does that match the step count?

**Hints:**
- Use `DPMSolverMultistepScheduler.from_config(scheduler_config)` to create DPM-Solver++
- Use `DDIMScheduler.from_config(scheduler_config)` to create DDIM
- Use `EulerDiscreteScheduler.from_config(scheduler_config)` to create Euler
- Remember to create a fresh generator (same seed) for each run

In [None]:
prompt = "a cozy cabin in a snowy forest at night, warm light from windows"
seed = 77
guidance_scale = 7.5

# Define the three scheduler configurations: (scheduler_class, num_steps, label)
scheduler_configs = [
    (DPMSolverMultistepScheduler, 25, 'DPM-Solver++ (25 steps)'),
    (DDIMScheduler, 50, 'DDIM (50 steps)'),
    (EulerDiscreteScheduler, 30, 'Euler (30 steps)'),
]

sampler_images = []
sampler_timings = []

for scheduler_cls, num_steps, label in scheduler_configs:
    print(f'--- {label} ---')

    # TODO: Set the pipeline's scheduler to a fresh instance of scheduler_cls.
    # Use scheduler_cls.from_config(scheduler_config).
    # (Look at Exercise 1 or Exercise 2 for the pattern.)

    # TODO: Create a generator with the same seed for fair comparison.

    start = time.time()

    # TODO: Call pipe() with the prompt, num_steps, guidance_scale, generator,
    # height=512, width=512. Store the result.
    result = None

    elapsed = time.time() - start

    if result is not None:
        sampler_images.append(result.images[0])
        sampler_timings.append(elapsed)
        print(f'  Scheduler: {pipe.scheduler.__class__.__name__}')
        print(f'  Time: {elapsed:.1f}s')
        print(f'  U-Net passes: {num_steps * 2}')
    else:
        print('  TODO: fill in the code above')
        break

# Display results
if len(sampler_images) == 3:
    titles = [
        f'{label}\n{t:.1f}s'
        for (_, _, label), t in zip(scheduler_configs, sampler_timings)
    ]
    show_images(sampler_images, titles, figsize=(18, 6))

    print('\n=== Sampler Comparison ===')
    for (_, steps, label), t in zip(scheduler_configs, sampler_timings):
        print(f'  {label}: {t:.1f}s')
    print()
    print('Same model. Same weights. Same seed. Same prompt.')
    print('Different samplers follow different numerical paths through the same ODE.')
    print('The images are NOT identical, but the quality is comparable.')
else:
    print('\nFill in the TODOs above to complete this exercise.')

<details>
<summary>💡 Solution</summary>

The key insight: all three schedulers use the same trained U-Net weights, so the noise-prediction function is the same. The schedulers differ in *how they use that prediction to step from z_t to z_{t-1}*. Different numerical methods follow slightly different paths along the same trajectory, producing similar but not identical images.

```python
    # Set the scheduler
    pipe.scheduler = scheduler_cls.from_config(scheduler_config)

    # Same seed for fair comparison
    generator = torch.Generator(device=device).manual_seed(seed)

    start = time.time()

    result = pipe(
        prompt,
        num_inference_steps=num_steps,
        guidance_scale=guidance_scale,
        generator=generator,
        height=512,
        width=512,
    )
```

**What to observe:**
- The three images depict the same scene but with subtle differences in composition, color, and detail.
- DPM-Solver++ at 25 steps is likely fastest (fewest steps, higher-order solver).
- DDIM at 50 steps takes roughly twice as long (twice the steps, first-order solver).
- Euler at 30 steps is in between.
- The quality is comparable across all three. The "same vehicle, different route" analogy holds.

**Common mistakes:**
1. **Forgetting to re-seed the generator.** Without creating a fresh `torch.Generator(device=device).manual_seed(seed)` for each scheduler, z_T differs between runs, making the comparison meaningless. The generator's state is consumed during sampling and cannot be reused.
2. **Inconsistent generator device.** In diffusers, a CPU generator generally works with a CUDA pipeline (the initial latents are sampled on the CPU and then moved to the GPU), but for the same seed it produces a different z_T than a CUDA generator. Mixing the two across runs silently breaks the comparison. The reverse mismatch (a CUDA generator with a CPU pipeline) raises an error. Pick one convention and keep it for every run; this notebook uses `torch.Generator(device=device).manual_seed(seed)`.
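
The first mistake can be illustrated without loading the model. The sketch below uses Python's standard-library `random.Random` as a stand-in for `torch.Generator` (an analogy, not the diffusers code path); `sample_latent` is a hypothetical helper that plays the role of sampling z_T.

```python
import random


def sample_latent(rng, n=4):
    """Stand-in for sampling z_T: draw n Gaussian values from the given RNG."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


demo_seed = 77

# Reusing one generator: the second draw continues from the advanced state.
shared = random.Random(demo_seed)
first = sample_latent(shared)
second = sample_latent(shared)
print("shared generator, same noise?", first == second)

# Fresh generator per run: identical starting noise each time.
run_a = sample_latent(random.Random(demo_seed))
run_b = sample_latent(random.Random(demo_seed))
print("fresh generators, same noise?", run_a == run_b)
```

The same pattern applies to the loop in this exercise: creating the generator inside the loop gives every scheduler the same starting latent.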

</details>

---

## Exercise 4: Design Your Own Experiment [Independent]

You have explored `guidance_scale`, `num_inference_steps`, and `scheduler`. Now it is your turn to design a controlled experiment.

**Your task:**
1. **Pick a parameter** to investigate. Suggestions:
   - `negative_prompt`: How does adding "blurry, low quality, deformed" change the generation? (Remember: negative prompts replace the empty-string unconditional in CFG. They steer, not erase.)
   - **Seed variation**: How much does the seed change the image while keeping the same prompt? (Seeds control the structural "skeleton" via z_T.)
   - **Prompt wording**: Does "a cat sitting on a beach" vs "cat beach sitting" produce different results? (CLIP's transformer produces context-dependent embeddings.)
   - **Parameter interaction**: Does guidance_scale interact with num_inference_steps? (Try low guidance at few steps vs high guidance at few steps.)

2. **Form a hypothesis**: predict what will happen, based on the concepts you know.

3. **Design a controlled comparison**: fix everything except your target variable. Generate a set of images.

4. **Display the results** in a grid.

5. **Write a one-sentence conclusion**: did your hypothesis hold? If not, what did you learn?

This is your experiment. There is no single correct answer. The goal is systematic exploration with understanding, not random parameter tweaking.
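
If you choose the parameter-interaction option, it helps to enumerate the grid of settings before generating anything. A minimal sketch, assuming a 3×2 grid; the specific values and the names `interaction_guidance`, `interaction_steps`, and `grid` are illustrative choices, not recommendations.

```python
from itertools import product

interaction_guidance = [3, 7.5, 15]  # rows (illustrative values)
interaction_steps = [5, 25]          # columns (illustrative values)

# One entry per (guidance_scale, num_inference_steps) cell, in row-major order.
grid = [
    {"guidance_scale": w, "num_inference_steps": n, "title": f"w={w}, steps={n}"}
    for w, n in product(interaction_guidance, interaction_steps)
]

for cell in grid:
    print(cell["title"])
```

Each cell could then be passed to `pipe()` with a freshly seeded generator, and the resulting images displayed with `show_image_grid(images, titles, nrows=3, ncols=2)` so that rows share a guidance scale and columns share a step count.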

In [None]:
# YOUR EXPERIMENT
#
# 1. Choose your parameter and state your hypothesis
# 2. Set up the controlled comparison (fix everything else)
# 3. Generate images
# 4. Display them
# 5. Print your conclusion
#
# Use the patterns from Exercises 1-3:
#   - pipe.scheduler = SchedulerClass.from_config(scheduler_config)
#   - generator = torch.Generator(device=device).manual_seed(seed)
#   - result = pipe(prompt, guidance_scale=..., num_inference_steps=...,
#                   generator=generator, height=512, width=512)
#   - For negative prompts: result = pipe(prompt, negative_prompt="...", ...)
#   - show_images() or show_image_grid() to display results

print('Hypothesis: ')
print()

# YOUR CODE HERE

print()
print('Conclusion: ')

<details>
<summary>💡 Example Experiment: Negative Prompt Steering</summary>

**Hypothesis:** Adding a negative prompt "blurry, low quality, deformed, watermark" will produce a noticeably sharper, cleaner image than generating without a negative prompt. The mechanism is not removing blur from the image, but steering the entire generation trajectory away from blurry outputs from step 1 onward.

**Why this hypothesis:** The negative prompt replaces the empty-string unconditional embedding in the CFG formula. Instead of ε_cfg = ε_uncond + w · (ε_cond − ε_uncond), it becomes ε_cfg = ε_neg + w · (ε_cond − ε_neg). The generation is steered AWAY from the negative prompt's semantic direction at every step. It is a compass pointing away from undesirable directions, not an eraser.
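
To see how much the substitution changes a single step, subtract the two guided predictions while holding ε_cond fixed. This is a short algebraic consequence of the two formulas above, written with the same symbols:

```latex
\begin{aligned}
\Delta
  &= \bigl[\epsilon_{neg} + w\,(\epsilon_{cond} - \epsilon_{neg})\bigr]
   - \bigl[\epsilon_{uncond} + w\,(\epsilon_{cond} - \epsilon_{uncond})\bigr] \\
  &= (1 - w)\,(\epsilon_{neg} - \epsilon_{uncond})
\end{aligned}
```

For w > 1 the factor (1 − w) is negative, so the guided prediction shifts opposite to the direction ε_neg − ε_uncond; at w = 7.5 that shift is scaled by −6.5. Because the shift is applied at every denoising step, the latents diverge from the first step onward.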

```python
prompt = "a detailed portrait of an elderly woman, oil painting"
seed = 42

pipe.scheduler = DPMSolverMultistepScheduler.from_config(scheduler_config)

# Without negative prompt
gen = torch.Generator(device=device).manual_seed(seed)
img_without = pipe(
    prompt, guidance_scale=7.5, num_inference_steps=25,
    generator=gen, height=512, width=512,
).images[0]

# With negative prompt
gen = torch.Generator(device=device).manual_seed(seed)
img_with = pipe(
    prompt,
    negative_prompt="blurry, low quality, deformed, watermark",
    guidance_scale=7.5, num_inference_steps=25,
    generator=gen, height=512, width=512,
).images[0]

show_images(
    [img_without, img_with],
    ['No negative prompt', 'negative_prompt="blurry, low quality, deformed, watermark"'],
    figsize=(12, 6),
)

print('Conclusion: The negative prompt produces a cleaner, more detailed image.')
print('The two images are NOT the same image with blur removed.')
print('They are fundamentally different generations because the CFG direction')
print('changed at every step. The negative prompt steered the trajectory,')
print('not erased artifacts from the result.')
```

**What to notice:**
- The two images are not the same composition with blur removed. They are different generations because the CFG direction was different at every denoising step.
- This confirms that negative prompts are a *steering* mechanism (directional change in the CFG formula), not a *post-processing* mechanism (no erasing).
- The negative prompt is most effective at preventing common failure modes (blur, deformation) rather than removing specific content.

</details>

<details>
<summary>💡 Example Experiment: Prompt Wording (Sentences vs Keywords)</summary>

**Hypothesis:** "a cat sitting on a beach at sunset" will produce a higher-quality, more coherent image than "cat beach sunset sitting a on" because CLIP's transformer produces context-dependent embeddings, so word order and syntax matter.

```python
prompts = [
    "a cat sitting on a beach at sunset",
    "cat beach sunset sitting a on",
    "a beach sitting on a cat at sunset",
]
seed = 42

pipe.scheduler = DPMSolverMultistepScheduler.from_config(scheduler_config)

wording_images = []
for p in prompts:
    gen = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        p, guidance_scale=7.5, num_inference_steps=25,
        generator=gen, height=512, width=512,
    )
    wording_images.append(result.images[0])

show_images(
    wording_images,
    [f'"{p}"' for p in prompts],
    figsize=(18, 6),
)

print('Conclusion: Prompts are sentences, not keyword bags.')
print('CLIP\'s self-attention makes each token\'s embedding context-dependent.')
print('Scrambled syntax produces different embeddings, different cross-attention,')
print('and therefore different (usually worse) images.')
```

</details>

---

## Key Takeaways

1. **Every parameter maps to a concept you built from scratch.** `prompt` → CLIP encoding → cross-attention. `guidance_scale` → CFG *w* parameter. `num_inference_steps` → sampler step count along the ODE trajectory. `scheduler` → sampler choice. `negative_prompt` → CFG unconditional substitution. `generator` → seed → z_T. `height`/`width` → latent dimensions via VAE 8× downsampling.

2. **`guidance_scale` is a tradeoff, not a quality dial.** Low values produce soft, prompt-unfaithful images. High values produce oversaturated, distorted images. The sweet spot (~7–8) balances prompt fidelity with image naturalness. This is the CFG formula made visible.

3. **DPM-Solver++ plateaus at 20–30 steps.** More steps past the sweet spot waste compute with negligible quality improvement. The default of 50 in many tutorials is a conservative holdover from DDIM. Use 20–25 steps as your default.

4. **Systematic experimentation beats random exploration.** Fix everything except one parameter, vary it, observe the effect. Start with defaults (DPM-Solver++ at 25 steps, guidance_scale=7.5, 512×512) and refine from there.

5. **The API is a dashboard, not a black box.** You did not follow a tutorial. You understood the system. Every parameter change had a predictable effect because you built the underlying concepts across 16 lessons.