# Introduction to Diffusers 🧨

Welcome to the first unit of the Diffusion Models course! In this notebook, we'll explore the basics of the 🤗 Diffusers library and learn how to generate images using state-of-the-art diffusion models.

## Learning Objectives

By the end of this notebook, you will:
- Understand what diffusion models are and how they work
- Know how to use the Diffusers library to generate images
- Be familiar with different schedulers and their effects
- Understand key parameters like guidance scale and inference steps

## What are Diffusion Models?

Diffusion models are a class of generative models that learn to generate data by reversing a gradual noising process. They work by:

1. **Forward Process**: Gradually adding noise to training images until they become pure noise
2. **Reverse Process**: Learning to remove noise step by step to generate new images
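
The forward process has a simple closed form, which the minimal sketch below illustrates on a single pixel value. It assumes a DDPM-style linear noise schedule (1,000 steps, β from 1e-4 to 0.02); other models use other schedules, and the names `alpha_bar` and `add_noise` are illustrative, not part of Diffusers.

```python
import math
import random

# Assumption: DDPM-style linear beta schedule (illustrative values).
NUM_TRAIN_STEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02

betas = [
    BETA_START + (BETA_END - BETA_START) * t / (NUM_TRAIN_STEPS - 1)
    for t in range(NUM_TRAIN_STEPS)
]

# alpha_bar[t] is the cumulative product of (1 - beta) up to step t.
# sqrt(alpha_bar[t]) is the factor by which the original signal is scaled at step t.
alpha_bar = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bar.append(running)


def add_noise(x0, t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(alpha_bar[t]) * x0 + math.sqrt(1.0 - alpha_bar[t]) * eps


rng = random.Random(0)
pixel = 0.8  # hypothetical pixel value
for t in (0, 250, 500, 999):
    kept = math.sqrt(alpha_bar[t])
    print(f"t={t:4d}  signal scale={kept:.4f}  x_t={add_noise(pixel, t, rng):+.3f}")
```

By the last step almost none of the original signal remains, which is why generation can start from pure Gaussian noise.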

Well-known text-to-image systems built on diffusion include:
- Stable Diffusion
- DALL-E 2
- Imagen
- Midjourney

## Setup

First, let's import the necessary libraries and set up our environment.

In [None]:
# Import necessary libraries
import torch
from diffusers import DiffusionPipeline, StableDiffusionPipeline
from diffusers import DDIMScheduler, PNDMScheduler, LMSDiscreteScheduler
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
from huggingface_hub import login
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Setup device
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using device: {device}")

# Login to HuggingFace (if token is available)
hf_token = os.getenv("HUGGING_FACE_WRITE_TOKEN")
if hf_token:
    login(token=hf_token)
    print("✅ Logged in to HuggingFace")
else:
    print("⚠️ No HuggingFace token found. Some models may not be accessible.")

## Loading Your First Diffusion Model

Let's start by loading a Stable Diffusion model. We'll use the `runwayml/stable-diffusion-v1-5` model, which is one of the most popular and accessible models.

In [None]:
# Load the Stable Diffusion pipeline
model_id = "runwayml/stable-diffusion-v1-5"

print(f"Loading model: {model_id}")
print("This may take a few minutes on first run...")

# Load the pipeline
pipe = StableDiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device != "cpu" else torch.float32,
    safety_checker=None,  # Disable for educational purposes
    requires_safety_checker=False
)

# Move to device
pipe = pipe.to(device)

# Enable attention slicing to lower peak memory use (at a small cost in speed)
pipe.enable_attention_slicing()

print("✅ Model loaded successfully!")

## Your First Image Generation

Now let's generate our first image! We'll start with a simple prompt.

In [None]:
# Define our prompt
prompt = "a photograph of an astronaut riding a horse"

print(f"Generating image with prompt: '{prompt}'")

# Generate the image. The pipeline already runs in the dtype chosen at load time,
# so no torch.autocast wrapper is needed.
image = pipe(prompt).images[0]

# Display the image
plt.figure(figsize=(8, 8))
plt.imshow(image)
plt.title(f"Generated Image: {prompt}")
plt.axis('off')
plt.show()

# Save the image
image.save("astronaut_horse.png")
print("Image saved as 'astronaut_horse.png'")

## Understanding Key Parameters

Diffusion models have several important parameters that control the generation process. Let's explore the most important ones:

### 1. Number of Inference Steps

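This controls how many denoising steps the pipeline takes to go from pure noise to the final image. More steps generally improve quality, with diminishing returns, while generation time grows roughly in proportion to the step count.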
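
One way to picture `num_inference_steps`: the model was trained over a long range of timesteps (1,000 is assumed here), and at inference the scheduler visits only an evenly spaced subset of them. The sketch below is a simplified illustration with a hypothetical helper, `inference_timesteps`; real schedulers differ in how they space and offset the steps.

```python
def inference_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Return evenly spaced timesteps, ordered from most to least noisy."""
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]


for steps in (10, 25, 50):
    ts = inference_timesteps(steps)
    print(f"{steps:2d} steps: stride {ts[0] - ts[1]:3d}, first {ts[0]}, last {ts[-1]}")
```

Fewer steps mean larger jumps between noise levels, which is one reason very low step counts tend to look rough.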

In [None]:
# Compare different numbers of inference steps
prompt = "a beautiful landscape with mountains and a lake"
steps_list = [10, 25, 50]

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for i, steps in enumerate(steps_list):
    print(f"Generating with {steps} steps...")
    
    image = pipe(
        prompt,
        num_inference_steps=steps,
        generator=torch.Generator(device=device).manual_seed(42)  # For reproducibility
    ).images[0]
    
    axes[i].imshow(image)
    axes[i].set_title(f"{steps} steps")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

### 2. Guidance Scale

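This parameter sets the strength of classifier-free guidance, that is, how strongly each denoising step is pushed toward your prompt. Higher values follow the prompt more strictly but reduce variety and can produce oversaturated images. In Diffusers, a value of 1.0 or below turns guidance off.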
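
With classifier-free guidance, the model predicts noise twice at every step, once with the prompt and once without, and the two predictions are combined. The sketch below uses plain Python lists in place of tensors; the function name `apply_guidance` and the numbers are illustrative.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine unconditional and prompt-conditioned noise predictions."""
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]


# Hypothetical noise predictions for three latent values.
noise_uncond = [0.10, -0.20, 0.05]
noise_text = [0.30, -0.10, 0.00]

for scale in (1.0, 7.5, 15.0):
    print(scale, apply_guidance(noise_uncond, noise_text, scale))
```

At a scale of 1.0 the result is just the prompt-conditioned prediction; larger scales extrapolate further along the direction from the unconditional prediction to the conditioned one.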

In [None]:
# Compare different guidance scales
prompt = "a cute robot painting a masterpiece"
guidance_scales = [1.0, 7.5, 15.0]

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for i, scale in enumerate(guidance_scales):
    print(f"Generating with guidance scale {scale}...")
    
    image = pipe(
        prompt,
        guidance_scale=scale,
        num_inference_steps=25,
        generator=torch.Generator(device=device).manual_seed(42)
    ).images[0]
    
    axes[i].imshow(image)
    axes[i].set_title(f"Guidance Scale: {scale}")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

### 3. Negative Prompts

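Negative prompts describe what you don't want in your image. The pipeline uses the negative prompt in place of the empty prompt during guidance, so each step is steered away from it. This only has an effect when guidance is enabled (a guidance scale above 1).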
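
A rough way to see why this works: the negative prompt's noise prediction takes the place of the empty-prompt prediction, so each step moves away from it and toward the main prompt. The toy sketch below uses made-up 2-D vectors instead of real noise predictions, and `guided_prediction` is a hypothetical helper.

```python
import math


def guided_prediction(noise_negative, noise_prompt, guidance_scale=7.5):
    """Extrapolate from the negative-prompt prediction past the prompt prediction."""
    return [
        n + guidance_scale * (p - n)
        for n, p in zip(noise_negative, noise_prompt)
    ]


noise_prompt = [1.0, 0.0]    # made-up prediction for the main prompt
noise_negative = [0.5, 0.5]  # made-up prediction for the negative prompt

result = guided_prediction(noise_negative, noise_prompt)
print("guided prediction:", result)
print("distance from negative, before:", math.dist(noise_prompt, noise_negative))
print("distance from negative, after: ", math.dist(result, noise_negative))
```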

In [None]:
# Compare with and without negative prompts
prompt = "a portrait of a person"
negative_prompt = "blurry, low quality, distorted, ugly, bad anatomy"

fig, axes = plt.subplots(1, 2, figsize=(12, 6))

# Without negative prompt
print("Generating without negative prompt...")
image1 = pipe(
    prompt,
    num_inference_steps=25,
    generator=torch.Generator(device=device).manual_seed(42)
).images[0]

# With negative prompt
print("Generating with negative prompt...")
image2 = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=25,
    generator=torch.Generator(device=device).manual_seed(42)
).images[0]

axes[0].imshow(image1)
axes[0].set_title("Without Negative Prompt")
axes[0].axis('off')

axes[1].imshow(image2)
axes[1].set_title("With Negative Prompt")
axes[1].axis('off')

plt.tight_layout()
plt.show()

## Understanding Schedulers

A scheduler defines which timesteps are visited and how each noise prediction from the model is turned into the next, less noisy sample. Different schedulers can produce different results and have different speed/quality tradeoffs.
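
To make the scheduler's role concrete, here is a minimal sketch of one deterministic DDIM-style update on a single scalar. It is a simplified illustration, not the Diffusers implementation; `ddim_step` and the `alpha_bar` values are assumptions chosen for the example.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM-style update from timestep t to the previous one."""
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)
    # Re-noise that estimate to the lower noise level of the previous timestep.
    x_prev = math.sqrt(alpha_bar_prev) * x0_pred + math.sqrt(1.0 - alpha_bar_prev) * eps_pred
    return x_prev, x0_pred


# Build a noisy sample from a known clean value, then step back with a perfect noise prediction.
x0, eps = 0.8, -1.2
alpha_bar_t, alpha_bar_prev = 0.5, 0.9  # hypothetical schedule values
x_t = math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps

x_prev, x0_pred = ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev)
print(f"x_t={x_t:.4f}  x_prev={x_prev:.4f}  x0_pred={x0_pred:.4f}")
```

Other schedulers such as PNDM, LMS, and Euler replace this update with a different numerical rule, which is why the same seed can yield visibly different images.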

In [None]:
# Compare different schedulers
# DDIM, PNDM, and LMS were imported during setup; only Euler is new here
from diffusers import EulerDiscreteScheduler

prompt = "a magical forest with glowing mushrooms"

# Define schedulers to compare
schedulers = {
    "DDIM": DDIMScheduler.from_config(pipe.scheduler.config),
    "PNDM": PNDMScheduler.from_config(pipe.scheduler.config),
    "LMS": LMSDiscreteScheduler.from_config(pipe.scheduler.config),
    "Euler": EulerDiscreteScheduler.from_config(pipe.scheduler.config)
}

fig, axes = plt.subplots(2, 2, figsize=(12, 12))
axes = axes.flatten()

for i, (name, scheduler) in enumerate(schedulers.items()):
    print(f"Generating with {name} scheduler...")
    
    # Set the scheduler
    pipe.scheduler = scheduler
    
    image = pipe(
        prompt,
        num_inference_steps=25,
        generator=torch.Generator(device=device).manual_seed(42)
    ).images[0]
    
    axes[i].imshow(image)
    axes[i].set_title(f"{name} Scheduler")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

## Exploring Different Art Styles

One of the most exciting aspects of text-to-image models is their ability to generate images in different artistic styles. Let's explore this!

In [None]:
# Different art styles
base_prompt = "a cat sitting on a windowsill"
styles = [
    "oil painting",
    "watercolor",
    "digital art",
    "pencil sketch",
    "anime style",
    "photorealistic"
]

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.flatten()

for i, style in enumerate(styles):
    styled_prompt = f"{base_prompt}, {style}"
    print(f"Generating: {styled_prompt}")
    
    image = pipe(
        styled_prompt,
        num_inference_steps=25,
        generator=torch.Generator(device=device).manual_seed(42 + i)
    ).images[0]
    
    axes[i].imshow(image)
    axes[i].set_title(style.title())
    axes[i].axis('off')

plt.tight_layout()
plt.show()

## Batch Generation

You can generate multiple images at once for the same prompt to explore variations.
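
When the number of images changes, the plotting grid has to change with it. As a small convenience, the hypothetical helper below computes a grid shape for any `num_images`, so the subplot layout does not need to be hard-coded.

```python
import math


def grid_shape(num_images, max_cols=2):
    """Return (rows, cols) large enough to hold num_images subplots."""
    cols = min(num_images, max_cols)
    rows = math.ceil(num_images / cols)
    return rows, cols


for n in (1, 4, 5):
    print(n, grid_shape(n))
```

The returned pair can be passed straight to `plt.subplots(rows, cols, ...)`.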

In [None]:
# Generate multiple variations of the same prompt
prompt = "a cozy coffee shop in autumn"
num_images = 4

print(f"Generating {num_images} variations of: '{prompt}'")

images = pipe(
    prompt,
    num_images_per_prompt=num_images,
    num_inference_steps=25
).images

# Display all variations
fig, axes = plt.subplots(2, 2, figsize=(12, 12))
axes = axes.flatten()

for i, image in enumerate(images):
    axes[i].imshow(image)
    axes[i].set_title(f"Variation {i+1}")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

## Exercise: Create Your Own Generations

Now it's your turn! Try creating images with your own prompts. Here are some ideas to get you started:

- "a futuristic city at sunset"
- "a dragon made of flowers"
- "a steampunk airship flying through clouds"
- "a minimalist bedroom with plants"
- "a vintage car in a cyberpunk setting"

In [None]:
# Your turn! Try your own prompts here
your_prompt = ""  # Add your prompt here

if your_prompt:
    print(f"Generating your image: '{your_prompt}'")
    
    your_image = pipe(
        your_prompt,
        num_inference_steps=25,
        guidance_scale=7.5
    ).images[0]
    
    plt.figure(figsize=(8, 8))
    plt.imshow(your_image)
    plt.title(f"Your Creation: {your_prompt}")
    plt.axis('off')
    plt.show()
    
    # Save your creation
    your_image.save("my_creation.png")
    print("Your image saved as 'my_creation.png'")
else:
    print("Add your prompt to the 'your_prompt' variable above and run this cell!")

## Summary

Congratulations! You've completed the introduction to Diffusers. Here's what you've learned:

✅ **What diffusion models are** and how they work  
✅ **How to load and use** the Diffusers library  
✅ **Key parameters** like inference steps, guidance scale, and negative prompts  
✅ **Different schedulers** and their effects  
✅ **Art style control** through prompting  
✅ **Batch generation** for exploring variations  

## Next Steps

In the next notebooks, we'll explore:
- Image-to-image generation
- Inpainting and outpainting
- Fine-tuning your own models
- Advanced prompting techniques

Keep experimenting and have fun creating! 🎨✨

## Cleanup

To free up GPU memory, you can delete the pipeline when you're done:

In [None]:
# Clean up GPU memory
import gc

del pipe
gc.collect()  # Release lingering Python references before clearing the cache
if torch.cuda.is_available():
    torch.cuda.empty_cache()
print("✅ Memory cleaned up!")