# Lab 2.6.4: Flux Exploration - Next-Gen Image Generation

**Module:** 2.6 - Diffusion Models  
**Time:** 2 hours  
**Difficulty:** ⭐⭐ (Beginner-Intermediate)

---

## ⚠️ IMPORTANT: License Requirements

**Before running this notebook, you MUST:**
1. Visit [black-forest-labs/FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) on Hugging Face
2. Accept the license agreement (Apache 2.0 for schnell)
3. Visit [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) on Hugging Face
4. Accept the license agreement (non-commercial for dev)
5. Log in with `huggingface-cli login` if not already authenticated

**Flux-dev is for non-commercial use only!** For commercial use, consider Flux-schnell (Apache 2.0) or Flux-pro (via API).
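
As a quick pre-flight check, the sketch below looks for a stored Hugging Face token before any download is attempted. It assumes the token sits either in the `HF_TOKEN` environment variable or in the default cache file written by `huggingface-cli login`; the helper name `find_hf_token` is hypothetical. Finding a token does not prove the license was accepted, so a gated download can still fail.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return a description of where a token was found, or None.

    Assumption: the token is in HF_TOKEN or in <HF_HOME>/token, where
    HF_HOME defaults to ~/.cache/huggingface.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    if env.get("HF_TOKEN"):
        return "HF_TOKEN environment variable"

    hf_home = Path(env.get("HF_HOME", home / ".cache" / "huggingface"))
    token_file = hf_home / "token"
    if token_file.is_file() and token_file.read_text().strip():
        return str(token_file)
    return None


source = find_hf_token()
if source:
    print(f"Token found in: {source}")
else:
    print("No token found; run `huggingface-cli login` first.")
```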

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- [ ] Understand the Flux architecture and how it differs from SDXL
- [ ] Load and run Flux-dev and Flux-schnell on DGX Spark
- [ ] Compare quality and speed between Flux variants
- [ ] Perform side-by-side comparisons with SDXL
- [ ] Learn when to use each model

---

## 📚 Prerequisites

- Completed: Lab 2.6.2 (Stable Diffusion Generation)
- Knowledge of: Basic diffusion concepts, SDXL usage
- **Required packages:**
  - `diffusers>=0.30.0` (Flux support)
  - `transformers>=4.42.0`
  - `sentencepiece` (for T5 tokenizer)
- **Required**: Hugging Face account with Flux model access (see above)
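
The package requirements above can be verified before loading any model. This is a minimal sketch using only the standard library; the helper names `parse_version` and `check_requirements` are illustrative, and the version parsing is deliberately simple (it keeps only the leading integers of each component).

```python
from importlib import metadata

# Minimum versions from the prerequisites list; None means "any version".
REQUIRED = {
    "diffusers": (0, 30, 0),
    "transformers": (4, 42, 0),
    "sentencepiece": None,
}


def parse_version(text):
    """Turn '0.30.1' or '4.42.0.dev0' into a 3+ element tuple of integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so '1.2' compares like '1.2.0'
    return tuple(parts)


def check_requirements(required=REQUIRED):
    """Return {package: status} for each required package."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and parse_version(installed) < minimum:
            report[name] = f"too old ({installed})"
        else:
            report[name] = f"ok ({installed})"
    return report


for package, status in check_requirements().items():
    print(f"{package}: {status}")
```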

---

## 🌍 Real-World Context

**Flux represents the next evolution in diffusion models:**

- Created by **Black Forest Labs** (founded by researchers from the original Stable Diffusion team)
- Often produces more photorealistic results than SDXL
- Different aesthetic, especially for text rendering
- **Flux-schnell**: Extremely fast (4 steps!)
- **Flux-dev**: Higher quality, more steps

DGX Spark can run Flux at full precision with room to spare!
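
To make the headroom claim concrete, here is a back-of-the-envelope estimate of weight memory. The 12B transformer size comes from the FLUX.1 model card; the T5-XXL encoder size is an approximate assumption, and the smaller CLIP encoder, the VAE, and activation memory are left out. Treat the output as a rough lower bound, not a measurement.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

# Parameter counts in billions.
COMPONENT_PARAMS_BILLIONS = {
    "flux_transformer": 12.0,      # from the FLUX.1 model card
    "t5_xxl_text_encoder": 4.7,    # assumption: approximate figure
}

DGX_SPARK_MEMORY_GB = 128  # unified memory on DGX Spark


def weights_gb(dtype):
    """Approximate weight footprint in GB (1e9 params * bytes / 1e9)."""
    bytes_per_param = BYTES_PER_PARAM[dtype]
    return sum(p * bytes_per_param for p in COMPONENT_PARAMS_BILLIONS.values())


for dtype in BYTES_PER_PARAM:
    needed = weights_gb(dtype)
    left = DGX_SPARK_MEMORY_GB - needed
    print(f"{dtype}: ~{needed:.1f} GB of weights, ~{left:.1f} GB left")
```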

---

## 🧒 ELI5: What Makes Flux Different?

> **Think of SDXL and Flux like two talented artists:**
>
> **SDXL** is like a classical painter:
> - Uses a U-Net (traditional CNN architecture)
> - Works in latent space (128×128 latents → 1024×1024 images)
> - Great at many styles, especially artistic
>
> **Flux** is like a modern digital artist:
> - Uses a DiT (Diffusion Transformer) architecture
> - Better at understanding complex prompts
> - Often more photorealistic
> - Better at rendering text in images!
>
> **The key innovation:** Flux uses Transformers (like ChatGPT uses for text)
> for image generation. This gives better long-range understanding.

### Architecture Comparison

```
SDXL Architecture:                 Flux Architecture:
┌───────────────────┐             ┌───────────────────┐
│    U-Net with     │             │    Diffusion      │
│  Cross-Attention  │             │   Transformer     │
│  (CNN-based)      │             │  (DiT-based)      │
└───────────────────┘             └───────────────────┘
        │                                  │
    Local + Some                      Full Global
    Global Context                    Attention
```
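
To give "full global attention" a sense of scale, the sketch below counts how many image tokens a diffusion transformer would process. It assumes an 8× VAE downsampling factor and 2×2 latent patches, which matches how the diffusers `FluxPipeline` packs latents as far as we know; the function name `flux_token_count` is hypothetical, and text tokens are not counted.

```python
def flux_token_count(height, width, vae_downsample=8, patch=2):
    """Number of image tokens for a given output resolution.

    Assumptions: the VAE shrinks each side by `vae_downsample`, and the
    latent grid is then split into `patch` x `patch` tokens.
    """
    latent_h = height // vae_downsample
    latent_w = width // vae_downsample
    return (latent_h // patch) * (latent_w // patch)


tokens = flux_token_count(1024, 1024)
# Full self-attention compares every token with every other token.
attention_pairs = tokens ** 2

print(f"Image tokens at 1024x1024: {tokens}")
print(f"Token pairs per attention layer: {attention_pairs:,}")
```

Because the pair count grows with the square of the token count, doubling the image side length multiplies the attention work by roughly sixteen.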

---

## Part 1: Setting Up

In [None]:
# Core imports
import torch
import gc
import time
from pathlib import Path

# Diffusers
from diffusers import FluxPipeline

# Visualization
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

# Device setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Device: {device}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

In [None]:
# Helper functions
def show_comparison(images, titles, figsize=(20, 10)):
    """Display images side by side for comparison."""
    n = len(images)
    fig, axes = plt.subplots(1, n, figsize=figsize)
    if n == 1:
        axes = [axes]
    
    for ax, img, title in zip(axes, images, titles):
        ax.imshow(img)
        ax.set_title(title, fontsize=12)
        ax.axis('off')
    
    plt.tight_layout()
    plt.show()

def get_memory_usage():
    """Get current GPU memory usage."""
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated() / 1e9
        reserved = torch.cuda.memory_reserved() / 1e9
        return f"Allocated: {allocated:.2f}GB, Reserved: {reserved:.2f}GB"
    return "N/A"

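def timed_generation(pipe, prompt, **kwargs):
    """Generate one image and return (image, elapsed_seconds)."""
    # Wait for pending GPU work so the timer covers only this generation.
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    
    image = pipe(prompt=prompt, **kwargs).images[0]
    
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    
    return image, elapsed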

print("Helper functions ready!")

---

## Part 2: Loading Flux Models

### Flux Variants

| Variant | Steps | Speed | Quality | Use Case |
|---------|-------|-------|---------|----------|
| **Flux-schnell** | 4 | Very Fast (~3s) | Good | Quick iterations, previews |
| **Flux-dev** | 20-50 | Moderate (~15s) | Excellent | Final renders, quality |
| **Flux-pro** | 25+ | Slow | Best | API only (commercial) |

**Note:** Flux requires accepting the license on Hugging Face. Visit the model page and accept first!
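
One way to keep the per-variant settings from the table in one place is a small preset dictionary. This is a sketch, not part of diffusers: `FLUX_PRESETS` and `generation_kwargs` are hypothetical names, and the 30-step default for Flux-dev is simply a midpoint of the 20-50 range above.

```python
# Settings taken from the variant table and the cells in this lab.
FLUX_PRESETS = {
    "schnell": {
        "model_id": "black-forest-labs/FLUX.1-schnell",
        "num_inference_steps": 4,
        "guidance_scale": 0.0,
    },
    "dev": {
        "model_id": "black-forest-labs/FLUX.1-dev",
        "num_inference_steps": 30,  # assumption: midpoint of 20-50
        "guidance_scale": 3.5,
    },
}


def generation_kwargs(variant, **overrides):
    """Return pipeline call kwargs for a variant, with optional overrides."""
    preset = dict(FLUX_PRESETS[variant])  # copy so the preset is not mutated
    preset.pop("model_id")                # model_id is for loading, not calling
    preset.update(overrides)
    return preset


print(generation_kwargs("schnell"))
print(generation_kwargs("dev", num_inference_steps=50))
```

The returned dictionary could then be unpacked into a pipeline call, for example `pipe(prompt=prompt, **generation_kwargs("dev"))`.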

In [None]:
# Load Flux-schnell (fast version)
print("Loading Flux-schnell...")
print(f"Memory before: {get_memory_usage()}")

start_time = time.time()

pipe_schnell = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,  # Native Blackwell support
)
pipe_schnell = pipe_schnell.to(device)

load_time = time.time() - start_time
print(f"\n✅ Flux-schnell loaded in {load_time:.1f} seconds")
print(f"Memory after: {get_memory_usage()}")

In [None]:
# Test Flux-schnell with a simple prompt
prompt = "A beautiful sunset over a calm ocean, photorealistic"

print("Generating with Flux-schnell...")

generator = torch.Generator(device=device).manual_seed(42)

image, gen_time = timed_generation(
    pipe_schnell,
    prompt=prompt,
    num_inference_steps=4,  # Schnell is optimized for 4 steps!
    generator=generator,
    guidance_scale=0.0,  # Schnell doesn't need guidance
)

print(f"⏱️ Generation time: {gen_time:.2f}s")
print(f"📐 Image size: {image.size}")

plt.figure(figsize=(10, 10))
plt.imshow(image)
plt.title(f"Flux-schnell (4 steps, {gen_time:.1f}s)")
plt.axis('off')
plt.show()

### Loading Flux-dev

Flux-dev is the higher quality version. Let's load it and compare.

In [None]:
# Clean up schnell to make room for dev
del pipe_schnell
gc.collect()
torch.cuda.empty_cache()
print(f"Memory after cleanup: {get_memory_usage()}")

In [None]:
# Load Flux-dev (higher quality version)
print("Loading Flux-dev...")
print(f"Memory before: {get_memory_usage()}")

start_time = time.time()

pipe_dev = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
)
pipe_dev = pipe_dev.to(device)

load_time = time.time() - start_time
print(f"\n✅ Flux-dev loaded in {load_time:.1f} seconds")
print(f"Memory after: {get_memory_usage()}")

In [None]:
# Test Flux-dev with different step counts
prompt = "A professional portrait of a wise old wizard with a long white beard, wearing ornate robes with celestial patterns, soft dramatic lighting, highly detailed, fantasy art"

step_counts = [20, 35, 50]
images = []
times = []

for steps in step_counts:
    print(f"Generating with {steps} steps...")
    generator = torch.Generator(device=device).manual_seed(42)
    
    image, gen_time = timed_generation(
        pipe_dev,
        prompt=prompt,
        num_inference_steps=steps,
        generator=generator,
        guidance_scale=3.5,
    )
    
    images.append(image)
    times.append(gen_time)
    print(f"  ⏱️ {gen_time:.1f}s")

# Display comparison
titles = [f"Flux-dev ({s} steps, {t:.1f}s)" for s, t in zip(step_counts, times)]
show_comparison(images, titles)
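
The timings from the step-count test can be split into a fixed overhead (text encoding, VAE decode) and a per-step cost with a simple least-squares line fit. The sketch below is standalone; the function name `fit_step_cost` is hypothetical and the example timings are made-up placeholders, not measurements from DGX Spark.

```python
def fit_step_cost(step_counts, times):
    """Least-squares fit of: time = overhead + per_step * steps."""
    n = len(step_counts)
    mean_steps = sum(step_counts) / n
    mean_time = sum(times) / n
    variance = sum((s - mean_steps) ** 2 for s in step_counts)
    covariance = sum(
        (s - mean_steps) * (t - mean_time) for s, t in zip(step_counts, times)
    )
    per_step = covariance / variance
    overhead = mean_time - per_step * mean_steps
    return overhead, per_step


# Hypothetical timings for illustration only; substitute your own
# `step_counts` and `times` lists from the cell above.
example_steps = [20, 35, 50]
example_times = [11.0, 18.5, 26.0]

overhead, per_step = fit_step_cost(example_steps, example_times)
print(f"Fixed overhead: {overhead:.2f}s, cost per step: {per_step:.2f}s")
```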

---

## Part 3: Flux vs SDXL Comparison

Let's do a head-to-head comparison on the same prompts!

In [None]:
# Load SDXL for comparison
from diffusers import StableDiffusionXLPipeline

print("Loading SDXL for comparison...")

pipe_sdxl = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.bfloat16,
    variant="fp16",  # load the fp16 checkpoint files; weights are cast to bfloat16
)
pipe_sdxl = pipe_sdxl.to(device)

print("✅ SDXL loaded")
print(f"Memory: {get_memory_usage()}")

In [None]:
# Comparison prompts - testing different aspects
comparison_prompts = [
    {
        "name": "Photorealism",
        "prompt": "A close-up portrait of a middle-aged woman with freckles, natural lighting, professional photography, 8K resolution",
    },
    {
        "name": "Text Rendering",
        "prompt": "A vintage wooden sign that says 'Welcome to the Future' in elegant script, rustic style",
    },
    {
        "name": "Complex Scene",
        "prompt": "A busy street market in Morocco at golden hour, spices, fabrics, local vendors, photorealistic, crowded, vibrant colors",
    },
    {
        "name": "Fantasy Art",
        "prompt": "A majestic dragon perched on a mountain peak at sunset, scales gleaming, fantasy art, highly detailed, epic scale",
    },
]

results = []

for test in comparison_prompts:
    print(f"\n📸 Testing: {test['name']}")
    print(f"   Prompt: {test['prompt'][:60]}...")
    
    # Generate with Flux-dev
    print("   Generating with Flux-dev...")
    generator = torch.Generator(device=device).manual_seed(42)
    flux_img, flux_time = timed_generation(
        pipe_dev,
        prompt=test['prompt'],
        num_inference_steps=30,
        generator=generator,
        guidance_scale=3.5,
    )
    print(f"   Flux: {flux_time:.1f}s")
    
    # Generate with SDXL
    print("   Generating with SDXL...")
    generator = torch.Generator(device=device).manual_seed(42)
    sdxl_img, sdxl_time = timed_generation(
        pipe_sdxl,
        prompt=test['prompt'],
        num_inference_steps=30,
        generator=generator,
        guidance_scale=7.5,
    )
    print(f"   SDXL: {sdxl_time:.1f}s")
    
    results.append({
        'name': test['name'],
        'flux_img': flux_img,
        'flux_time': flux_time,
        'sdxl_img': sdxl_img,
        'sdxl_time': sdxl_time,
    })

print("\n✅ All comparisons complete!")

In [None]:
# Display comparison results
fig, axes = plt.subplots(len(results), 2, figsize=(16, 8*len(results)))

for i, result in enumerate(results):
    # Flux result
    axes[i, 0].imshow(result['flux_img'])
    axes[i, 0].set_title(f"Flux-dev ({result['flux_time']:.1f}s)\n{result['name']}", fontsize=12)
    axes[i, 0].axis('off')
    
    # SDXL result
    axes[i, 1].imshow(result['sdxl_img'])
    axes[i, 1].set_title(f"SDXL ({result['sdxl_time']:.1f}s)\n{result['name']}", fontsize=12)
    axes[i, 1].axis('off')

plt.suptitle("Flux vs SDXL Comparison", fontsize=16, y=1.01)
plt.tight_layout()
plt.show()

### Analysis: When to Use Each Model

| Aspect | Flux | SDXL |
|--------|------|------|
| **Photorealism** | ⭐⭐⭐⭐⭐ Often more realistic | ⭐⭐⭐⭐ Very good |
| **Text Rendering** | ⭐⭐⭐⭐⭐ Much better! | ⭐⭐ Struggles |
| **Artistic Styles** | ⭐⭐⭐⭐ Good | ⭐⭐⭐⭐⭐ Excellent |
| **Speed (schnell)** | ⭐⭐⭐⭐⭐ ~3s | N/A |
| **Memory Usage** | ~24GB+ (12B-param transformer in bf16, before text encoders) | ~7GB |
| **ControlNet/LoRA** | Limited support | ⭐⭐⭐⭐⭐ Extensive |
| **Community Models** | Growing | ⭐⭐⭐⭐⭐ Huge ecosystem |
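
The table can be read as a rough decision rule. The function below is one possible reading, not an official recommendation: `recommend_model` and its flags are hypothetical, and the priority order (ecosystem needs first, then speed, then text and realism) is a judgment call.

```python
def recommend_model(
    needs_text=False,
    needs_controlnet_or_lora=False,
    photorealistic=False,
    fast_preview=False,
    commercial=False,
):
    """Suggest a model name based on the comparison table above."""
    # SDXL has the most extensive ControlNet/LoRA ecosystem.
    if needs_controlnet_or_lora:
        return "SDXL"
    # Flux-schnell is the 4-step variant for quick iterations.
    if fast_preview:
        return "Flux-schnell"
    # Flux is stronger at text rendering and photorealism;
    # Flux-dev is non-commercial, so fall back to schnell if needed.
    if needs_text or photorealistic:
        return "Flux-schnell" if commercial else "Flux-dev"
    # Default to SDXL for artistic styles and community models.
    return "SDXL"


print(recommend_model(needs_text=True))
print(recommend_model(photorealistic=True, commercial=True))
print(recommend_model())
```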

---

## Part 4: Text-in-Image Generation

One of Flux's standout capabilities is rendering readable text in images!

In [None]:
# Test text rendering capabilities
text_prompts = [
    "A coffee shop window with 'OPEN' written on a chalkboard sign, cozy atmosphere",
    "A book cover with the title 'The Last Adventure' in elegant serif font, fantasy style",
    "A neon sign that says 'CYBER CAFE' in a futuristic cityscape, night scene, rain",
    "A birthday cake with 'Happy 30th!' written in icing, colorful, celebration",
]

flux_images = []
sdxl_images = []

for prompt in text_prompts:
    print(f"Testing: {prompt[:40]}...")
    
    # Flux
    generator = torch.Generator(device=device).manual_seed(42)
    flux_img = pipe_dev(
        prompt=prompt,
        num_inference_steps=30,
        generator=generator,
        guidance_scale=3.5,
    ).images[0]
    flux_images.append(flux_img)
    
    # SDXL
    generator = torch.Generator(device=device).manual_seed(42)
    sdxl_img = pipe_sdxl(
        prompt=prompt,
        num_inference_steps=30,
        generator=generator,
        guidance_scale=7.5,
    ).images[0]
    sdxl_images.append(sdxl_img)

# Display
fig, axes = plt.subplots(len(text_prompts), 2, figsize=(14, 6 * len(text_prompts)))

for i, prompt in enumerate(text_prompts):
    axes[i, 0].imshow(flux_images[i])
    axes[i, 0].set_title(f"Flux: {prompt[:30]}...", fontsize=10)
    axes[i, 0].axis('off')
    
    axes[i, 1].imshow(sdxl_images[i])
    axes[i, 1].set_title(f"SDXL: {prompt[:30]}...", fontsize=10)
    axes[i, 1].axis('off')

plt.suptitle("Text Rendering: Flux vs SDXL", fontsize=14, y=1.01)
plt.tight_layout()
plt.show()

print("\n💡 Notice how Flux renders text much more accurately!")
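
Visual inspection can be backed by a rough score. The sketch below pulls the quoted target text out of a prompt and compares it with whatever an OCR tool reads from the image. The OCR step itself is outside this sketch; the recognized strings shown are hypothetical, and the helper names `target_text` and `text_accuracy` are illustrative.

```python
import re
from difflib import SequenceMatcher


def target_text(prompt):
    """Return the first single-quoted phrase in a prompt, or None."""
    match = re.search(r"'([^']+)'", prompt)
    return match.group(1) if match else None


def normalize(text):
    """Lowercase and collapse whitespace so formatting does not affect the score."""
    return " ".join(text.lower().split())


def text_accuracy(expected, recognized):
    """Similarity in [0, 1] between the expected text and OCR output."""
    return SequenceMatcher(None, normalize(expected), normalize(recognized)).ratio()


prompt = "A neon sign that says 'CYBER CAFE' in a futuristic cityscape, night scene, rain"
expected = target_text(prompt)

# Hypothetical OCR readings, for illustration only.
for model, recognized in [("Flux", "CYBER CAFE"), ("SDXL", "CYBR CAFE")]:
    print(f"{model}: {text_accuracy(expected, recognized):.2f}")
```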

---

## Part 5: Flux Generation Settings

Let's explore optimal settings for Flux models.

In [None]:
# Test different guidance scales for Flux-dev
prompt = "A serene Japanese garden with a red bridge over a koi pond, cherry blossoms, photorealistic"

guidance_scales = [1.0, 2.0, 3.5, 5.0, 7.0]
images = []

for gs in guidance_scales:
    print(f"Generating with guidance_scale={gs}...")
    generator = torch.Generator(device=device).manual_seed(42)
    
    image = pipe_dev(
        prompt=prompt,
        num_inference_steps=25,
        generator=generator,
        guidance_scale=gs,
    ).images[0]
    
    images.append(image)

# Display
fig, axes = plt.subplots(1, 5, figsize=(25, 5))
for ax, img, gs in zip(axes, images, guidance_scales):
    ax.imshow(img)
    ax.set_title(f"Guidance: {gs}")
    ax.axis('off')

plt.suptitle("Flux-dev Guidance Scale Comparison", fontsize=14, y=1.05)
plt.tight_layout()
plt.show()

print("\n📊 Flux-dev Guidance Scale Guide:")
print("  - 1.0-2.0: Very creative, may deviate from prompt")
print("  - 3.0-4.0: Recommended balance (3.5 is default)")
print("  - 5.0-7.0: Stronger prompt adherence, may oversaturate")

---

## Part 6: Performance Benchmarks on DGX Spark

In [None]:
# Comprehensive benchmark
print("🚀 DGX Spark Flux Benchmark")
print("=" * 50)

# Clean up SDXL for accurate Flux benchmarks
del pipe_sdxl
gc.collect()
torch.cuda.empty_cache()

benchmark_prompt = "A beautiful mountain landscape at sunset, photorealistic, 8K"

# Benchmark Flux-dev at different step counts
print("\nFlux-dev Benchmarks:")
for steps in [20, 30, 50]:
    times = []
    for _ in range(3):  # 3 runs for average
        generator = torch.Generator(device=device).manual_seed(42)
        torch.cuda.synchronize()
        start = time.time()
        _ = pipe_dev(
            prompt=benchmark_prompt,
            num_inference_steps=steps,
            generator=generator,
            guidance_scale=3.5,
        )
        torch.cuda.synchronize()
        times.append(time.time() - start)
    
    avg_time = sum(times) / len(times)
    print(f"  {steps} steps: {avg_time:.2f}s (avg of 3 runs)")

print(f"\nMemory Usage: {get_memory_usage()}")
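
A three-run average can hide warm-up effects and run-to-run variation. The sketch below shows one possible benchmark helper with a warm-up pass and basic statistics. It runs anywhere with a stand-in workload; in this notebook, `fn` would wrap a `pipe_dev(...)` call and `sync` would be `torch.cuda.synchronize`. The helper name `benchmark` is hypothetical.

```python
import statistics
import time


def benchmark(fn, runs=3, warmup=1, sync=None):
    """Time `fn` over several runs after discarding warm-up calls.

    `sync` is an optional callable (e.g. a GPU synchronize function)
    invoked before starting and after stopping the timer.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are not timed

    timings = []
    for _ in range(runs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings) if len(timings) > 1 else 0.0,
        "min": min(timings),
    }


# Stand-in workload so the sketch is runnable without a GPU.
stats = benchmark(lambda: sum(range(100_000)))
print(f"mean={stats['mean']:.4f}s stdev={stats['stdev']:.4f}s min={stats['min']:.4f}s")
```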

---

## ⚠️ Common Mistakes

### Mistake 1: Using SDXL Guidance Scales with Flux

```python
# ❌ Wrong: SDXL-style high guidance
pipe_dev(prompt="...", guidance_scale=7.5)  # Too high for Flux!

# ✅ Right: Flux-appropriate guidance
pipe_dev(prompt="...", guidance_scale=3.5)  # Flux default
```
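
Part of the reason the numbers are not interchangeable is that the two models use `guidance_scale` differently. In SDXL it weights classifier-free guidance, sketched below on plain lists of numbers. Flux-dev is guidance-distilled and, as far as we understand, receives the scale as a conditioning input in a single forward pass instead. The function below is an illustrative toy, not code from either pipeline.

```python
def classifier_free_guidance(uncond, cond, guidance_scale):
    """Combine unconditional and conditional noise predictions.

    Implements: guided = uncond + scale * (cond - uncond), element-wise.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy two-element "predictions" for illustration.
uncond = [0.0, 1.0]
cond = [1.0, 1.0]

for scale in (1.0, 3.5, 7.5):
    print(scale, classifier_free_guidance(uncond, cond, scale))
```

With a scale of 1.0 the result equals the conditional prediction; larger scales push further along the difference between the two predictions.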

### Mistake 2: Using Guidance with Schnell

```python
# ❌ Wrong: Schnell doesn't need guidance
pipe_schnell(prompt="...", guidance_scale=7.5)

# ✅ Right: Guidance-free distilled model
pipe_schnell(prompt="...", guidance_scale=0.0)
```

### Mistake 3: Wrong Step Count for Schnell

```python
# ❌ Wrong: Too many steps wastes time
pipe_schnell(prompt="...", num_inference_steps=50)  # No benefit!

# ✅ Right: Schnell is optimized for 4 steps
pipe_schnell(prompt="...", num_inference_steps=4)  # Sweet spot
```

---

## 🎉 Checkpoint

You've learned:
- ✅ How Flux differs architecturally from SDXL (DiT vs U-Net)
- ✅ Loading and using Flux-schnell (fast) and Flux-dev (quality)
- ✅ Optimal settings for Flux (guidance scale, steps)
- ✅ Flux's superior text rendering capabilities
- ✅ When to choose Flux vs SDXL for different tasks

---

## 🚀 Challenge (Optional)

1. **Speed Run**: Generate 10 images with Flux-schnell in under 30 seconds
2. **Typography**: Create a poster with complex text layout using Flux
3. **Style Matching**: Find prompts where SDXL beats Flux and vice versa

---

## 🧹 Cleanup

In [None]:
# Clean up
del pipe_dev
gc.collect()
torch.cuda.empty_cache()
print("GPU memory cleared!")

---

## Next Steps

Proceed to **Lab 2.6.5: LoRA Style Training** to learn how to train custom styles for SDXL!