# Diffusion and Flow Matching Tutorial

This tutorial shows how to use the modular implementations of diffusion models (DDPM) and Flow Matching for generative modeling on a mixture-of-Gaussians dataset.

## Setup

First, let's import the necessary components from our codebase:

In [None]:
import torch
from src.data.dataset import get_dataloaders
from src.utils.training import train_diffusion, train_flow
from src.utils.visualization import plot_model_distributions, visualize_flow_evolution
from src.utils.metrics import evaluate_generated_samples

## 1. Data Generation and Loading

We'll start by creating our data loaders for the mixture of Gaussians dataset:

In [None]:
# Set random seed for reproducibility
torch.manual_seed(42)

# Create data loaders
train_loader, test_loader, x_coords = get_dataloaders(
    batch_size=512,
    num_train=4000,
    num_test=1000,
    seed=42
)

# Move coordinates to device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x_coords = x_coords.to(device)

print(f"Using device: {device}")
print(f"Training set size: {len(train_loader.dataset)}")
print(f"Test set size: {len(test_loader.dataset)}")
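
To make the shape of the data concrete, here is a minimal sketch of how such a dataset could be built. It is an assumption, not the contents of `src.data.dataset`: the helper `make_mixture_curves`, the grid range, and the parameter ranges are all hypothetical. The sketch assumes each sample is a mixture-of-Gaussians density evaluated on a shared grid, which would explain why every sample has `len(x_coords)` entries.

```python
import math
import torch

def make_mixture_curves(num_samples, num_points=64, num_components=3, seed=0):
    """Hypothetical stand-in: each sample is a mixture-of-Gaussians density on a 1D grid."""
    gen = torch.Generator().manual_seed(seed)
    x_coords = torch.linspace(-5.0, 5.0, num_points)  # shared grid, shape (P,)

    # Random component parameters per sample, shape (N, K, 1).
    means = torch.rand(num_samples, num_components, 1, generator=gen) * 6.0 - 3.0
    stds = torch.rand(num_samples, num_components, 1, generator=gen) * 0.7 + 0.3
    logits = torch.randn(num_samples, num_components, 1, generator=gen)
    weights = torch.softmax(logits, dim=1)  # mixture weights sum to 1 over K

    # Gaussian pdf of every component on the grid, shape (N, K, P).
    z = (x_coords.view(1, 1, -1) - means) / stds
    pdfs = torch.exp(-0.5 * z ** 2) / (stds * math.sqrt(2.0 * math.pi))

    curves = (weights * pdfs).sum(dim=1)  # weighted sum over components, shape (N, P)
    return curves, x_coords

curves, grid = make_mixture_curves(num_samples=8)
print(curves.shape, grid.shape)
```

Under this assumption, a batch from `train_loader` would be a tensor of shape `(batch_size, len(x_coords))`.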

## 2. Training the Models

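Now we'll train both the diffusion (DDPM) and Flow Matching models. Note that `train_flow` also returns `test_data`, which we reuse later for plotting and evaluation: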

In [None]:
# Train Diffusion model
print("Training Diffusion Model...")
diffusion_model, _, diffusion_losses = train_diffusion(
    train_loader=train_loader,
    test_loader=test_loader,
    x_coords=x_coords,
    num_epochs=100,  # Reduced for tutorial
    save_interval=25,
    device=device
)
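
For readers who want to see what a DDPM training step typically optimizes, the following is a minimal sketch of the standard noise-prediction objective. It is not taken from `train_diffusion`: the function names, the linear beta schedule, and the `noise_model(x_t, t)` call signature are assumptions made for illustration.

```python
import torch

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = torch.linspace(beta_start, beta_end, num_steps)
    return torch.cumprod(1.0 - betas, dim=0)

def ddpm_loss(noise_model, x0, alpha_bars):
    """Noise-prediction loss: corrupt x0 at a random step and regress the added noise."""
    batch = x0.shape[0]
    t = torch.randint(0, alpha_bars.shape[0], (batch,), device=x0.device)
    a_bar = alpha_bars.to(x0.device)[t].unsqueeze(-1)  # shape (batch, 1)
    eps = torch.randn_like(x0)
    # Closed-form forward process: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    return torch.nn.functional.mse_loss(noise_model(x_t, t), eps)

# Usage with a placeholder model that always predicts zero noise.
alpha_bars = make_alpha_bars()
x0 = torch.randn(256, 16)
loss = ddpm_loss(lambda x_t, t: torch.zeros_like(x_t), x0, alpha_bars)
print(loss.item())
```

A model that always predicts zero gives a loss close to the variance of the noise, which is a handy baseline when reading `diffusion_losses`.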

In [None]:
# Train Flow Matching model
print("Training Flow Matching Model...")
flow_model, _, test_data, flow_losses = train_flow(
    train_loader=train_loader,
    test_loader=test_loader,
    x_coords=x_coords,
    num_epochs=100,  # Reduced for tutorial
    save_interval=25,
    device=device
)
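
The Flow Matching objective can be sketched in a similarly compact way. The version below assumes the common linear interpolation path between noise at `t=0` and data at `t=1`, which matches the direction used for sampling later in this tutorial. The function name and the `velocity_model(x_t, t)` signature are hypothetical, and `train_flow` may use a different path or parameterization.

```python
import torch

def flow_matching_loss(velocity_model, x1):
    """Conditional flow matching loss for the linear path x_t = (1 - t) * x0 + t * x1."""
    x0 = torch.randn_like(x1)                          # noise endpoint at t = 0
    t = torch.rand(x1.shape[0], 1, device=x1.device)   # one time per sample, in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1                      # point on the straight path
    target = x1 - x0                                   # constant velocity along that path
    return torch.nn.functional.mse_loss(velocity_model(x_t, t), target)

# Usage with a placeholder model that always predicts zero velocity.
x1 = torch.zeros(128, 8)
loss = flow_matching_loss(lambda x_t, t: torch.zeros_like(x_t), x1)
print(loss.item())
```

Because the regression target is a simple difference of the two endpoints, no simulation of the flow is needed during training; integration only happens at sampling time.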

## 3. Generating and Comparing Samples

Let's generate samples from both models and compare them:

In [None]:
# Generate samples
print("Generating samples...")
with torch.no_grad():
    num_samples = 100
    
    # Generate Diffusion samples
    diffusion_samples = diffusion_model.generate_samples(
        num_samples=num_samples,
        device=device,
        x_coords=x_coords
    )
    
    # Generate Flow samples: start from Gaussian noise at t=0 and integrate to t=1
    flow_samples = torch.randn(num_samples, len(x_coords), device=device)
    time_steps = torch.linspace(0, 1.0, 100, device=device)  # 100 time points, 99 steps
    
    for i in range(len(time_steps)-1):
        flow_samples = flow_model.step(
            x_t=flow_samples,
            t_start=time_steps[i],
            t_end=time_steps[i+1]
        )

# Compare distributions
plot_model_distributions(
    diffusion_samples=diffusion_samples,
    flow_samples=flow_samples,
    x_coords=x_coords,
    test_data=test_data,
    num_samples=num_samples
)
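
The sampling loop above repeatedly calls `flow_model.step`, whose internals are not shown in this tutorial. As a rough illustration of what such a step could look like, the sketch below uses an explicit Euler update; the real method may use a different integrator. The names `euler_step` and `point_mass_velocity` are hypothetical. To keep the example checkable, it uses a toy velocity field that is known in closed form: when the target distribution is a single point and the path is linear, the velocity is `(target - x) / (1 - t)`.

```python
import torch

def euler_step(velocity_fn, x_t, t_start, t_end):
    """One explicit Euler step of dx/dt = v(x, t) from t_start to t_end."""
    # Broadcast the scalar time to one value per sample, shape (N, 1).
    t = torch.as_tensor(t_start, dtype=x_t.dtype, device=x_t.device)
    t = t.expand(x_t.shape[0], 1)
    return x_t + (t_end - t_start) * velocity_fn(x_t, t)

def point_mass_velocity(target):
    """Closed-form velocity field that transports any start point to `target` at t = 1."""
    def velocity(x, t):
        return (target - x) / (1.0 - t)
    return velocity

torch.manual_seed(0)
target = torch.tensor([2.0, -1.0])
samples = torch.randn(4, 2)                      # noise at t = 0
time_steps = torch.linspace(0, 1.0, 100)
velocity = point_mass_velocity(target)

# The last step starts at t < 1, so the field is never evaluated at t = 1.
for i in range(len(time_steps) - 1):
    samples = euler_step(velocity, samples, time_steps[i], time_steps[i + 1])

print(samples)  # every row should be close to `target`
```

With a learned velocity field the trajectories are generally curved, so the number of time steps trades off sampling cost against integration error.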

## 4. Evaluating Sample Quality

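We can evaluate the quality of the generated samples by comparing each model's output against the first `num_samples` test examples. `evaluate_generated_samples` returns a dictionary that maps metric names to values: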

In [None]:
# Evaluate both models against the same reference set
reference = test_data[:num_samples].cpu()
diffusion_metrics = evaluate_generated_samples(diffusion_samples.cpu(), reference)
flow_metrics = evaluate_generated_samples(flow_samples.cpu(), reference)

# Display metrics
print("\nModel Comparison Metrics:")
print("-" * 40)
print(f"{'Metric':<15} {'Diffusion':>10} {'Flow':>10}")
print("-" * 40)
for metric in diffusion_metrics.keys():
    print(f"{metric:<15} {diffusion_metrics[metric]:>10.4f} {flow_metrics[metric]:>10.4f}")
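
The tutorial does not list which metrics `evaluate_generated_samples` computes. As one possible illustration, the sketch below returns a dictionary with three simple statistics: the mean absolute error between per-coordinate means, the same for standard deviations, and a biased estimate of the squared maximum mean discrepancy (MMD) with an RBF kernel. The function name, the metric names, and the median bandwidth heuristic are assumptions, not the library's actual choices.

```python
import torch

def summary_metrics(generated, reference):
    """Hypothetical metric set comparing two sample batches of shape (N, P)."""
    mean_err = (generated.mean(dim=0) - reference.mean(dim=0)).abs().mean().item()
    std_err = (generated.std(dim=0) - reference.std(dim=0)).abs().mean().item()

    # Pairwise squared distances over the pooled samples, shape (2N, 2N).
    pooled = torch.cat([generated, reference], dim=0)
    d2 = (pooled[:, None, :] - pooled[None, :, :]).pow(2).sum(dim=-1)
    bandwidth = d2.median().clamp_min(1e-12)     # median heuristic for the kernel width
    k = torch.exp(-d2 / bandwidth)

    # Biased MMD^2 estimate: E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    n = generated.shape[0]
    mmd2 = (k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()).item()

    return {"mean_abs_err": mean_err, "std_abs_err": std_err, "mmd2": mmd2}

torch.manual_seed(0)
ref = torch.randn(100, 8)
print(summary_metrics(ref + 2.0, ref))
```

All three values are plain floats, so a dictionary like this can be printed with the same formatting loop used above. Lower values indicate a closer match to the reference samples.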

## 5. Visualizing Flow Evolution

Finally, let's visualize how the Flow Matching model transforms noise into samples:

In [None]:
# Generate evolution samples (gradients are not needed at inference time)
x_samples = []
time_steps = torch.linspace(0, 1.0, 100, device=device)

with torch.no_grad():
    current_sample = torch.randn(1, len(x_coords), device=device)
    x_samples.append(current_sample)

    for i in range(len(time_steps)-1):
        current_sample = flow_model.step(
            x_t=current_sample,
            t_start=time_steps[i],
            t_end=time_steps[i+1]
        )
        x_samples.append(current_sample)

# Visualize evolution
visualize_flow_evolution(
    x_samples=x_samples,
    x_coords=x_coords,
    time_steps=time_steps.tolist()
)
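
If plotting all 100 intermediate states is too dense, a few evenly spaced snapshots are often enough to see how noise turns into a sample. The helper below is a hypothetical convenience, not part of `src.utils.visualization`. It assumes `x_samples` is a list of tensors of shape `(1, P)` and `time_steps` is a list of floats of the same length, as constructed above.

```python
import torch

def select_snapshots(x_samples, time_steps, num_frames=5):
    """Pick evenly spaced frames (including the first and last) from a trajectory."""
    trajectory = torch.cat(x_samples, dim=0)  # shape (T, P)
    idx = torch.linspace(
        0, trajectory.shape[0] - 1, num_frames, device=trajectory.device
    ).round().long()
    times = [time_steps[i] for i in idx.tolist()]
    return trajectory[idx], times

# Usage on a dummy trajectory whose k-th frame is filled with the value k.
dummy_samples = [torch.full((1, 3), float(k)) for k in range(100)]
dummy_times = torch.linspace(0, 1.0, 100).tolist()
frames, times = select_snapshots(dummy_samples, dummy_times)
print(frames.shape, times)
```

The returned `frames` tensor has one row per snapshot, so each row can be plotted against `x_coords` and labeled with the matching entry of `times`.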