# 🎨 Generative Models: GANs, VAE & Diffusion

**Author**: Data Science Master System  
**Difficulty**: ⭐⭐⭐⭐ Advanced  
**Time**: 90 minutes  
**Prerequisites**: 13_cv_image_segmentation

## Learning Objectives
- Understand generative vs discriminative models
- Implement GAN and VAE architectures
- Learn about Stable Diffusion
- Recognize applications such as image-to-image translation

In [None]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Device: {device}")

## 1. GAN: Generator & Discriminator

In [None]:
class Generator(nn.Module):
    def __init__(self, latent_dim=100, img_shape=(1, 28, 28)):
        super().__init__()
        self.img_shape = img_shape
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(256),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(512),
            nn.Linear(512, 1024),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(1024),
            nn.Linear(1024, int(np.prod(img_shape))),
            nn.Tanh()
        )
    
    def forward(self, z):
        img = self.model(z)
        return img.view(img.size(0), *self.img_shape)

class Discriminator(nn.Module):
    def __init__(self, img_shape=(1, 28, 28)):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(int(np.prod(img_shape)), 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )
    
    def forward(self, img):
        return self.model(img.view(img.size(0), -1))

G = Generator().to(device)
D = Discriminator().to(device)
print(f"Generator params: {sum(p.numel() for p in G.parameters()):,}")
print(f"Discriminator params: {sum(p.numel() for p in D.parameters()):,}")
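The two networks above are only defined, not trained. As a minimal sketch of how one alternating GAN update could look, the function below updates the discriminator first and then the generator, using binary cross-entropy because `Discriminator` ends in a `Sigmoid`. The name `gan_train_step`, the stand-in networks `G_demo` and `D_demo`, the random "real" batch, and the Adam settings (`lr=2e-4`, `betas=(0.5, 0.999)`) are all assumptions for illustration, not part of the original notebook.

```python
import torch
import torch.nn as nn

def gan_train_step(G, D, real_imgs, opt_G, opt_D, latent_dim=100):
    """One alternating update: discriminator first, then generator (hypothetical helper)."""
    criterion = nn.BCELoss()  # D outputs probabilities, so plain BCE applies
    batch_size = real_imgs.size(0)
    dev = real_imgs.device
    real_labels = torch.ones(batch_size, 1, device=dev)
    fake_labels = torch.zeros(batch_size, 1, device=dev)

    # Sample latent vectors and generate a batch of fake images
    z = torch.randn(batch_size, latent_dim, device=dev)
    fake_imgs = G(z)

    # Discriminator update: push real -> 1 and fake -> 0.
    # detach() keeps this loss from sending gradients into G.
    opt_D.zero_grad()
    d_loss = criterion(D(real_imgs), real_labels) + \
             criterion(D(fake_imgs.detach()), fake_labels)
    d_loss.backward()
    opt_D.step()

    # Generator update: try to make D label the fakes as real
    opt_G.zero_grad()
    g_loss = criterion(D(fake_imgs), real_labels)
    g_loss.backward()
    opt_G.step()

    return d_loss.item(), g_loss.item()

# Tiny stand-in networks so the snippet runs on its own; in the notebook,
# pass the Generator and Discriminator instances (G, D) instead.
G_demo = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
D_demo = nn.Sequential(nn.Flatten(), nn.Linear(784, 1), nn.Sigmoid())

# Assumed optimizer settings, a common choice for GANs
opt_G = torch.optim.Adam(G_demo.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D_demo.parameters(), lr=2e-4, betas=(0.5, 0.999))

# Random batch scaled to [-1, 1] to match the generator's Tanh output range
real_batch = torch.rand(16, 1, 28, 28) * 2 - 1
d_loss, g_loss = gan_train_step(G_demo, D_demo, real_batch, opt_G, opt_D)
print(f"D loss: {d_loss:.3f} | G loss: {g_loss:.3f}")
```

In a real run, `real_batch` would come from a `DataLoader` over images normalized to [-1, 1], and the step would be repeated over many batches.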

## 2. VAE: Variational Autoencoder

In [None]:
class VAE(nn.Module):
    def __init__(self, latent_dim=20):
        super().__init__()
        self.latent_dim = latent_dim
        
        # Encoder
        self.encoder = nn.Sequential(
            nn.Linear(784, 400),
            nn.ReLU()
        )
        self.fc_mu = nn.Linear(400, latent_dim)
        self.fc_var = nn.Linear(400, latent_dim)  # outputs log-variance (logvar), not variance
        
        # Decoder
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 400),
            nn.ReLU(),
            nn.Linear(400, 784),
            nn.Sigmoid()
        )
    
    def encode(self, x):
        h = self.encoder(x.view(-1, 784))
        return self.fc_mu(h), self.fc_var(h)
    
    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std
    
    def decode(self, z):
        return self.decoder(z)
    
    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

vae = VAE().to(device)
print(f"VAE params: {sum(p.numel() for p in vae.parameters()):,}")
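The `VAE` class returns a reconstruction together with `mu` and `logvar`, but the notebook does not show the objective that uses them. A minimal sketch of the usual VAE loss follows: a summed binary cross-entropy reconstruction term (reasonable here because the decoder ends in a `Sigmoid`) plus the closed-form KL divergence between N(mu, sigma^2) and N(0, I). The function name `vae_loss` and the toy tensors are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    """Negative ELBO: reconstruction term + KL term (hypothetical helper)."""
    # Reconstruction: per-pixel BCE summed over the batch; x is flattened
    # to match the decoder output of shape (batch, 784)
    recon = F.binary_cross_entropy(
        recon_x, x.view(recon_x.size(0), -1), reduction='sum'
    )
    # KL(N(mu, sigma^2) || N(0, I)) = -0.5 * sum(1 + logvar - mu^2 - exp(logvar))
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl, recon, kl

# Toy inputs (assumed values, just to exercise the function)
recon_x = torch.full((2, 784), 0.5)   # decoder output in (0, 1)
x = torch.zeros(2, 1, 28, 28)         # target pixels in [0, 1]
mu = torch.ones(2, 3)                 # non-zero means give a positive KL
logvar = torch.zeros(2, 3)            # unit variance

total, recon, kl = vae_loss(recon_x, x, mu, logvar)
print(f"total={total.item():.2f} recon={recon.item():.2f} kl={kl.item():.2f}")
```

With the notebook's model, the call would be `recon_x, mu, logvar = vae(x)` followed by `vae_loss(recon_x, x, mu, logvar)`, provided the inputs are scaled to [0, 1].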

## 3. Diffusion Models Overview

In [None]:
diffusion_info = '''
🎨 Diffusion Models (Stable Diffusion, DALL-E 3):

1. Forward Process: Add noise gradually
   x_t = √(α_t) * x_0 + √(1-α_t) * ε
   (α_t is the cumulative signal fraction at step t, often written ᾱ_t; ε ~ N(0, I))

2. Reverse Process: Learn to denoise
   Model predicts noise ε at each step

3. Training: 
   - Add noise to image
   - Predict the noise
   - MSE loss between predicted and actual noise

4. Generation:
   - Start with pure noise
   - Iteratively denoise (50-1000 steps)
   - Get final image
'''
print(diffusion_info)
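The four steps printed above can be sketched in a few lines of PyTorch. This is a toy DDPM-style illustration under stated assumptions, not how Stable Diffusion is implemented: the linear beta schedule (`1e-4` to `0.02` over `T = 1000` steps), the helper names `q_sample`, `diffusion_loss`, and `p_sample_loop`, and the small MLP `TinyNoisePredictor` are all hypothetical. Practical systems use a U-Net or transformer and often run in a latent space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)    # cumulative signal fraction per step

def q_sample(x0, t, noise):
    """Forward process: jump straight from x_0 to x_t."""
    a_bar = alpha_bars[t].view(-1, *([1] * (x0.dim() - 1)))  # broadcast over image dims
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise

class TinyNoisePredictor(nn.Module):
    """Toy stand-in for the denoising network (hypothetical)."""
    def __init__(self, dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, x_t, t):
        t_feat = (t.float() / T).unsqueeze(1)   # crude timestep conditioning
        flat = x_t.view(x_t.size(0), -1)
        return self.net(torch.cat([flat, t_feat], dim=1)).view_as(x_t)

def diffusion_loss(model, x0):
    """Training: noise the image at a random step, then predict that noise."""
    t = torch.randint(0, T, (x0.size(0),))
    noise = torch.randn_like(x0)
    return F.mse_loss(model(q_sample(x0, t, noise), t), noise)

@torch.no_grad()
def p_sample_loop(model, shape):
    """Generation: start from pure noise and denoise step by step."""
    x = torch.randn(shape)
    for i in reversed(range(T)):
        t = torch.full((shape[0],), i, dtype=torch.long)
        eps = model(x, t)
        # Posterior mean from the predicted noise
        mean = (x - betas[i] / (1 - alpha_bars[i]).sqrt() * eps) / alphas[i].sqrt()
        # Add fresh noise at every step except the last one
        x = mean + betas[i].sqrt() * torch.randn_like(x) if i > 0 else mean
    return x

model = TinyNoisePredictor()
x0 = torch.rand(4, 1, 28, 28) * 2 - 1        # random stand-in for real images
loss = diffusion_loss(model, x0)
samples = p_sample_loop(model, (2, 1, 28, 28))
print(f"loss={loss.item():.3f}, samples shape={tuple(samples.shape)}")
```

Because the model here is untrained, `samples` is just noise; the point is the shape of the training and sampling loops, and why generation is slow (one network call per step).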

In [None]:
# Using Stable Diffusion (HuggingFace)
sd_code = '''
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # mirror of the original runwayml checkpoint, which was removed from the Hub
    torch_dtype=torch.float16
).to("cuda")

image = pipe("A beautiful sunset over mountains").images[0]
image.save("generated.png")
'''
print("üìã Stable Diffusion Usage:")
print(sd_code)

## 4. Model Comparison

In [None]:
import pandas as pd

comparison = pd.DataFrame({
    'Model': ['GAN', 'VAE', 'Diffusion', 'Flow'],
    'Quality': ['High', 'Medium', 'Very High', 'High'],
    'Training': ['Unstable', 'Stable', 'Stable', 'Stable'],
    'Sampling speed': ['Fast', 'Fast', 'Slow', 'Medium'],
    'Diversity': ['Mode collapse risk', 'Good', 'Excellent', 'Good']
})

display(comparison)

## 5. Applications

In [None]:
applications = [
    ('🖼️ Image Generation', 'Art, design, content creation'),
    ('🔄 Style Transfer', 'Apply artistic styles to photos'),
    ('⬆️ Super Resolution', 'Enhance image quality'),
    ('🎭 Face Generation', 'Synthetic faces for privacy'),
    ('🏥 Medical', 'Synthetic training data'),
    ('🎮 Gaming', 'Asset generation, textures'),
]

print("üöÄ Generative Model Applications:")
for name, desc in applications:
    print(f"  {name}: {desc}")

## 🎯 Key Takeaways
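1. GANs: Fast sampling, but training is unstable and prone to mode collapse
2. VAEs: Stable training, good for learning compact latent representations
3. Diffusion: Best sample quality, but generation is slow (many denoising steps)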
4. Use case determines model choice

**Next**: 15_nlp_text_classification.ipynb