# VAEs and Advanced GANs with tf.keras

**File Location:** `TensorVerseHub/notebooks/05_generative_models/14_vaes_advanced_gans_keras.ipynb`

Work through Variational Autoencoders (VAEs) and simplified versions of advanced GAN architectures (StyleGAN, Progressive GAN, and CycleGAN) using tf.keras. Along the way, explore latent space interpolation and manipulation, and disentangled representations with β-VAE.

## Learning Objectives
- Implement Variational Autoencoders with the reparameterization trick
- Build simplified StyleGAN and Progressive GAN architectures
- Implement CycleGAN for unpaired image-to-image translation
- Explore latent space interpolation and manipulation
- Apply disentangled representation learning with β-VAE
- Implement the loss functions and custom training steps these models need

---

## 1. Variational Autoencoder (VAE) Implementation
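
As orientation for the code in this section, the quantities computed by `reparameterize` and `compute_loss` can be written out as below. The symbols are notation introduced here rather than names from the notebook: $\mu$ corresponds to `z_mean`, $\log\sigma^2$ to `z_log_var`, $d$ to `latent_dim`, and $\beta$ to `beta`. The reconstruction term is assumed to be the Bernoulli negative log-likelihood (binary cross-entropy), which matches the sigmoid decoder output.

```latex
% Reparameterization: sample z as a deterministic function of (mu, sigma) and noise
z = \mu + \sigma \odot \epsilon,
\qquad \sigma = \exp\!\left(\tfrac{1}{2}\log\sigma^{2}\right),
\qquad \epsilon \sim \mathcal{N}(0, I)

% KL divergence between the diagonal Gaussian posterior and the standard normal prior
D_{\mathrm{KL}}\!\left(q(z \mid x)\,\|\,\mathcal{N}(0, I)\right)
  = -\tfrac{1}{2}\sum_{j=1}^{d}\left(1 + \log\sigma_{j}^{2} - \mu_{j}^{2} - \sigma_{j}^{2}\right)

% Beta-weighted training objective (beta = 1 gives the standard VAE)
\mathcal{L}(x) = \mathbb{E}_{q(z \mid x)}\!\left[-\log p(x \mid z)\right]
  + \beta\, D_{\mathrm{KL}}\!\left(q(z \mid x)\,\|\,\mathcal{N}(0, I)\right)
```

Because $\epsilon$ carries all of the randomness, gradients of $\mathcal{L}$ can flow through $z$ into $\mu$ and $\log\sigma^2$, which is what makes the sampling step trainable.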

```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_probability as tfp
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import warnings
warnings.filterwarnings('ignore')

print(f"TensorFlow version: {tf.__version__}")
tf.random.set_seed(42)
np.random.seed(42)

# Variational Autoencoder implementation
class VAE(tf.keras.Model):
    """Variational Autoencoder with reparameterization trick"""
    
    def __init__(self, latent_dim=2, intermediate_dim=64, input_shape=(28, 28, 1), beta=1.0):
        super(VAE, self).__init__()
        self.latent_dim = latent_dim
        self.beta = beta  # β-VAE parameter for disentanglement
        
        # Encoder architecture
        self.encoder = tf.keras.Sequential([
            layers.Flatten(input_shape=input_shape),
            layers.Dense(intermediate_dim, activation='relu'),
            layers.Dense(intermediate_dim, activation='relu'),
        ], name='encoder')
        
        # Latent space projections
        self.z_mean = layers.Dense(latent_dim, name='z_mean')
        self.z_log_var = layers.Dense(latent_dim, name='z_log_var')
        
        # Decoder architecture
        self.decoder = tf.keras.Sequential([
            layers.Dense(intermediate_dim, activation='relu', input_shape=(latent_dim,)),
            layers.Dense(intermediate_dim, activation='relu'),
            layers.Dense(np.prod(input_shape), activation='sigmoid'),
            layers.Reshape(input_shape)
        ], name='decoder')
    
    def encode(self, x):
        """Encode input to latent parameters"""
        h = self.encoder(x)
        z_mean = self.z_mean(h)
        z_log_var = self.z_log_var(h)
        return z_mean, z_log_var
    
    def reparameterize(self, z_mean, z_log_var):
        """Reparameterization trick for backpropagation through random sampling"""
        batch_size = tf.shape(z_mean)[0]
        epsilon = tf.random.normal(shape=(batch_size, self.latent_dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
    
    def decode(self, z):
        """Decode latent vector to reconstruction"""
        return self.decoder(z)
    
    def call(self, inputs):
        """Forward pass through VAE"""
        z_mean, z_log_var = self.encode(inputs)
        z = self.reparameterize(z_mean, z_log_var)
        reconstructed = self.decode(z)
        return reconstructed, z_mean, z_log_var
    
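    def compute_loss(self, inputs):
        """Compute VAE loss (reconstruction + KL divergence).

        Note: this shadows keras.Model.compute_loss with a different
        signature, so train_step and test_step are both overridden below.
        """
        reconstructed, z_mean, z_log_var = self(inputs)

        # Reconstruction loss: binary_crossentropy already averages over the
        # channel axis, so sum over height and width to get a per-image loss
        reconstruction_loss = tf.reduce_mean(
            tf.reduce_sum(
                tf.keras.losses.binary_crossentropy(inputs, reconstructed), axis=(1, 2)
            )
        )

        # KL divergence between q(z|x) and the standard normal prior
        kl_loss = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1)
        )

        # Total loss with β weighting
        total_loss = reconstruction_loss + self.beta * kl_loss

        return total_loss, reconstruction_loss, kl_loss

    def train_step(self, data):
        """Custom training step for VAE (fit() already compiles it into a graph)"""
        # fit(x, y) delivers (x, y) tuples; the VAE only needs the inputs
        inputs = data[0] if isinstance(data, (tuple, list)) else data

        with tf.GradientTape() as tape:
            total_loss, reconstruction_loss, kl_loss = self.compute_loss(inputs)

        gradients = tape.gradient(total_loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        return {
            'loss': total_loss,
            'reconstruction_loss': reconstruction_loss,
            'kl_loss': kl_loss
        }

    def test_step(self, data):
        """Evaluation step so that validation_data works with the custom loss"""
        inputs = data[0] if isinstance(data, (tuple, list)) else data
        total_loss, reconstruction_loss, kl_loss = self.compute_loss(inputs)

        return {
            'loss': total_loss,
            'reconstruction_loss': reconstruction_loss,
            'kl_loss': kl_loss
        }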

# Convolutional VAE for better image handling
class ConvVAE(tf.keras.Model):
    """Convolutional Variational Autoencoder for image data"""
    
    def __init__(self, latent_dim=128, input_shape=(64, 64, 3), beta=1.0):
        super(ConvVAE, self).__init__()
        self.latent_dim = latent_dim
        self.beta = beta
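        # Stored as image_shape to avoid clashing with the Keras
        # `input_shape` property (read-only in tf.keras 2.x)
        self.image_shape = input_shape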
        
        # Encoder
        self.encoder = tf.keras.Sequential([
            layers.Conv2D(32, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2D(128, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2D(256, 3, strides=2, padding='same', activation='relu'),
            layers.Flatten(),
            layers.Dense(512, activation='relu')
        ], name='conv_encoder')
        
        # Latent projections
        self.z_mean = layers.Dense(latent_dim)
        self.z_log_var = layers.Dense(latent_dim)
        
        # Decoder
        self.decoder_dense = layers.Dense(4 * 4 * 256, activation='relu')
        self.decoder_reshape = layers.Reshape((4, 4, 256))
        
        self.decoder_conv = tf.keras.Sequential([
            layers.Conv2DTranspose(128, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2DTranspose(64, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2DTranspose(input_shape[-1], 3, strides=2, padding='same', activation='sigmoid')
        ], name='conv_decoder')
    
    def encode(self, x):
        h = self.encoder(x)
        return self.z_mean(h), self.z_log_var(h)
    
    def decode(self, z):
        h = self.decoder_dense(z)
        h = self.decoder_reshape(h)
        return self.decoder_conv(h)
    
    def reparameterize(self, z_mean, z_log_var):
        batch_size = tf.shape(z_mean)[0]
        epsilon = tf.random.normal(shape=(batch_size, self.latent_dim))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
    
    def call(self, inputs):
        z_mean, z_log_var = self.encode(inputs)
        z = self.reparameterize(z_mean, z_log_var)
        reconstructed = self.decode(z)
        return reconstructed, z_mean, z_log_var
    
    def compute_loss(self, inputs):
        reconstructed, z_mean, z_log_var = self(inputs)
        
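        # Reconstruction loss: binary_crossentropy averages over the channel
        # axis and returns (batch, height, width), so sum over axes 1 and 2
        reconstruction_loss = tf.reduce_mean(
            tf.reduce_sum(
                tf.keras.losses.binary_crossentropy(inputs, reconstructed), axis=(1, 2)
            )
        )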
        
        # KL divergence
        kl_loss = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1)
        )
        
        return reconstruction_loss + self.beta * kl_loss, reconstruction_loss, kl_loss

# Load and prepare data
def load_mnist_data():
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train = x_train.astype('float32') / 255.0
    x_test = x_test.astype('float32') / 255.0
    x_train = np.expand_dims(x_train, -1)
    x_test = np.expand_dims(x_test, -1)
    
    print(f"MNIST data: Train {x_train.shape}, Test {x_test.shape}")
    return (x_train, y_train), (x_test, y_test)

# Test VAE implementation
print("=== VAE Implementation Test ===")

(x_train, y_train), (x_test, y_test) = load_mnist_data()

# Create and compile VAE
vae = VAE(latent_dim=2, input_shape=(28, 28, 1), beta=1.0)
vae.compile(optimizer=tf.keras.optimizers.Adam(1e-3))

# Build model with sample data
sample_data = x_train[:100]
_ = vae(sample_data)

print("VAE Architecture:")
print(f"Encoder: {vae.encoder.count_params():,} parameters")
print(f"Decoder: {vae.decoder.count_params():,} parameters")
print(f"Total: {vae.count_params():,} parameters")

# Train VAE for demo
print("\nTraining VAE (demo - 5 epochs)...")
history = vae.fit(x_train[:5000], x_train[:5000], 
                  epochs=5, batch_size=128, 
                  validation_data=(x_test[:1000], x_test[:1000]),
                  verbose=1)

# Visualize VAE results
def visualize_vae_results(vae, test_data, n_samples=10):
    """Visualize VAE reconstruction and generation"""
    
    # Original vs Reconstructed
    test_samples = test_data[:n_samples]
    reconstructed, z_mean, z_log_var = vae(test_samples)
    
    plt.figure(figsize=(15, 6))
    
    # Original images
    for i in range(n_samples):
        plt.subplot(3, n_samples, i + 1)
        plt.imshow(test_samples[i, :, :, 0], cmap='gray')
        plt.title(f'Original {i+1}')
        plt.axis('off')
    
    # Reconstructed images
    for i in range(n_samples):
        plt.subplot(3, n_samples, n_samples + i + 1)
        plt.imshow(reconstructed[i, :, :, 0], cmap='gray')
        plt.title(f'Reconstructed {i+1}')
        plt.axis('off')
    
    # Generated images
    random_z = tf.random.normal((n_samples, vae.latent_dim))
    generated = vae.decode(random_z)
    
    for i in range(n_samples):
        plt.subplot(3, n_samples, 2 * n_samples + i + 1)
        plt.imshow(generated[i, :, :, 0], cmap='gray')
        plt.title(f'Generated {i+1}')
        plt.axis('off')
    
    plt.suptitle('VAE Results: Original, Reconstructed, Generated')
    plt.tight_layout()
    plt.show()

visualize_vae_results(vae, x_test)
```
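
The closed-form KL term used in `compute_loss` can be sanity-checked against a Monte Carlo estimate. The sketch below is NumPy-only so that it isolates the math from the Keras training code. The helper names `kl_closed_form` and `kl_monte_carlo` and the example values are hypothetical, chosen for illustration. It assumes a diagonal Gaussian posterior and a standard normal prior, as in the `VAE` class above.

```python
import numpy as np

def kl_closed_form(z_mean, z_log_var):
    """KL(N(z_mean, exp(z_log_var)) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=-1)

def kl_monte_carlo(z_mean, z_log_var, n_samples=200_000, seed=0):
    """Estimate the same KL as the sample mean of log q(z) - log p(z), z ~ q."""
    rng = np.random.default_rng(seed)
    std = np.exp(0.5 * z_log_var)
    eps = rng.standard_normal((n_samples,) + z_mean.shape)
    z = z_mean + std * eps  # reparameterization trick
    log_q = -0.5 * np.sum(np.log(2 * np.pi) + z_log_var + ((z - z_mean) / std) ** 2, axis=-1)
    log_p = -0.5 * np.sum(np.log(2 * np.pi) + z**2, axis=-1)
    return np.mean(log_q - log_p, axis=0)

# Hypothetical posterior parameters for a 2-D latent space
z_mean = np.array([0.5, -1.0])
z_log_var = np.array([0.0, -0.7])

exact = kl_closed_form(z_mean, z_log_var)
estimate = kl_monte_carlo(z_mean, z_log_var)
print(f"closed form: {exact:.4f}, Monte Carlo: {estimate:.4f}")
```

The two numbers should agree up to sampling noise, and the closed form is zero when `z_mean` and `z_log_var` are both zero, that is, when the posterior equals the prior.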

## 2. Latent Space Analysis and Manipulation

```python
# Latent space analysis tools
class LatentSpaceAnalyzer:
    """Tools for analyzing and manipulating VAE latent space"""
    
    def __init__(self, vae_model):
        self.vae = vae_model
    
    def plot_latent_space(self, data, labels, n_samples=2000):
        """Plot 2D latent space with class labels"""
        
        if self.vae.latent_dim != 2:
            print("Latent space visualization only available for 2D latent space")
            return
        
        # Encode samples
        sample_data = data[:n_samples]
        sample_labels = labels[:n_samples]
        
        z_mean, _ = self.vae.encode(sample_data)
        
        # Plot latent space
        plt.figure(figsize=(12, 10))
        scatter = plt.scatter(z_mean[:, 0], z_mean[:, 1], c=sample_labels, 
                            cmap='tab10', alpha=0.6, s=20)
        plt.colorbar(scatter)
        plt.xlabel('Latent Dimension 1')
        plt.ylabel('Latent Dimension 2')
        plt.title('VAE Latent Space Visualization')
        plt.grid(True, alpha=0.3)
        plt.show()
    
    def generate_latent_grid(self, grid_size=15, latent_range=3.0):
        """Generate images from a grid of latent space points"""
        
        if self.vae.latent_dim != 2:
            print("Grid generation only available for 2D latent space")
            return
        
        # Create grid of latent points
        x = np.linspace(-latent_range, latent_range, grid_size)
        y = np.linspace(-latent_range, latent_range, grid_size)
        
        figure = np.zeros((28 * grid_size, 28 * grid_size))
        
        # Rows follow the second latent dimension, columns the first
        for i, yi in enumerate(y):
            for j, xi in enumerate(x):
                z_sample = np.array([[xi, yi]], dtype='float32')
                generated = self.vae.decode(z_sample)
                digit = generated[0].numpy().reshape(28, 28)
                
                figure[i * 28:(i + 1) * 28, j * 28:(j + 1) * 28] = digit
        
        plt.figure(figsize=(12, 12))
        plt.imshow(figure, cmap='gray')
        plt.title(f'VAE Latent Space Grid ({grid_size}x{grid_size})')
        plt.axis('off')
        plt.show()
    
    def interpolate_between_points(self, point1, point2, n_steps=10):
        """Interpolate between two points in latent space"""
        
        # Linear interpolation, vectorized so the result has shape
        # (n_steps, latent_dim) even when the endpoints are (1, latent_dim)
        alphas = np.linspace(0, 1, n_steps)
        point1 = np.reshape(point1, (1, -1))
        point2 = np.reshape(point2, (1, -1))
        interpolated_points = (
            (1 - alphas[:, None]) * point1 + alphas[:, None] * point2
        ).astype('float32')

        # Generate images
        generated_images = self.vae.decode(interpolated_points)
        
        # Plot interpolation
        plt.figure(figsize=(20, 4))
        for i, img in enumerate(generated_images):
            plt.subplot(1, n_steps, i + 1)
            plt.imshow(img[:, :, 0], cmap='gray')
            plt.title(f'α={alphas[i]:.2f}')
            plt.axis('off')
        
        plt.suptitle('Latent Space Interpolation')
        plt.tight_layout()
        plt.show()
        
        return generated_images
    
    def find_latent_directions(self, data, labels, class1=0, class2=1):
        """Find meaningful directions in latent space"""
        
        # Get latent representations for two classes
        mask1 = labels == class1
        mask2 = labels == class2
        
        data1 = data[mask1][:500]
        data2 = data[mask2][:500]
        
        z_mean1, _ = self.vae.encode(data1)
        z_mean2, _ = self.vae.encode(data2)
        
        # Calculate direction vector
        direction = np.mean(z_mean2, axis=0) - np.mean(z_mean1, axis=0)
        direction = direction / np.linalg.norm(direction)
        
        return direction

# Beta-VAE for disentanglement
class BetaVAE(VAE):
    """β-VAE for disentangled representation learning"""
    
    def __init__(self, latent_dim=10, beta=4.0, **kwargs):
        super().__init__(latent_dim=latent_dim, beta=beta, **kwargs)
        self.beta = beta
    
    def disentanglement_metric(self, data, labels, n_samples=1000):
        """Simple disentanglement metric based on mutual information"""
        
        sample_data = data[:n_samples]
        sample_labels = labels[:n_samples]
        
        z_mean, _ = self.encode(sample_data)
        
        # Calculate mutual information between latent dimensions and labels
        mi_scores = []
        
        for dim in range(self.latent_dim):
            # Discretize latent dimension
            z_dim = z_mean[:, dim].numpy()
            z_discretized = np.digitize(z_dim, bins=np.linspace(z_dim.min(), z_dim.max(), 10))
            
            # Calculate mutual information (simplified)
            mi = self.mutual_information(z_discretized, sample_labels)
            mi_scores.append(mi)
        
        return np.array(mi_scores)
    
    def mutual_information(self, x, y):
        """Simplified mutual information calculation"""
        # This is a simplified version - use proper MI calculation in practice
        from sklearn.metrics import mutual_info_score
        return mutual_info_score(x, y)

# Test latent space analysis
print("=== Latent Space Analysis ===")

analyzer = LatentSpaceAnalyzer(vae)

# Plot latent space (only for 2D)
analyzer.plot_latent_space(x_test, y_test, n_samples=2000)

# Generate latent grid
analyzer.generate_latent_grid(grid_size=10, latent_range=2.0)

# Test interpolation
print("\nTesting latent space interpolation...")
# Find two random points in latent space
point1 = np.random.normal(0, 1, (1, vae.latent_dim))
point2 = np.random.normal(0, 1, (1, vae.latent_dim))

interpolated_images = analyzer.interpolate_between_points(point1, point2, n_steps=10)

# Test Beta-VAE
print("\n=== Beta-VAE for Disentanglement ===")

beta_vae = BetaVAE(latent_dim=10, beta=4.0, input_shape=(28, 28, 1))
beta_vae.compile(optimizer=tf.keras.optimizers.Adam(1e-3))

# Build model
_ = beta_vae(sample_data)

print("Training Beta-VAE (demo - 3 epochs)...")
beta_history = beta_vae.fit(x_train[:3000], x_train[:3000],
                           epochs=3, batch_size=128, verbose=1)

# Analyze disentanglement
disentanglement_scores = beta_vae.disentanglement_metric(x_test, y_test)

plt.figure(figsize=(10, 6))
plt.bar(range(len(disentanglement_scores)), disentanglement_scores)
plt.xlabel('Latent Dimension')
plt.ylabel('Disentanglement Score')
plt.title('β-VAE Disentanglement Analysis')
plt.grid(True, alpha=0.3)
plt.show()
```
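
The two latent-space operations used in this section, linear interpolation and a class-to-class direction vector, are easy to check on synthetic data before running them through a trained decoder. The sketch below is NumPy-only. The functions `lerp_latents` and `class_direction` are hypothetical helpers that mirror `interpolate_between_points` and `find_latent_directions`, and the two Gaussian blobs are an assumed stand-in for encoder outputs.

```python
import numpy as np

def lerp_latents(point1, point2, n_steps=10):
    """Return an (n_steps, latent_dim) array moving linearly from point1 to point2."""
    p1 = np.asarray(point1, dtype=np.float32).reshape(1, -1)
    p2 = np.asarray(point2, dtype=np.float32).reshape(1, -1)
    alphas = np.linspace(0.0, 1.0, n_steps, dtype=np.float32)[:, None]
    return (1.0 - alphas) * p1 + alphas * p2

def class_direction(z, labels, class1, class2):
    """Unit vector pointing from the mean code of class1 to the mean code of class2."""
    direction = z[labels == class2].mean(axis=0) - z[labels == class1].mean(axis=0)
    return direction / np.linalg.norm(direction)

# Synthetic stand-in for encoder output: two blobs in a 2-D latent space
rng = np.random.default_rng(0)
z = np.concatenate([
    rng.normal([-2.0, 0.0], 0.3, size=(200, 2)),  # "class 0" codes
    rng.normal([2.0, 0.0], 0.3, size=(200, 2)),   # "class 1" codes
])
labels = np.repeat([0, 1], 200)

direction = class_direction(z, labels, class1=0, class2=1)
# Walk from one class-0 code along the class-0 -> class-1 direction
path = lerp_latents(z[0], z[0] + 4.0 * direction, n_steps=5)
print("direction:", direction)
print("path shape:", path.shape)
```

With a trained model, `path` could be passed directly to `vae.decode` because it already has the `(n_steps, latent_dim)` shape the decoder expects.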

## 3. StyleGAN Implementation

```python
# StyleGAN implementation (simplified version)
class StyleGANGenerator(tf.keras.Model):
    """Simplified StyleGAN generator architecture"""
    
    def __init__(self, latent_dim=512, num_layers=8, img_size=64):
        super().__init__()
        self.latent_dim = latent_dim
        self.num_layers = num_layers
        self.img_size = img_size
        
        # Mapping network
        self.mapping_network = tf.keras.Sequential([
            layers.Dense(latent_dim, activation='relu') for _ in range(8)
        ] + [layers.Dense(latent_dim)], name='mapping_network')
        
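        # Synthesis network components
        # add_weight registers the learned constant with Keras so that it is
        # tracked as a trainable weight regardless of the Keras version
        self.constant = self.add_weight(
            shape=(1, 4, 4, 512),
            initializer=tf.keras.initializers.RandomNormal(stddev=1.0),
            trainable=True, name='constant'
        )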
        
        # Style modulation layers
        self.style_mods = []
        self.to_rgb_layers = []
        
        channels = [512, 512, 256, 128, 64, 32, 16]
        resolutions = [4, 8, 16, 32, 64, 128, 256]
        
        for i in range(min(len(channels), num_layers)):
            # Style modulation
            self.style_mods.append(
                StyleModulation(channels[i], name=f'style_mod_{i}')
            )
            
            # To RGB conversion
            self.to_rgb_layers.append(
                layers.Conv2D(3, 1, padding='same', name=f'to_rgb_{i}')
            )
    
    def call(self, latent_codes, truncation_psi=1.0):
        batch_size = tf.shape(latent_codes)[0]
        
        # Map latent codes to intermediate latent space W
        w = self.mapping_network(latent_codes)
        
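        # Apply truncation trick (simplified: the batch mean stands in for the
        # running average of w that StyleGAN tracks during training)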
        if truncation_psi < 1.0:
            w_avg = tf.reduce_mean(w, axis=0, keepdims=True)
            w = w_avg + truncation_psi * (w - w_avg)
        
        # Start with learned constant
        x = tf.tile(self.constant, [batch_size, 1, 1, 1])
        
        # Progressive synthesis
        for i, (style_mod, to_rgb) in enumerate(zip(self.style_mods, self.to_rgb_layers)):
            # Apply style modulation
            x = style_mod(x, w)
            
            # Upsample (except first layer)
            if i > 0:
                x = tf.image.resize(x, [x.shape[1] * 2, x.shape[2] * 2])
            
            # Convert to RGB if this is the output layer
            if i == len(self.style_mods) - 1:
                rgb = to_rgb(x)
                return tf.nn.tanh(rgb)
        
        return x

class StyleModulation(tf.keras.layers.Layer):
    """Style modulation layer for StyleGAN"""
    
    def __init__(self, channels, **kwargs):
        super().__init__(**kwargs)
        self.channels = channels
        
        # Convolutional layer
        self.conv = layers.Conv2D(channels, 3, padding='same')
        
        # Style transformation
        self.style_transform = layers.Dense(channels * 2)  # For scale and bias
        
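        # Noise injection (learned strength, registered via add_weight so
        # that Keras tracks it as a trainable weight)
        self.noise_strength = self.add_weight(
            shape=(), initializer='zeros', trainable=True, name='noise_strength'
        )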
        
    def call(self, x, style_code):
        batch_size = tf.shape(x)[0]
        height, width = tf.shape(x)[1], tf.shape(x)[2]
        
        # Apply convolution
        x = self.conv(x)
        
        # Add noise
        noise = tf.random.normal([batch_size, height, width, 1])
        x = x + self.noise_strength * noise
        
        # Style modulation (AdaIN)
        style_params = self.style_transform(style_code)
        style_scale, style_bias = tf.split(style_params, 2, axis=-1)
        
        # Normalize features
        x_mean, x_var = tf.nn.moments(x, axes=[1, 2], keepdims=True)
        x_normalized = (x - x_mean) / tf.sqrt(x_var + 1e-8)
        
        # Apply style
        style_scale = tf.reshape(style_scale, [-1, 1, 1, self.channels])
        style_bias = tf.reshape(style_bias, [-1, 1, 1, self.channels])
        
        return style_scale * x_normalized + style_bias

# Progressive GAN implementation
class ProgressiveGAN:
    """Progressive GAN with growing architecture"""
    
    def __init__(self, latent_dim=512, max_resolution=64):
        self.latent_dim = latent_dim
        self.max_resolution = max_resolution
        self.current_resolution = 4
        self.generators = {}
        self.discriminators = {}
        self.build_networks()
    
    def build_networks(self):
        """Build progressive networks for different resolutions"""
        
        resolutions = [4, 8, 16, 32, 64, 128, 256]
        
        for res in resolutions:
            if res <= self.max_resolution:
                self.generators[res] = self.build_generator(res)
                self.discriminators[res] = self.build_discriminator(res)
    
    def build_generator(self, resolution):
        """Build generator for specific resolution"""
        
        model = tf.keras.Sequential(name=f'generator_{resolution}x{resolution}')
        
        # Number of 2x upsampling layers needed to go from the 4x4 base
        # to the target resolution (0 for 4x4, 1 for 8x8, ...)
        num_layers = int(np.log2(resolution)) - 2
        
        # Initial dense layer
        model.add(layers.Dense(4 * 4 * 512, input_shape=(self.latent_dim,)))
        model.add(layers.Reshape((4, 4, 512)))
        model.add(layers.BatchNormalization())
        model.add(layers.LeakyReLU(alpha=0.2))
        
        # Progressive upsampling
        channels = 512
        for i in range(num_layers):
            channels = max(16, channels // 2)
            
            model.add(layers.Conv2DTranspose(
                channels, 4, strides=2, padding='same'))
            model.add(layers.BatchNormalization())
            model.add(layers.LeakyReLU(alpha=0.2))
        
        # Output layer
        model.add(layers.Conv2D(3, 3, padding='same', activation='tanh'))
        
        return model
    
    def build_discriminator(self, resolution):
        """Build discriminator for specific resolution"""
        
        model = tf.keras.Sequential(name=f'discriminator_{resolution}x{resolution}')
        
        # Input layer
        model.add(layers.Conv2D(32, 3, padding='same', 
                               input_shape=(resolution, resolution, 3)))
        model.add(layers.LeakyReLU(alpha=0.2))
        
        # Progressive downsampling
        channels = 32
        current_res = resolution
        
        while current_res > 4:
            channels = min(512, channels * 2)
            model.add(layers.Conv2D(channels, 4, strides=2, padding='same'))
            model.add(layers.BatchNormalization())
            model.add(layers.LeakyReLU(alpha=0.2))
            model.add(layers.Dropout(0.3))
            current_res //= 2
        
        # Final layers
        model.add(layers.Flatten())
        model.add(layers.Dense(1))
        
        return model
    
    def grow_network(self, new_resolution):
        """Grow network to new resolution"""
        
        if new_resolution in self.generators:
            self.current_resolution = new_resolution
            print(f"Switched to {new_resolution}x{new_resolution} resolution")
            return True
        return False

# Test StyleGAN components
print("=== StyleGAN Implementation Test ===")

# Create StyleGAN generator
# The output side length is 4 * 2**(num_layers - 1), so 5 layers give 64x64
stylegan_gen = StyleGANGenerator(latent_dim=512, num_layers=5, img_size=64)

# Test with random latent codes
test_latents = tf.random.normal((4, 512))
stylegan_outputs = stylegan_gen(test_latents, truncation_psi=0.7)

print(f"StyleGAN Generator:")
print(f"Input latent shape: {test_latents.shape}")
print(f"Output image shape: {stylegan_outputs.shape}")
print(f"Parameters: {stylegan_gen.count_params():,}")

# Visualize StyleGAN outputs
def visualize_stylegan_outputs(generator, n_samples=8):
    """Visualize StyleGAN generated images"""
    
    latent_codes = tf.random.normal((n_samples, generator.latent_dim))
    generated_images = generator(latent_codes, truncation_psi=0.7)
    
    # Rescale from [-1, 1] to [0, 1]
    generated_images = (generated_images + 1.0) / 2.0
    
    plt.figure(figsize=(16, 8))
    for i in range(n_samples):
        plt.subplot(2, 4, i + 1)
        plt.imshow(generated_images[i])
        plt.title(f'Generated {i+1}')
        plt.axis('off')
    
    plt.suptitle('StyleGAN Generated Images')
    plt.tight_layout()
    plt.show()

visualize_stylegan_outputs(stylegan_gen, n_samples=8)

# Test Progressive GAN
print("\n=== Progressive GAN Test ===")

progressive_gan = ProgressiveGAN(latent_dim=512, max_resolution=32)

print("Progressive GAN Architecture:")
for res in [4, 8, 16, 32]:
    if res in progressive_gan.generators:
        gen_params = progressive_gan.generators[res].count_params()
        disc_params = progressive_gan.discriminators[res].count_params()
        print(f"  {res}x{res}: Generator {gen_params:,}, Discriminator {disc_params:,}")

# Test generation at different resolutions
test_latents = tf.random.normal((4, 512))

for resolution in [4, 8, 16, 32]:
    if resolution in progressive_gan.generators:
        gen_images = progressive_gan.generators[resolution](test_latents)
        print(f"Resolution {resolution}x{resolution}: Output shape {gen_images.shape}")
```
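
The core of `StyleModulation` is adaptive instance normalization (AdaIN): each feature map is normalized per sample and per channel, then rescaled and shifted by style-dependent parameters. The NumPy sketch below checks the defining property, namely that after modulation the per-channel mean equals the style bias and the per-channel standard deviation equals the style scale. The function name `adain` and the random inputs are hypothetical; in the layer above, the scale and bias come from `style_transform(style_code)`.

```python
import numpy as np

def adain(x, scale, bias, eps=1e-8):
    """Adaptive instance normalization.

    x:     feature maps, shape (batch, height, width, channels)
    scale: per-sample, per-channel scale, shape (batch, channels)
    bias:  per-sample, per-channel shift, shape (batch, channels)
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / np.sqrt(var + eps)
    return scale[:, None, None, :] * x_norm + bias[:, None, None, :]

rng = np.random.default_rng(0)
x = rng.normal(3.0, 5.0, size=(2, 8, 8, 4))   # arbitrary input statistics
scale = rng.uniform(0.5, 2.0, size=(2, 4))    # positive style scales
bias = rng.normal(size=(2, 4))                # style biases

y = adain(x, scale, bias)
print("means match bias:  ", np.allclose(y.mean(axis=(1, 2)), bias))
print("stds match scale:  ", np.allclose(y.std(axis=(1, 2)), scale))
```

Whatever statistics the incoming features had, the output statistics are set entirely by the style parameters, which is how the intermediate latent `w` controls each synthesis layer.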

## 4. CycleGAN Implementation

```python
# CycleGAN for unpaired image-to-image translation
class CycleGAN:
    """CycleGAN for unpaired image-to-image translation"""
    
    def __init__(self, img_shape=(64, 64, 3)):
        self.img_shape = img_shape
        
        # Build generators and discriminators
        self.G_AB = self.build_generator(name='G_AB')  # A to B
        self.G_BA = self.build_generator(name='G_BA')  # B to A
        self.D_A = self.build_discriminator(name='D_A')
        self.D_B = self.build_discriminator(name='D_B')
        
        # Optimizers
        self.g_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
        self.d_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
        
        # Loss weights
        self.lambda_cycle = 10.0
        self.lambda_identity = 0.5
    
    def build_generator(self, name):
        """Build U-Net style generator"""
        
        inputs = layers.Input(shape=self.img_shape)
        
        # Encoder
        e1 = layers.Conv2D(64, 4, strides=2, padding='same')(inputs)
        e1 = layers.LeakyReLU(alpha=0.2)(e1)
        
        e2 = layers.Conv2D(128, 4, strides=2, padding='same')(e1)
        e2 = layers.BatchNormalization()(e2)
        e2 = layers.LeakyReLU(alpha=0.2)(e2)
        
        e3 = layers.Conv2D(256, 4, strides=2, padding='same')(e2)
        e3 = layers.BatchNormalization()(e3)
        e3 = layers.LeakyReLU(alpha=0.2)(e3)
        
        e4 = layers.Conv2D(512, 4, strides=2, padding='same')(e3)
        e4 = layers.BatchNormalization()(e4)
        e4 = layers.LeakyReLU(alpha=0.2)(e4)
        
        # Bottleneck with residual blocks
        bottleneck = e4
        for _ in range(6):
            bottleneck = self.residual_block(bottleneck, 512)
        
        # Decoder with skip connections
        d4 = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(bottleneck)
        d4 = layers.BatchNormalization()(d4)
        d4 = layers.ReLU()(d4)
        d4 = layers.Concatenate()([d4, e3])
        
        d3 = layers.Conv2DTranspose(128, 4, strides=2, padding='same')(d4)
        d3 = layers.BatchNormalization()(d3)
        d3 = layers.ReLU()(d3)
        d3 = layers.Concatenate()([d3, e2])
        
        d2 = layers.Conv2DTranspose(64, 4, strides=2, padding='same')(d3)
        d2 = layers.BatchNormalization()(d2)
        d2 = layers.ReLU()(d2)
        d2 = layers.Concatenate()([d2, e1])
        
        outputs = layers.Conv2DTranspose(3, 4, strides=2, padding='same', activation='tanh')(d2)
        
        return tf.keras.Model(inputs, outputs, name=name)
    
    def residual_block(self, x, filters):
        """Residual block for generator"""
        
        shortcut = x
        
        x = layers.Conv2D(filters, 3, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
        
        x = layers.Conv2D(filters, 3, padding='same')(x)
        x = layers.BatchNormalization()(x)
        
        return layers.Add()([shortcut, x])
    
    def build_discriminator(self, name):
        """Build PatchGAN discriminator"""
        
        inputs = layers.Input(shape=self.img_shape)
        
        x = layers.Conv2D(64, 4, strides=2, padding='same')(inputs)
        x = layers.LeakyReLU(alpha=0.2)(x)
        
        x = layers.Conv2D(128, 4, strides=2, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(alpha=0.2)(x)
        
        x = layers.Conv2D(256, 4, strides=2, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(alpha=0.2)(x)
        
        x = layers.Conv2D(512, 4, strides=1, padding='same')(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(alpha=0.2)(x)
        
        outputs = layers.Conv2D(1, 4, strides=1, padding='same')(x)
        
        return tf.keras.Model(inputs, outputs, name=name)
    
    def generator_loss(self, fake_output):
        """Adversarial loss for generator"""
        return tf.keras.losses.binary_crossentropy(
            tf.ones_like(fake_output), fake_output, from_logits=True
        )
    
    def discriminator_loss(self, real_output, fake_output):
        """Adversarial loss for discriminator"""
        real_loss = tf.keras.losses.binary_crossentropy(
            tf.ones_like(real_output), real_output, from_logits=True
        )
        fake_loss = tf.keras.losses.binary_crossentropy(
            tf.zeros_like(fake_output), fake_output, from_logits=True
        )
        return real_loss + fake_loss
    
    def cycle_consistency_loss(self, real_image, reconstructed_image):
        """Cycle consistency loss"""
        return tf.reduce_mean(tf.abs(real_image - reconstructed_image))
    
    def identity_loss(self, real_image, same_image):
        """Identity loss"""
        return tf.reduce_mean(tf.abs(real_image - same_image))
    
    @tf.function
    def train_step(self, real_A, real_B):
        """Training step for CycleGAN"""
        
        with tf.GradientTape(persistent=True) as tape:
            # Forward cycle: A -> B -> A
            fake_B = self.G_AB(real_A, training=True)
            reconstructed_A = self.G_BA(fake_B, training=True)
            
            # Backward cycle: B -> A -> B
            fake_A = self.G_BA(real_B, training=True)
            reconstructed_B = self.G_AB(fake_A, training=True)
            
            # Identity mapping
            same_A = self.G_BA(real_A, training=True)
            same_B = self.G_AB(real_B, training=True)
            
            # Discriminator outputs
            disc_real_A = self.D_A(real_A, training=True)
            disc_fake_A = self.D_A(fake_A, training=True)
            
            disc_real_B = self.D_B(real_B, training=True)
            disc_fake_B = self.D_B(fake_B, training=True)
            
            # Generator losses
            gen_AB_loss = tf.reduce_mean(self.generator_loss(disc_fake_B))
            gen_BA_loss = tf.reduce_mean(self.generator_loss(disc_fake_A))
            
            # Cycle consistency losses
            cycle_A_loss = self.cycle_consistency_loss(real_A, reconstructed_A)
            cycle_B_loss = self.cycle_consistency_loss(real_B, reconstructed_B)
            
            # Identity losses
            identity_A_loss = self.identity_loss(real_A, same_A)
            identity_B_loss = self.identity_loss(real_B, same_B)
            
            # Total generator loss
            total_gen_loss = (gen_AB_loss + gen_BA_loss + 
                            self.lambda_cycle * (cycle_A_loss + cycle_B_loss) + 
                            self.lambda_identity * (identity_A_loss + identity_B_loss))
            
            # Discriminator losses
            disc_A_loss = tf.reduce_mean(self.discriminator_loss(disc_real_A, disc_fake_A))
            disc_B_loss = tf.reduce_mean(self.discriminator_loss(disc_real_B, disc_fake_B))
        
        # Calculate gradients. Each optimizer gets one fixed variable list, because
        # Keras 3 optimizers raise an error for variables they were not built with.
        gen_vars = self.G_AB.trainable_variables + self.G_BA.trainable_variables
        disc_vars = self.D_A.trainable_variables + self.D_B.trainable_variables
        
        gen_grads = tape.gradient(total_gen_loss, gen_vars)
        # Each discriminator loss depends only on its own discriminator's weights,
        # so differentiating the sum matches two separate gradient calls
        disc_grads = tape.gradient(disc_A_loss + disc_B_loss, disc_vars)
        del tape  # release the persistent tape
        
        # Apply gradients
        self.g_optimizer.apply_gradients(zip(gen_grads, gen_vars))
        self.d_optimizer.apply_gradients(zip(disc_grads, disc_vars))
        
        return {
            'gen_loss': total_gen_loss,
            'disc_A_loss': disc_A_loss,
            'disc_B_loss': disc_B_loss,
            'cycle_loss': cycle_A_loss + cycle_B_loss
        }

# Test CycleGAN
print("=== CycleGAN Implementation Test ===")

cycle_gan = CycleGAN(img_shape=(64, 64, 3))

print("CycleGAN Architecture:")
print(f"Generator AB: {cycle_gan.G_AB.count_params():,} parameters")
print(f"Generator BA: {cycle_gan.G_BA.count_params():,} parameters")
print(f"Discriminator A: {cycle_gan.D_A.count_params():,} parameters")
print(f"Discriminator B: {cycle_gan.D_B.count_params():,} parameters")

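# Create synthetic test data (normally you'd use two different image domains).
# Values are drawn from [-1, 1], the range normalize_for_display assumes below.
test_A = tf.random.uniform((4, 64, 64, 3), minval=-1.0, maxval=1.0)
test_B = tf.random.uniform((4, 64, 64, 3), minval=-1.0, maxval=1.0)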

# Test forward pass
fake_B = cycle_gan.G_AB(test_A)
fake_A = cycle_gan.G_BA(test_B)

print("\nCycleGAN Forward Pass:")
print(f"Real A shape: {test_A.shape}")
print(f"Fake B shape: {fake_B.shape}")
print(f"Real B shape: {test_B.shape}")
print(f"Fake A shape: {fake_A.shape}")

# Visualize CycleGAN translations
def visualize_cycle_translations(cycle_gan, real_A, real_B, n_samples=4):
    """Visualize CycleGAN image translations"""
    
    real_A_batch = real_A[:n_samples]
    real_B_batch = real_B[:n_samples]
    
    # Generate translations
    fake_B = cycle_gan.G_AB(real_A_batch)
    fake_A = cycle_gan.G_BA(real_B_batch)
    
    # Reconstruct
    reconstructed_A = cycle_gan.G_BA(fake_B)
    reconstructed_B = cycle_gan.G_AB(fake_A)
    
    # Map [-1, 1] to [0, 1] for imshow and clip any values outside that range
    def normalize_for_display(images):
        return tf.clip_by_value((images + 1.0) / 2.0, 0.0, 1.0)
    
    real_A_vis = normalize_for_display(real_A_batch)
    real_B_vis = normalize_for_display(real_B_batch)
    fake_A_vis = normalize_for_display(fake_A)
    fake_B_vis = normalize_for_display(fake_B)
    reconstructed_A_vis = normalize_for_display(reconstructed_A)
    reconstructed_B_vis = normalize_for_display(reconstructed_B)
    
    plt.figure(figsize=(18, 12))
    
    for i in range(n_samples):
        # A -> B -> A cycle
        plt.subplot(n_samples, 6, i * 6 + 1)
        plt.imshow(real_A_vis[i])
        plt.title('Real A')
        plt.axis('off')
        
        plt.subplot(n_samples, 6, i * 6 + 2)
        plt.imshow(fake_B_vis[i])
        plt.title('Fake B')
        plt.axis('off')
        
        plt.subplot(n_samples, 6, i * 6 + 3)
        plt.imshow(reconstructed_A_vis[i])
        plt.title('Reconstructed A')
        plt.axis('off')
        
        # B -> A -> B cycle
        plt.subplot(n_samples, 6, i * 6 + 4)
        plt.imshow(real_B_vis[i])
        plt.title('Real B')
        plt.axis('off')
        
        plt.subplot(n_samples, 6, i * 6 + 5)
        plt.imshow(fake_A_vis[i])
        plt.title('Fake A')
        plt.axis('off')
        
        plt.subplot(n_samples, 6, i * 6 + 6)
        plt.imshow(reconstructed_B_vis[i])
        plt.title('Reconstructed B')
        plt.axis('off')
    
    plt.suptitle('CycleGAN Image Translation Results')
    plt.tight_layout()
    plt.show()

# Test visualization with synthetic data
visualize_cycle_translations(cycle_gan, test_A, test_B, n_samples=2)
```
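
The test above feeds random tensors. For real training, each domain is usually loaded separately and the two streams are zipped, so batches stay unpaired. The following is a minimal sketch under stated assumptions: images arrive as uint8 arrays, the model expects the [-1, 1] range that `normalize_for_display` implies, and `to_model_range` and `make_unpaired_dataset` are hypothetical helper names. The random arrays stand in for real image sets.

```python
import tensorflow as tf

def to_model_range(image_uint8):
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1]."""
    return tf.cast(image_uint8, tf.float32) / 127.5 - 1.0

def make_unpaired_dataset(images_A, images_B, batch_size=4, seed=42):
    """Shuffle each domain independently, then zip the two streams."""
    def build(images, shuffle_seed):
        return (tf.data.Dataset.from_tensor_slices(images)
                .map(to_model_range)
                .shuffle(images.shape[0], seed=shuffle_seed)
                .batch(batch_size, drop_remainder=True))
    return tf.data.Dataset.zip((build(images_A, seed), build(images_B, seed + 1)))

# Stand-in data: 8 random uint8 "images" per domain (replace with real image sets)
images_A = tf.cast(tf.random.uniform((8, 64, 64, 3), maxval=256, dtype=tf.int32), tf.uint8)
images_B = tf.cast(tf.random.uniform((8, 64, 64, 3), maxval=256, dtype=tf.int32), tf.uint8)

dataset = make_unpaired_dataset(images_A, images_B)
real_A, real_B = next(iter(dataset))
print(real_A.shape, real_B.shape)
```

Each `(real_A, real_B)` batch from `dataset` can then be passed to `cycle_gan.train_step(real_A, real_B)`.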

## Summary

This notebook implemented four families of generative models with tf.keras:

### Key Implementations

**1. Variational Autoencoders (VAEs):**
- Reparameterization trick for differentiable sampling
- β-VAE for disentangled representations
- Convolutional VAE architectures
- Latent space analysis and manipulation tools
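
As a compact recap of the first two bullets, the sketch below shows the reparameterization step and the β-weighted KL term in isolation. It is an illustration, not the notebook's `VAE` class: the function names, the placeholder reconstruction term, and `beta = 4.0` are assumptions chosen for the example.

```python
import tensorflow as tf

def reparameterize(z_mean, z_log_var):
    """Sample z = mean + sigma * eps with eps ~ N(0, I), keeping z differentiable."""
    eps = tf.random.normal(tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * eps

def kl_to_standard_normal(z_mean, z_log_var):
    """KL(N(mean, sigma^2) || N(0, I)): summed over latent dims, averaged over the batch."""
    kl_per_sample = -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
    return tf.reduce_mean(kl_per_sample)

beta = 4.0  # hypothetical beta-VAE weight; beta = 1.0 recovers the standard VAE
z_mean = tf.Variable(tf.zeros((4, 2)))
z_log_var = tf.Variable(tf.zeros((4, 2)))

with tf.GradientTape() as tape:
    z = reparameterize(z_mean, z_log_var)
    # Placeholder "reconstruction" term so the example has something to differentiate
    reconstruction = tf.reduce_mean(tf.square(z - 1.0))
    loss = reconstruction + beta * kl_to_standard_normal(z_mean, z_log_var)

# Gradients reach both distribution parameters because the noise is an input, not a node to sample through
grads = tape.gradient(loss, [z_mean, z_log_var])
print([g.shape for g in grads])
```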

**2. StyleGAN Architecture:**
- Mapping network and style modulation
- AdaIN (Adaptive Instance Normalization)
- Progressive synthesis with learned features
- Truncation trick for sample quality
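
The AdaIN and truncation bullets both reduce to a few lines of arithmetic. The sketch below assumes channels-last feature maps and per-sample style vectors of shape `(batch, channels)`. The names `adain` and `truncate` and the value `psi=0.7` are illustrative and may differ from the notebook's own implementation.

```python
import tensorflow as tf

def adain(features, style_scale, style_bias, eps=1e-5):
    """Normalize each channel per sample over H and W, then apply style scale and bias."""
    mean, variance = tf.nn.moments(features, axes=[1, 2], keepdims=True)
    normalized = (features - mean) / tf.sqrt(variance + eps)
    # Broadcast (batch, channels) style vectors over the spatial dimensions
    return style_scale[:, None, None, :] * normalized + style_bias[:, None, None, :]

def truncate(w, w_avg, psi=0.7):
    """Truncation trick: pull w toward the average latent w_avg by a factor psi."""
    return w_avg + psi * (w - w_avg)

features = tf.random.normal((2, 8, 8, 16))
style_scale = tf.fill((2, 16), 2.0)
style_bias = tf.fill((2, 16), 0.5)
styled = adain(features, style_scale, style_bias)

w = tf.random.normal((2, 32))
w_avg = tf.zeros((32,))  # in practice, a running average of mapped latents
w_truncated = truncate(w, w_avg, psi=0.7)
print(styled.shape, w_truncated.shape)
```

After `adain`, each channel's spatial mean and standard deviation match the style bias and scale, which is how the style vector controls the feature statistics. Smaller `psi` trades sample diversity for fidelity.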

**3. Progressive GAN:**
- Growing network architecture during training
- Multi-resolution training strategies
- Stable training for high-resolution generation
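
The growing step is commonly implemented as a fade-in: while a new higher-resolution block is being introduced, its output is blended with an upsampled copy of the previous resolution's output. A minimal sketch, assuming nearest-neighbour upsampling and a hypothetical helper name `fade_in`:

```python
import tensorflow as tf

def fade_in(old_rgb_lowres, new_rgb_highres, alpha):
    """Blend the upsampled old output with the new block's output; alpha ramps from 0 to 1."""
    # UpSampling2D uses nearest-neighbour interpolation by default
    upsampled = tf.keras.layers.UpSampling2D(size=2)(old_rgb_lowres)
    return alpha * new_rgb_highres + (1.0 - alpha) * upsampled

old_rgb = tf.random.normal((2, 16, 16, 3))  # output of the existing 16x16 stage
new_rgb = tf.random.normal((2, 32, 32, 3))  # output of the newly added 32x32 block
blended = fade_in(old_rgb, new_rgb, alpha=0.25)
print(blended.shape)
```

With `alpha=0` the network behaves like the old low-resolution model, and with `alpha=1` the new block has fully taken over.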

**4. CycleGAN:**
- Unpaired image-to-image translation
- Cycle consistency and identity losses
- U-Net generator with skip connections
- PatchGAN discriminator architecture
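
A PatchGAN discriminator returns a grid of logits rather than a single score, which is why the adversarial losses build their targets with `tf.ones_like` and `tf.zeros_like`. The sketch below uses a deliberately small, hypothetical layer stack (`build_patch_discriminator` is not the notebook's discriminator) to show the output shape and the loss pattern.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_patch_discriminator(img_shape=(64, 64, 3)):
    """Three stride-2 convolutions shrink 64x64 to 8x8; the last conv emits one logit per patch."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=img_shape),
        layers.Conv2D(64, 4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Conv2D(128, 4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Conv2D(256, 4, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Conv2D(1, 4, strides=1, padding="same"),  # raw logits, no sigmoid
    ])

disc = build_patch_discriminator()
patch_logits = disc(tf.random.normal((2, 64, 64, 3)))

# Same pattern as generator_loss: targets are ones with the shape of the logit grid
gen_loss = tf.reduce_mean(
    tf.keras.losses.binary_crossentropy(
        tf.ones_like(patch_logits), patch_logits, from_logits=True))
print(patch_logits.shape)
```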

### Technical Achievements

- **Latent Space Control**: The VAE's regularized latent space supports smooth interpolation and latent-direction edits
- **High-Quality Generation**: StyleGAN-style synthesis can reach photorealistic quality when trained at scale
- **Stable Training**: Progressive growing stabilizes training as resolution increases
- **Domain Transfer**: CycleGAN translates between domains without paired examples
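
Latent interpolation, mentioned in the first bullet, amounts to decoding points on a line between two latent codes. A minimal sketch with a hypothetical helper `interpolate_latents`; the decoder call is left as a comment because the decoder attribute name is an assumption.

```python
import tensorflow as tf

def interpolate_latents(z_start, z_end, n_steps=5):
    """Return n_steps latent vectors evenly spaced on the line from z_start to z_end."""
    alphas = tf.linspace(0.0, 1.0, n_steps)[:, None]  # shape (n_steps, 1)
    return (1.0 - alphas) * z_start[None, :] + alphas * z_end[None, :]

z_start = tf.constant([-2.0, 0.0])
z_end = tf.constant([2.0, 1.0])
path = interpolate_latents(z_start, z_end, n_steps=5)
print(path.shape)

# images = vae.decoder(path)  # hypothetical: decode each point to view the transition
```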

### Advanced Techniques Demonstrated

- **Disentanglement**: β-VAE separates semantic factors
- **Style Transfer**: StyleGAN modulates style at multiple scales
- **Multi-scale Training**: Progressive GAN grows complexity gradually
- **Adversarial Training**: CycleGAN trains one discriminator per domain alongside two generators

### Performance Insights

- **VAE**: Fast inference, smooth latent space, good for representation learning
- **StyleGAN**: High image quality with per-layer style control, at a high training cost
- **Progressive GAN**: Stable training, scalable to high resolutions
- **CycleGAN**: No paired data required, preserves content structure

### Applications Covered

- Image reconstruction and generation
- Latent space exploration and editing
- Style transfer and domain adaptation
- Unpaired image translation tasks

### Next Steps

Continue to notebook 15 (Diffusion Models with tf.keras), which implements DDPM, DDIM, and other diffusion-based approaches to image generation.

The models covered here are useful for creative applications, data augmentation, and studying the structure of high-dimensional data distributions.