# GANs — Theory & Practical

This Colab-ready notebook contains both **theory** (concise, student-style answers) and **practical** Keras/TensorFlow examples. Code cells are runnable in Google Colab. I kept training epochs small so cells run quickly for demonstration.

## Theory — GAN Questions and Short Answers


**Q1. What does GAN stand for, and what is its main purpose?**

GAN stands for **Generative Adversarial Network**. Its main purpose is to learn a generator that can produce new data samples (e.g., images) that are similar to real data, by training a generator and a discriminator in an adversarial setup.



**Q2. Explain the concept of the 'discriminator' in GANs.**

The discriminator is a neural network trained to distinguish real samples from fake (generated) samples. It outputs a probability (or score) indicating whether an input is real. The discriminator provides feedback to the generator via gradients.



**Q3. How does a GAN work?**

A GAN has two networks: a generator G that maps random noise z -> x_fake, and a discriminator D that scores inputs. Training alternates: D learns to classify real vs fake, and G learns to produce fakes that fool D. This adversarial game improves generator samples over time.



**Q4. What is the generator's role in a GAN?**

The generator creates synthetic samples from noise (and possibly conditional inputs). Its goal is to produce outputs that are indistinguishable from real data according to the discriminator.



**Q5. What is the loss function used in the training of GANs?**

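The original GAN uses a minimax loss: D maximizes log(D(x)) + log(1 - D(G(z))), averaged over real samples x and noise z, while G minimizes log(1 - D(G(z))). In practice, G usually maximizes log(D(G(z))) instead (the non-saturating loss), because log(1 - D(G(z))) gives very small gradients when D confidently rejects the fakes early in training.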
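The difference between the two generator objectives shows up in their gradients. The following is a minimal sketch in plain Python (no TensorFlow needed), assuming the discriminator produces a logit that is passed through a sigmoid. The helper names `discriminator_loss`, `saturating_g_grad`, and `non_saturating_g_grad` are illustrative and not part of any library.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def discriminator_loss(d_real, d_fake):
    # D maximizes log D(x) + log(1 - D(G(z))); as a loss we minimize the negative
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def saturating_g_grad(logit):
    # d/dlogit of log(1 - sigmoid(logit)) = -sigmoid(logit)
    return -sigmoid(logit)

def non_saturating_g_grad(logit):
    # d/dlogit of -log(sigmoid(logit)) = sigmoid(logit) - 1
    return sigmoid(logit) - 1.0

# logit is the discriminator's raw score for a generated sample
for logit in (-6.0, 0.0, 6.0):
    print(logit,
          round(saturating_g_grad(logit), 4),
          round(non_saturating_g_grad(logit), 4))
```

When the discriminator confidently rejects a fake (large negative logit), the saturating gradient is close to zero while the non-saturating gradient stays close to -1, which is why the latter is usually preferred early in training.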



**Q6. What is the difference between a WGAN and a traditional GAN?**

WGAN (Wasserstein GAN) replaces the discriminator with a critic that estimates the Wasserstein distance between real and generated distributions. WGAN uses a different loss (Wasserstein loss) and enforces Lipschitz continuity (originally via weight clipping, later via gradient penalty) to improve training stability.
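As a rough illustration of the two ingredients named above, the sketch below shows a Wasserstein critic loss on unbounded scores and weight clipping on a flat list of weights. It is plain Python under simplifying assumptions: the function names are hypothetical, the weights are a toy list rather than a real network, and the clip value 0.01 is the one commonly quoted for the original WGAN recipe.

```python
def clip_weights(weights, c=0.01):
    """Clamp each weight into [-c, c] (crude Lipschitz enforcement)."""
    return [max(-c, min(c, w)) for w in weights]

def wasserstein_critic_loss(real_scores, fake_scores):
    """Critic minimizes mean(fake) - mean(real); scores are raw, no sigmoid."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real

weights = [0.5, -0.003, -2.0, 0.01]   # toy critic weights after an update
clipped = clip_weights(weights)
print(clipped)

print(wasserstein_critic_loss(real_scores=[1.2, 0.8], fake_scores=[-0.5, 0.1]))
```

In a Keras model the same clipping would be applied to every trainable variable of the critic after each optimizer step.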



**Q7. How does the training of the generator differ from that of the discriminator?**

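Training alternates: the discriminator (or critic) is trained for one or more steps on real and generated batches to improve its classification/scoring, with the generator held fixed; then the generator is trained, with the discriminator held fixed, using gradients that flow back through the discriminator. Hyperparameters such as the number of critic steps per generator step or the learning rate may differ between the two.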
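The alternating schedule can be sketched without any deep-learning library. In the snippet below, `critic_step` and `generator_step` are hypothetical callables standing in for real update functions, and `n_critic` is the number of critic updates per generator update (5 is a common choice for WGAN-style training, 1 for a standard GAN).

```python
def train_alternating(num_iterations, critic_step, generator_step, n_critic=5):
    """Run n_critic critic updates, then one generator update, per iteration."""
    for _ in range(num_iterations):
        for _ in range(n_critic):
            critic_step()       # update D/critic, generator frozen
        generator_step()        # update G, D/critic frozen

# Stub update functions that only record the order of calls
calls = []
train_alternating(
    num_iterations=2,
    critic_step=lambda: calls.append("critic"),
    generator_step=lambda: calls.append("generator"),
    n_critic=3,
)
print(calls)
```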



**Q8. What is a DCGAN, and how is it different from a traditional GAN?**

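DCGAN (Deep Convolutional GAN) uses strided convolutional layers in the discriminator and transposed-convolutional layers in the generator, together with architectural best practices (BatchNorm, ReLU in the generator, LeakyReLU in the discriminator, no pooling layers) tailored for stable image generation. The original GAN experiments, by contrast, relied mostly on fully connected networks.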
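When stacking strided layers it helps to trace the spatial size by hand. The sketch below assumes the size rules that Keras applies with `padding='same'` (a strided convolution gives `ceil(size / stride)`, a transposed convolution gives `size * stride`) and a generator that starts from a 7x7 feature map and must reach 28x28 MNIST-sized images. The helper names are illustrative.

```python
import math

def conv_same_out(size, stride):
    """Spatial size after a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def conv_transpose_same_out(size, stride):
    """Spatial size after a transposed convolution with 'same' padding."""
    return size * stride

def trace(size, strides, layer_fn):
    """Return the list of spatial sizes after each layer."""
    sizes = [size]
    for stride in strides:
        size = layer_fn(size, stride)
        sizes.append(size)
    return sizes

print(trace(7, [2, 2], conv_transpose_same_out))  # [7, 14, 28]  generator path
print(trace(28, [2, 2], conv_same_out))           # [28, 14, 7]  discriminator path
```

A generator whose strides do not multiply out to the target size will produce images the discriminator cannot accept, so this check is worth doing before training.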



**Q9. Explain the concept of 'controllable generation' in the context of GANs.**

Controllable generation means conditioning the generator on external inputs (labels, attributes, or latent codes) so you can control aspects of generated samples — e.g., class-conditional GANs, style codes in StyleGAN.
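One simple way to condition a generator on a class label is to append a one-hot label vector to the noise vector. The sketch below is plain Python and only builds that input; the names `one_hot` and `conditional_input` and the sizes (100 noise dimensions, 10 classes) are assumptions chosen to match an MNIST-style setup.

```python
import random

def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_input(label, num_classes=10, latent_dim=100, rng=None):
    """Concatenate Gaussian noise with a one-hot label to form the generator input."""
    rng = rng or random.Random()
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z + one_hot(label, num_classes)

g_input = conditional_input(label=3, rng=random.Random(0))
print(len(g_input))        # 110 values: 100 noise + 10 label
print(g_input[100:])       # the one-hot part selects class 3
```

In a conditional GAN the discriminator typically receives the same label, so it can penalize samples that do not match the requested class.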



**Q10. What is the primary goal of training a GAN?**

The primary goal is to train a generator whose output distribution matches the real data distribution closely, so generated samples are realistic and diverse.



**Q11. What are the limitations of GANs?**

GANs can be unstable to train, suffer from mode collapse (lack of diversity), and require careful architecture and hyperparameter tuning. Evaluation is also not straightforward: metrics such as FID and Inception Score (IS) are only proxies for sample quality and diversity.



**Q12. What are StyleGANs, and what makes them unique?**

The StyleGAN family introduces a style-based generator architecture in which a learned style vector is injected at multiple levels of the generator (via adaptive instance normalization, AdaIN, in the original StyleGAN). This yields high-quality image synthesis with more controllable and better disentangled attributes.
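Adaptive instance normalization itself is a small operation: normalize each channel of a feature map to zero mean and unit variance, then rescale and shift it with values derived from the style vector. The sketch below applies it to a single flattened channel in plain Python; `adain_channel` is an illustrative name, and in a real model `scale` and `bias` would come from a learned layer rather than being passed in by hand.

```python
import math

def adain_channel(values, scale, bias, eps=1e-5):
    """Normalize one channel, then apply a style-dependent scale and bias."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [scale * (v - mean) / std + bias for v in values]

channel = [1.0, 2.0, 3.0, 4.0]          # one channel of a tiny feature map
styled = adain_channel(channel, scale=2.0, bias=5.0)

new_mean = sum(styled) / len(styled)
new_std = math.sqrt(sum((v - new_mean) ** 2 for v in styled) / len(styled))
print(round(new_mean, 3), round(new_std, 3))
```

After the operation the channel statistics are set by the style (mean close to `bias`, standard deviation close to `scale`), regardless of the statistics of the incoming features.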



**Q13. What is the role of noise in a GAN?**

Noise (latent vector z) is the input to the generator and provides randomness to produce varied outputs. Structured latent spaces enable interpolation and semantic manipulations.
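Latent interpolation can be sketched as a straight line between two noise vectors; each point on the path would then be passed through the generator. The code below is plain Python with illustrative names (`lerp`, `interpolation_path`) and assumes a 100-dimensional latent space. Linear interpolation is the simplest choice; spherical interpolation is sometimes preferred for Gaussian latents.

```python
import random

def lerp(z_a, z_b, t):
    """Linear interpolation between two latent vectors, t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

def interpolation_path(z_a, z_b, steps=5):
    """Evenly spaced latent vectors from z_a to z_b (inclusive)."""
    return [lerp(z_a, z_b, k / (steps - 1)) for k in range(steps)]

rng = random.Random(0)
latent_dim = 100
z_a = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]

path = interpolation_path(z_a, z_b, steps=5)
print(len(path), len(path[0]))   # 5 100
```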



**Q14. How does the loss function in a WGAN improve training stability?**

The Wasserstein loss approximates the earth mover's (Wasserstein-1) distance, which correlates better with sample quality and still provides useful gradients when the real and generated distributions barely overlap. When coupled with gradient penalty (WGAN-GP) or another proper Lipschitz constraint, it stabilizes training and tends to reduce mode collapse.
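A one-dimensional toy case illustrates why the earth mover's distance gives a more useful training signal. The sketch below, in plain Python with illustrative helper names, compares the empirical Wasserstein-1 distance with the Jensen-Shannon divergence for a "real" distribution concentrated at 0 and a "fake" distribution concentrated at a shifted point.

```python
import math

def wasserstein_1d(xs, ys):
    """Empirical W1 between two equal-size 1-D samples (sort and match)."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between discrete distributions given as dicts."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log(a[k] / m[k]) for k in a if a[k] > 0.0)

    return 0.5 * kl(p) + 0.5 * kl(q)

real = [0.0] * 4
for shift in (4.0, 2.0, 1.0, 0.5):
    fake = [shift] * 4
    w1 = wasserstein_1d(real, fake)
    js = js_divergence({0.0: 1.0}, {shift: 1.0})
    print(shift, w1, round(js, 4))
```

As the fake distribution moves toward the real one, the Wasserstein distance shrinks in proportion to the shift, while the Jensen-Shannon divergence stays at log 2 until the two distributions overlap, so it carries no information about which direction improves the generator.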



**Q15. Describe the architecture of a typical GAN.**

A typical GAN has: a generator (series of upsampling/transpose-conv layers, batchnorm, activations) and a discriminator (downsampling conv layers, LeakyReLU, and a final sigmoid/linear output). DCGANs follow specific design rules for stability.



**Q16. What challenges do GANs face during training, and how can they be addressed?**

Challenges: instability, mode collapse, vanishing gradients. Remedies: alternative losses (WGAN), gradient penalty, spectral normalization, the two time-scale update rule (separate learning rates for generator and discriminator), architectural choices (BatchNorm/InstanceNorm), and regularization.
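Of the remedies listed, spectral normalization is easy to show in isolation: estimate the largest singular value of a weight matrix by power iteration and divide the weights by it. The sketch below is plain Python on a small nested-list matrix; the function names are illustrative, and running many iterations from a fixed start vector is a simplification (practical implementations usually keep a persistent random vector and run one iteration per training step).

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def spectral_norm(W, iters=50):
    """Estimate the largest singular value of W by power iteration."""
    Wt = transpose(W)
    u = normalize([1.0] * len(W))        # assumed start vector, not orthogonal to the top direction
    for _ in range(iters):
        v = normalize(matvec(Wt, u))
        u = normalize(matvec(W, v))
    return sum(a * b for a, b in zip(u, matvec(W, v)))

W = [[3.0, 0.0],
     [0.0, 1.0]]
sigma = spectral_norm(W)
W_sn = [[w / sigma for w in row] for row in W]   # normalized weights
print(round(sigma, 6), round(spectral_norm(W_sn), 6))
```

After normalization the layer's largest singular value is approximately 1, which bounds how much the layer can stretch its input.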



**Q17. How does DCGAN help improve image generation in GANs?**

DCGAN uses convolutional architectures, BatchNorm, and ReLU/LeakyReLU activations which empirically lead to more stable training and higher-quality images for common image sizes (e.g., 64x64).



**Q18. What are the key differences between a traditional GAN and a StyleGAN?**

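StyleGAN introduces a mapping network that turns the latent z into an intermediate style vector, style injection at each resolution (AdaIN), and per-layer noise inputs for stochastic detail; the original StyleGAN also builds on progressive growing. This yields finer control over synthesis and higher-fidelity outputs than a traditional GAN, which feeds z only into the first layer of the generator.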



**Q19. How does the discriminator decide whether an image is real or fake in a GAN?**

The discriminator processes the input through layers and produces a score (or probability). It learns features that separate real vs fake via supervised training using real and generated samples and a chosen loss function.
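The step from score to probability to loss can be written out in a few lines. The sketch below is plain Python and assumes the discriminator ends in a single linear unit (a logit), as is common when the loss is binary cross-entropy computed from logits. The function names are illustrative.

```python
import math

def sigmoid(logit):
    """Convert a raw discriminator score into P(input is real)."""
    return 1.0 / (1.0 + math.exp(-logit))

def bce_with_logits(label, logit):
    """Binary cross-entropy from a logit, in a numerically stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# label 1.0 = real, label 0.0 = fake
for logit in (-2.0, 0.0, 2.0):
    print(logit,
          round(sigmoid(logit), 4),
          round(bce_with_logits(1.0, logit), 4),   # loss if the input was real
          round(bce_with_logits(0.0, logit), 4))   # loss if the input was fake
```

A positive logit means the discriminator leans toward "real", so the loss is small when the label is 1 and large when the label is 0, and the reverse holds for a negative logit.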



**Q20. What is the main advantage of using GANs in image generation?**

GANs can produce very realistic and high-fidelity images and can learn complex data distributions without explicit probabilistic density modeling.



**Q21. How can GANs be used in real-world applications?**

Applications: image synthesis, image-to-image translation, super-resolution, data augmentation, anomaly detection, style transfer, domain adaptation, and creative content generation.



**Q22. What is Mode Collapse in GANs, and how can it be prevented?**

Mode collapse occurs when the generator produces only a limited variety of outputs (collapsing to a few modes of the data distribution). Prevention: use WGAN/WGAN-GP, minibatch discrimination, feature matching, unrolled GANs, diversity-promoting regularization, and architecture/hyperparameter tuning.
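Feature matching, one of the remedies listed, trains the generator to match the average discriminator features of real data instead of only fooling the final output. The sketch below is plain Python with made-up two-dimensional "features"; `mean_features` and `feature_matching_loss` are illustrative names, and in practice the features would come from an intermediate discriminator layer.

```python
def mean_features(batch):
    """Average each feature dimension over the batch."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]

def feature_matching_loss(real_feats, fake_feats):
    """Squared L2 distance between mean real and mean fake features."""
    mu_real = mean_features(real_feats)
    mu_fake = mean_features(fake_feats)
    return sum((r - f) ** 2 for r, f in zip(mu_real, mu_fake))

real_feats = [[1.0, 0.0], [0.0, 1.0]]   # real data covers two modes
collapsed  = [[1.0, 0.0], [1.0, 0.0]]   # generator stuck on one mode
diverse    = [[0.0, 1.0], [1.0, 0.0]]   # generator covers both modes

print(feature_matching_loss(real_feats, collapsed))
print(feature_matching_loss(real_feats, diverse))
```

The collapsed batch is penalized because its mean features differ from those of the real data, while the diverse batch incurs no penalty.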



## Practical — Implementations (TensorFlow / Keras)

The examples below are small, runnable in Colab, and use tiny epochs or subsets so they finish quickly for demonstration. Replace datasets with larger ones if you want fuller training.

In [None]:
# Practical 1: Simple GAN (lightweight) - generate MNIST-like digits
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST and normalize to [-1,1]
(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 127.5 - 1.0
x_train = np.expand_dims(x_train, axis=-1)

BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

# Generator model
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*128, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    model.add(layers.Reshape((7,7,128)))  # feature map: (7, 7, 128)
    # Two stride-2 transposed convolutions upsample 7x7 -> 14x14 -> 28x28
    model.add(layers.Conv2DTranspose(64, (5,5), strides=(2,2), padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    model.add(layers.Conv2DTranspose(1, (5,5), strides=(2,2), padding='same', use_bias=False, activation='tanh'))
    return model

# Discriminator model
def make_discriminator_model(dropout_rate=0.3):
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5,5), strides=(2,2), padding='same', input_shape=[28,28,1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(dropout_rate))
    model.add(layers.Conv2D(128, (5,5), strides=(2,2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(dropout_rate))
    model.add(layers.Flatten())
    model.add(layers.Dense(1))
    return model

# Loss functions and optimizers
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
generator = make_generator_model()
discriminator = make_discriminator_model()

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

# Training step (non-saturating loss)
@tf.function
def train_step(images):
    # Match the noise batch to the real batch (the last batch may be smaller than BATCH_SIZE)
    noise = tf.random.normal([tf.shape(images)[0], 100])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        disc_loss = cross_entropy(tf.ones_like(real_output), real_output) + cross_entropy(tf.zeros_like(fake_output), fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
    return gen_loss, disc_loss

# Training loop (very small epochs for demo)
def train(dataset, epochs=1):
    for epoch in range(epochs):
        for image_batch in dataset:
            gl, dl = train_step(image_batch)
        print(f'Epoch {epoch+1}, gen_loss={gl.numpy():.4f}, disc_loss={dl.numpy():.4f}')

# Run a quick demo training
train(train_dataset.take(50), epochs=1)

# Generate and show a few images
noise = tf.random.normal([16,100])
generated = generator(noise, training=False)
generated = (generated + 1.0) / 2.0  # bring to [0,1]
plt.figure(figsize=(4,4))
for i in range(16):
    plt.subplot(4,4,i+1)
    plt.imshow(generated[i,:,:,0], cmap='gray')
    plt.axis('off')
plt.suptitle('Generated samples (demo)')
plt.show()

In [None]:
# Practical 2: Standalone discriminator for 28x28 images
from tensorflow.keras import layers, Sequential
def build_discriminator(dropout_rate=0.3):
    model = Sequential([
        layers.Input(shape=(28,28,1)),
        layers.Conv2D(64, (5,5), strides=(2,2), padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(dropout_rate),
        layers.Conv2D(128, (5,5), strides=(2,2), padding='same'),
        layers.LeakyReLU(),
        layers.Flatten(),
        layers.Dense(1)
    ])
    return model

disc = build_discriminator(0.4)
disc.summary()

In [None]:
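# Practical 3: Sampling helper (use generator from Practical 1)
import math
import matplotlib.pyplot as plt
import tensorflow as tf

def sample_and_plot(generator, n=9, latent_dim=100):
    """Draw n latent vectors, generate images, and show them in a near-square grid."""
    z = tf.random.normal([n, latent_dim])
    imgs = generator(z, training=False)
    imgs = (imgs + 1.0) / 2.0  # rescale tanh output from [-1,1] to [0,1]
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)  # enough rows even when n is not a perfect square
    plt.figure(figsize=(cols, rows))
    for i in range(n):
        plt.subplot(rows, cols, i+1)
        plt.imshow(imgs[i,:,:,0], cmap='gray')
        plt.axis('off')
    plt.show()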

# Example (call sample_and_plot(generator, 9) after training)

In [None]:
# Practical 4: Simplified WGAN-style critic and loss (demo, not full WGAN-GP)
import tensorflow as tf
from tensorflow.keras import layers

# Critic (no sigmoid in final layer)
def make_critic():
    model = tf.keras.Sequential([
        layers.Input(shape=(28,28,1)),
        layers.Conv2D(64, (5,5), strides=(2,2), padding='same'),
        layers.LeakyReLU(),
        layers.Conv2D(128, (5,5), strides=(2,2), padding='same'),
        layers.LeakyReLU(),
        layers.Flatten(),
        layers.Dense(1)  # linear output as critic score
    ])
    return model

critic = make_critic()
critic.summary()

# Wasserstein losses: the critic minimizes mean(fake) - mean(real);
# the generator minimizes -mean(fake), i.e. it maximizes the critic score of fakes
def critic_loss(real_score, fake_score):
    return tf.reduce_mean(fake_score) - tf.reduce_mean(real_score)

def generator_loss(fake_score):
    return -tf.reduce_mean(fake_score)

# Note: a proper WGAN-GP training requires gradient penalty and multiple critic steps; this demonstrates difference in loss form.
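The gradient penalty mentioned in the note can be illustrated without TensorFlow. The sketch below is plain Python under strong simplifications: the critic is any Python function of a list of floats, the input gradient is estimated with central differences instead of automatic differentiation, and the names `numerical_grad_norm`, `gradient_penalty`, and `linear_critic` are hypothetical. The weight `lam=10.0` is the value commonly used for WGAN-GP. In a real training loop the gradient would be taken with `tf.GradientTape` on the interpolated batch.

```python
import math
import random

def numerical_grad_norm(f, x, h=1e-5):
    """L2 norm of df/dx at x, estimated with central differences."""
    grads = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grads.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return math.sqrt(sum(g * g for g in grads))

def gradient_penalty(f, real_batch, fake_batch, lam=10.0, rng=None):
    """Penalize deviation of the critic's gradient norm from 1 at random interpolates."""
    rng = rng or random.Random(0)
    total = 0.0
    for x_real, x_fake in zip(real_batch, fake_batch):
        eps = rng.random()  # random mixing coefficient in [0, 1)
        x_hat = [eps * r + (1.0 - eps) * g for r, g in zip(x_real, x_fake)]
        total += (numerical_grad_norm(f, x_hat) - 1.0) ** 2
    return lam * total / len(real_batch)

def linear_critic(w):
    """Toy critic f(x) = w . x, whose input gradient is w everywhere."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))

real = [[1.0, 2.0], [0.5, -1.0]]
fake = [[0.0, 0.0], [2.0, 2.0]]
print(gradient_penalty(linear_critic([0.6, 0.8]), real, fake))  # near 0: gradient norm is 1
print(gradient_penalty(linear_critic([3.0, 4.0]), real, fake))  # near 160: 10 * (5 - 1)^2
```

The penalty term is added to the critic loss, so a critic whose gradients grow far beyond norm 1 is pushed back toward the 1-Lipschitz constraint without clipping its weights.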

In [None]:
# Practical 5: Generate a batch of fake images (using generator from Practical 1) and display
import matplotlib.pyplot as plt
import tensorflow as tf

z = tf.random.normal([16,100])
imgs = generator(z, training=False)
imgs = (imgs + 1.0) / 2.0  # rescale tanh output from [-1,1] to [0,1]
plt.figure(figsize=(4,4))
for i in range(16):
    plt.subplot(4,4,i+1)
    plt.imshow(imgs[i,:,:,0], cmap='gray')
    plt.axis('off')
plt.show()

### Practical 6: StyleGAN-inspired architecture

StyleGAN is large and complex; a full implementation is beyond the scope of a small demo notebook. Below is a small sketch of style-based generator building blocks (for conceptual learning).

In [None]:
# Sketch: tiny style-based block (conceptual, not full StyleGAN)
import tensorflow as tf
from tensorflow.keras import layers

def style_block(x, style_vector):
    # Conceptual placeholder showing AdaIN-like modulation on eager tensors.
    # Real StyleGAN has a mapping network and learned noise injection per layer,
    # and AdaIN normalizes x per channel before modulating it.
    # In a real model the Dense layers would be created once and reused.
    channels = x.shape[-1]
    gamma = layers.Dense(channels)(style_vector)  # per-channel scale, shape (batch, C)
    beta = layers.Dense(channels)(style_vector)   # per-channel shift, shape (batch, C)
    # Broadcast (batch, C) -> (batch, 1, 1, C) over the spatial dimensions
    return x * (1.0 + gamma[:, None, None, :]) + beta[:, None, None, :]

In [None]:
# Practical 7 & 8: Wasserstein loss (see critic_loss/generator_loss in Practical 4) & discriminator with configurable dropout
# Reuses build_discriminator from Practical 2
def add_dropout_and_build(dropout_rate=0.4):
    return build_discriminator(dropout_rate=dropout_rate)

disc_with_dropout = add_dropout_and_build(0.4)
disc_with_dropout.summary()

**Note:** Practical questions 8 and 9 in the prompt were identical (add dropout). Both are covered above by `add_dropout_and_build`.

---

*Notebook prepared in a student style: concise theory answers and runnable small-scale practical demos. For full GAN training on large datasets use larger compute, more epochs, and advanced techniques (WGAN-GP, spectral norm, progressive growing, etc.).*