
# 🎭 Face Generation with DCGANs (Clean, Reproducible, GitHub-Ready)

This notebook implements a **Deep Convolutional Generative Adversarial Network (DCGAN)** in TensorFlow/Keras to generate human face images using the **CelebA** dataset.  
It includes:
- Reproducible seeding
- Clean `tf.data` input pipeline (finite cardinality; no re-batching warnings)
- DCGAN architectures with recommended initializers
- Custom training loop (`tf.keras.Model`)
- Image grid callback per epoch
- Model checkpointing
- Loss curves
- Optional mixed precision
- Fully self-contained **Kaggle download** steps

> Tip: Clear all outputs before committing to GitHub to keep the notebook size < 1 MB.



## 1) Environment & Kaggle Setup

Run this section once per new environment (e.g., Colab).  
You will need your **`kaggle.json`** API key (from https://www.kaggle.com/settings/account).
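
Before calling the Kaggle CLI, it can help to confirm that the key file is well-formed. The following is a minimal sketch, assuming the usual `kaggle.json` layout (a JSON object with `username` and `key` fields). `check_kaggle_credentials` is a hypothetical helper written for this notebook, not part of the `kaggle` package.

```python
import json
import os
import stat

def check_kaggle_credentials(path):
    """Return a list of problems found with a kaggle.json file (empty if none)."""
    if not os.path.isfile(path):
        return [f"missing file: {path}"]
    try:
        with open(path, "r", encoding="utf-8") as fh:
            creds = json.load(fh)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return ["file is not valid JSON"]
    if not isinstance(creds, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"missing or empty field: {field}")
    # On POSIX systems the Kaggle CLI warns when the key is readable by other users.
    if os.name == "posix":
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & 0o077:
            problems.append(f"permissions too open: {oct(mode)} (expected 0o600)")
    return problems

print(check_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json")))
```

An empty list means the file looks usable; anything else names the issue to fix before attempting the download.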


In [None]:

# If running on Colab, you can quickly check for GPU:
# !nvidia-smi || echo "No GPU found."
# Install required packages (lightweight)
!pip -q install kaggle tensorflow numpy matplotlib


In [None]:

import os

# Create Kaggle directory and place kaggle.json if not present
os.makedirs(os.path.expanduser("~/.kaggle"), exist_ok=True)

kaggle_path = os.path.expanduser("~/.kaggle/kaggle.json")
if not os.path.exists(kaggle_path):
    print("➡️ Please upload your kaggle.json in the next cell if not already present at ~/.kaggle/kaggle.json")
else:
    print("✅ Found kaggle.json at ~/.kaggle/kaggle.json")
    os.chmod(kaggle_path, 0o600)



If you **don't** have `~/.kaggle/kaggle.json` already, run the next cell to upload it now (only needed on Colab).


In [None]:

# Colab-only: uncomment to use the file picker to upload kaggle.json
# from google.colab import files
# uploaded = files.upload()  # select kaggle.json
# import shutil, os
# if 'kaggle.json' in uploaded:
#     dest = os.path.expanduser('~/.kaggle/kaggle.json')
#     shutil.move('kaggle.json', dest)
#     os.chmod(dest, 0o600)
#     print('✅ kaggle.json installed to', dest)
# else:
#     print('If running locally, place kaggle.json at ~/.kaggle/kaggle.json')



## 2) Download CelebA Dataset (Kaggle)

We'll download **CelebA** via Kaggle. The aligned set contains 202,599 face images, so the archive is larger than 1 GB and you need room for both the zip file and the extracted images.
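
A quick free-space check before downloading can save a failed extraction. This is a small sketch using the standard library; the 3 GiB threshold is an assumption chosen to cover the archive plus the extracted files, not a figure from Kaggle, and `has_free_space` is a hypothetical helper.

```python
import shutil

def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes

# Assumption: roughly 3 GiB covers the zip archive plus the extracted images.
REQUIRED_BYTES = 3 * 1024**3

print("Enough space:", has_free_space(".", REQUIRED_BYTES))
```

Point `path` at the directory you plan to use as `DATA_ROOT` so the check runs against the right filesystem.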


In [None]:

import os, subprocess, shlex, zipfile

DATA_ROOT = "/content/dataset"  # change to a local path if not on Colab
ZIP_PATH = os.path.join(DATA_ROOT, "celeba-dataset.zip")
EXTRACT_DIR = os.path.join(DATA_ROOT, "img_align_celeba")

os.makedirs(DATA_ROOT, exist_ok=True)

def run_cmd(cmd):
    print(">", cmd)
    p = subprocess.run(shlex.split(cmd), check=False, capture_output=True, text=True)
    print(p.stdout or "")
    if p.returncode != 0:
        print(p.stderr or "")
    return p.returncode == 0

# Download only if not present
if not os.path.exists(ZIP_PATH) and not os.path.isdir(EXTRACT_DIR):
    print("⬇️ Downloading CelebA via Kaggle (jessicali9530/celeba-dataset)...")
    ok = run_cmd(f'kaggle datasets download -d jessicali9530/celeba-dataset -p {DATA_ROOT}')
    if not ok:
        print("❌ Kaggle download failed. Ensure kaggle.json is configured.")

# Unzip if not already extracted
if os.path.exists(ZIP_PATH) and not os.path.isdir(EXTRACT_DIR):
    print("📦 Extracting dataset... (this may take a few minutes)")
    with zipfile.ZipFile(ZIP_PATH, 'r') as zf:
        zf.extractall(DATA_ROOT)
    print("✅ Extraction complete.")

# Derive final images directory (inside the unzipped folder structure)
IMG_DIR = None
if os.path.isdir(EXTRACT_DIR):
    candidate_dirs = [
        EXTRACT_DIR,
        os.path.join(EXTRACT_DIR, "img_align_celeba"),
        os.path.join(DATA_ROOT, "celeba-dataset", "img_align_celeba"),
    ]
    for c in candidate_dirs:
        if os.path.isdir(c) and len(os.listdir(c)) > 10000:  # celebA has many files
            IMG_DIR = c
            break

if IMG_DIR is None:
    # fall back to a reasonable default
    fallback = os.path.join(DATA_ROOT, "img_align_celeba", "img_align_celeba")
    if os.path.isdir(fallback):
        IMG_DIR = fallback

print("🗂 IMG_DIR:", IMG_DIR)
assert IMG_DIR is not None, "Could not locate the CelebA images directory after extraction."



## 3) Reproducibility & Config
Set a global seed and define training configuration.
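
The idea behind seeding can be shown with Python's own generator: the same seed always yields the same sequence, which is what makes runs repeatable. This is an illustrative sketch only; `draw` is a hypothetical helper, and GPU kernels in TensorFlow can still introduce nondeterminism that seeding alone does not remove.

```python
import random

def draw(seed, n=3):
    """Draw n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

print(draw(42) == draw(42))  # same seed, same sequence
print(draw(42) == draw(43))  # different seed, different sequence
```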


In [None]:

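import os, random, numpy as np, tensorflow as tf

SEED = 42
tf.random.set_seed(SEED)  # TensorFlow ops and weight initializers
np.random.seed(SEED)      # NumPy
random.seed(SEED)         # Python's random module
# Note: PYTHONHASHSEED only affects hash randomization when it is set before the
# interpreter starts, so setting it here only reaches subprocesses.
os.environ["PYTHONHASHSEED"] = str(SEED)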

IM_SHAPE = (64, 64, 3)
BATCH_SIZE = 128
LATENT_DIM = 100
EPOCHS = 50

print("✅ Seeds set. Config →", {"IM_SHAPE": IM_SHAPE, "BATCH_SIZE": BATCH_SIZE, "LATENT_DIM": LATENT_DIM, "EPOCHS": EPOCHS})



### (Optional) Mixed Precision
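Enable this only on a GPU with fast float16 support (for example, NVIDIA GPUs with Tensor Cores); it reduces memory use and can speed up training. The custom `train_step` in section 6 applies no loss scaling, so float16 gradients may underflow unless you add it (see `tf.keras.mixed_precision.LossScaleOptimizer`). If unsure, skip.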
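
One reason mixed precision needs care is that float16 cannot represent very small values. The sketch below uses the standard library's half-precision packing to show a small value underflowing to zero, and how multiplying by a scale factor first (the idea behind loss scaling) preserves it. The gradient value and the scale of 1024 are hypothetical numbers picked for illustration.

```python
import struct

def to_float16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

grad = 1e-8          # hypothetical small gradient value
loss_scale = 1024.0  # hypothetical fixed loss scale

# Stored directly in float16, the value is too small and becomes zero.
print(to_float16(grad))

# Scaled before the float16 round trip and unscaled afterwards in full
# precision, the value survives with a small rounding error.
print(to_float16(grad * loss_scale) / loss_scale)
```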


In [None]:

USE_MIXED_PRECISION = False  # set True if you know your GPU supports it well
if USE_MIXED_PRECISION:
    from tensorflow.keras import mixed_precision
    mixed_precision.set_global_policy("mixed_float16")
    print("✅ Mixed precision enabled.")
else:
    print("ℹ️ Mixed precision disabled.")



## 4) `tf.data` Pipeline (Finite, Shuffled, Prefetched)

We create a finite dataset with shuffling and prefetching. Images are resized to 64×64 and normalized to **[-1, 1]** to match the generator's `tanh` output.
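
The scaling arithmetic and the size of one epoch can be checked by hand. In this sketch the helper names are hypothetical, and the image count of 202,599 assumes the full aligned CelebA set was extracted.

```python
import math

def to_model_range(pixel):
    """Map a pixel value in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0

def to_display_range(value):
    """Map a model-range value in [-1, 1] back to [0, 1] for plotting."""
    return (value + 1.0) / 2.0

def batches_per_epoch(num_images, batch_size):
    """Number of batches when the final partial batch is kept."""
    return math.ceil(num_images / batch_size)

print(to_model_range(0), to_model_range(255))  # -1.0 1.0
print(to_display_range(to_model_range(255)))   # 1.0
print(batches_per_epoch(202_599, 128))         # 1583
```

Because the dataset is finite, one epoch ends after that many batches, with the last batch smaller than the rest.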


In [None]:

import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory

def preprocess(image):
    # image_dataset_from_directory already resizes to image_size, so only
    # rescale pixel values from [0, 255] to [-1, 1] here.
    return tf.cast(image, tf.float32) / 127.5 - 1.0

# CelebA sits in one flat directory with no class subfolders, so label_mode=None.
train_dataset = image_dataset_from_directory(
    IMG_DIR,
    label_mode=None,
    image_size=(IM_SHAPE[0], IM_SHAPE[1]),
    batch_size=BATCH_SIZE,
    shuffle=True,
    seed=SEED
).map(preprocess, num_parallel_calls=tf.data.AUTOTUNE
).prefetch(tf.data.AUTOTUNE)

train_dataset



### (Optional) Preview a Few Real Images


In [None]:

import matplotlib.pyplot as plt
sample_batch = next(iter(train_dataset.take(1)))
sample_imgs = (sample_batch + 1.0) / 2.0  # back to [0,1]

plt.figure(figsize=(6,6))
n = 16
for i in range(n):
    plt.subplot(4,4,i+1)
    plt.imshow(sample_imgs[i])
    plt.axis("off")
plt.show()



## 5) Models: Generator & Discriminator (DCGAN)

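We follow the standard DCGAN design: transposed convolutions upsample in the generator (8×8 → 64×64) and strided convolutions downsample in the discriminator. All weights use a **RandomNormal(0, 0.02)** initializer, as the DCGAN paper recommends. The discriminator here regularizes with dropout instead of batch normalization.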
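
The spatial sizes in both networks follow from simple arithmetic on stride-2 layers with `padding='same'`. This sketch traces them, assuming the layer stack used in this notebook (three transposed convolutions starting from 8×8, and two strided convolutions ending in 128 channels); the helper names are hypothetical.

```python
import math

def conv_transpose_same(size, stride):
    """Spatial size after Conv2DTranspose with padding='same'."""
    return size * stride

def conv_same(size, stride):
    """Spatial size after Conv2D with padding='same'."""
    return math.ceil(size / stride)

# Generator: 8x8 feature map upsampled by three stride-2 transposed convolutions.
gen_sizes = [8]
for _ in range(3):
    gen_sizes.append(conv_transpose_same(gen_sizes[-1], 2))

# Discriminator: 64x64 image downsampled by two stride-2 convolutions.
disc_sizes = [64]
for _ in range(2):
    disc_sizes.append(conv_same(disc_sizes[-1], 2))

# Features entering the final Dense layer: height * width * channels.
flat_features = disc_sizes[-1] * disc_sizes[-1] * 128

print(gen_sizes)      # [8, 16, 32, 64]
print(disc_sizes)     # [64, 32, 16]
print(flat_features)  # 32768
```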


In [None]:

init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.02)

def build_generator(latent_dim=LATENT_DIM):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(8*8*256, use_bias=False, input_shape=(latent_dim,), kernel_initializer=init),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Reshape((8, 8, 256)),

        tf.keras.layers.Conv2DTranspose(128, 5, strides=2, padding='same', use_bias=False, kernel_initializer=init),
        tf.keras.layers.BatchNormalization(), tf.keras.layers.ReLU(),

        tf.keras.layers.Conv2DTranspose(64, 5, strides=2, padding='same', use_bias=False, kernel_initializer=init),
        tf.keras.layers.BatchNormalization(), tf.keras.layers.ReLU(),

        tf.keras.layers.Conv2DTranspose(3, 5, strides=2, padding='same', use_bias=False, activation='tanh', kernel_initializer=init)
    ], name="generator")
    return model

def build_discriminator(input_shape=IM_SHAPE):
    layers = [
        tf.keras.layers.Conv2D(64, 5, strides=2, padding='same', kernel_initializer=init, input_shape=input_shape),
        tf.keras.layers.LeakyReLU(0.2),
        tf.keras.layers.Dropout(0.3),

        tf.keras.layers.Conv2D(128, 5, strides=2, padding='same', kernel_initializer=init),
        tf.keras.layers.LeakyReLU(0.2),
        tf.keras.layers.Dropout(0.3),

        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, kernel_initializer=init)
    ]
    model = tf.keras.Sequential(layers, name="discriminator_logits")
    # Wrap logits with a Sigmoid activation in float32 to avoid potential fp16 issues
    inp = tf.keras.Input(shape=input_shape)
    x = model(inp)
    out = tf.keras.layers.Activation('sigmoid', dtype='float32', name="discriminator_output")(x)
    return tf.keras.Model(inp, out, name="discriminator")

generator = build_generator(LATENT_DIM)
discriminator = build_discriminator(IM_SHAPE)
generator.summary()
discriminator.summary()



## 6) Custom GAN (`tf.keras.Model`) with Training Step
We implement the standard two-step GAN update: first train **D** on a batch of real and generated images, then train **G** to fool **D**.  
We use one-sided label smoothing (real labels set to 0.9, fake labels left at 0) to stabilize training.
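
To make the two losses concrete, here is a scalar sketch of binary cross-entropy with a smoothed real label. It works on single probabilities rather than tensors, the function names are hypothetical, and `eps` is an illustrative clipping constant.

```python
import math

def bce(label, prob, eps=1e-7):
    """Binary cross-entropy for one prediction, clipped to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))

def discriminator_loss(p_real, p_fake, real_label=0.9):
    """Average of the real-image loss (smoothed label) and the fake-image loss."""
    return (bce(real_label, p_real) + bce(0.0, p_fake)) / 2.0

def generator_loss(p_fake):
    """The generator is rewarded when D scores its images as real (label 1)."""
    return bce(1.0, p_fake)

# With a 0.9 target, an over-confident 0.999 costs more than predicting 0.9.
print(bce(0.9, 0.9) < bce(0.9, 0.999))
print(discriminator_loss(p_real=0.9, p_fake=0.1))
print(generator_loss(p_fake=0.5))
```

The first comparison shows why smoothing helps: the discriminator is no longer pushed toward saturated outputs on real images.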


In [None]:

class GAN(tf.keras.Model):
    def __init__(self, generator, discriminator, latent_dim):
        super().__init__()
        self.generator = generator
        self.discriminator = discriminator
        self.latent_dim = latent_dim
        self.g_loss_tracker = tf.keras.metrics.Mean(name="g_loss")
        self.d_loss_tracker = tf.keras.metrics.Mean(name="d_loss")

    @property
    def metrics(self):
        return [self.g_loss_tracker, self.d_loss_tracker]

    def compile(self, g_optimizer, d_optimizer, d_loss_fn, g_loss_fn=None):
        super().compile()
        self.g_optimizer = g_optimizer
        self.d_optimizer = d_optimizer
        # same BCE for both by default
        self.d_loss_fn = d_loss_fn
        self.g_loss_fn = g_loss_fn if g_loss_fn is not None else d_loss_fn

    def train_step(self, real_images):
        batch_size = tf.shape(real_images)[0]

        # ---------------------
        # Train Discriminator
        # ---------------------
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        generated_images = self.generator(random_latent_vectors, training=True)

        # Label smoothing for real; no noise for simplicity
        real_labels = tf.ones((batch_size, 1)) * 0.9
        fake_labels = tf.zeros((batch_size, 1))

        with tf.GradientTape() as tape:
            pred_real = self.discriminator(real_images, training=True)
            pred_fake = self.discriminator(generated_images, training=True)
            d_loss_real = self.d_loss_fn(real_labels, pred_real)
            d_loss_fake = self.d_loss_fn(fake_labels, pred_fake)
            d_loss = (d_loss_real + d_loss_fake) / 2.0
        grads = tape.gradient(d_loss, self.discriminator.trainable_variables)
        self.d_optimizer.apply_gradients(zip(grads, self.discriminator.trainable_variables))

        # ---------------------
        # Train Generator
        # ---------------------
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))
        misleading_labels = tf.ones((batch_size, 1))

        with tf.GradientTape() as tape:
            fake_images = self.generator(random_latent_vectors, training=True)
            pred = self.discriminator(fake_images, training=True)
            g_loss = self.g_loss_fn(misleading_labels, pred)
        grads = tape.gradient(g_loss, self.generator.trainable_variables)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_variables))

        self.g_loss_tracker.update_state(g_loss)
        self.d_loss_tracker.update_state(d_loss)
        return {"g_loss": self.g_loss_tracker.result(), "d_loss": self.d_loss_tracker.result()}



## 7) Callbacks: Image Grid per Epoch & Model Checkpoints
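After every epoch we render a grid of faces from a fixed latent batch, so the images are comparable across epochs, and save a weights checkpoint. Because `ModelCheckpoint` saves the model passed to `fit`, each checkpoint holds the whole GAN (generator and discriminator), not the generator alone.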
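
If you change the number of preview images, the grid layout has to change with it. The sketch below shows one way to derive a near-square grid and the zero-padded file name for each epoch; both helpers are hypothetical and independent of Keras.

```python
import math

def grid_shape(num_images):
    """Smallest near-square (rows, cols) grid that fits num_images."""
    if num_images < 1:
        raise ValueError("num_images must be positive")
    cols = math.isqrt(num_images - 1) + 1  # equals ceil(sqrt(num_images))
    rows = -(-num_images // cols)          # equals ceil(num_images / cols)
    return rows, cols

def grid_filename(epoch_index, outdir="generated"):
    """Zero-padded file name for a 0-based epoch index."""
    return f"{outdir}/epoch_{epoch_index + 1:03d}.png"

print(grid_shape(16))    # (4, 4)
print(grid_shape(10))    # (3, 4)
print(grid_filename(0))  # generated/epoch_001.png
```

Zero-padding keeps the saved images in epoch order when the directory is sorted by name.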


In [None]:

import matplotlib.pyplot as plt
import numpy as np

class ShowImage(tf.keras.callbacks.Callback):
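    def __init__(self, latent_dim, num_images=16, outdir="generated"):
        super().__init__()  # initialize the base Callback state
        self.latent_dim = latent_dim
        self.num_images = num_images
        # Fixed latent batch, reused every epoch so the grids are comparable
        self.seed = tf.random.normal([num_images, latent_dim])
        self.outdir = outdir
        os.makedirs(outdir, exist_ok=True)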

    def on_epoch_end(self, epoch, logs=None):
        generated = self.model.generator(self.seed, training=False)
        generated = (generated + 1.0) / 2.0  # [-1,1] -> [0,1]

        plt.figure(figsize=(6,6))
        for i in range(self.num_images):
            plt.subplot(4,4,i+1)
            plt.imshow(generated[i].numpy())
            plt.axis("off")
        fname = os.path.join(self.outdir, f"epoch_{epoch+1:03d}.png")
        plt.savefig(fname, bbox_inches="tight")
        plt.close()

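os.makedirs("checkpoints", exist_ok=True)
# ModelCheckpoint saves the model passed to fit(), i.e. the full GAN
# (generator + discriminator), so the file name reflects that.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/gan_epoch{epoch:03d}.weights.h5",
    save_weights_only=True,
    save_freq="epoch"
)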



## 8) Compile & Train
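Both networks train with Adam (learning rate 2e-4, `beta_1=0.5`, the settings from the DCGAN paper) and binary cross-entropy on the discriminator's sigmoid output (`from_logits=False`).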
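
For intuition about what the optimizer does with these settings, here is a textbook Adam update for a single scalar parameter. It is a sketch of the published update rule, not Keras's exact implementation; `beta_2` and `eps` are assumed defaults and the gradient sequence is made up.

```python
import math

def adam_step(param, grad, m, v, t, lr=2e-4, beta_1=0.5, beta_2=0.999, eps=1e-7):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta_1 * m + (1.0 - beta_1) * grad          # running mean of gradients
    v = beta_2 * v + (1.0 - beta_2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta_1 ** t)                 # bias correction for step t
    v_hat = v / (1.0 - beta_2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

param, m, v = 1.0, 0.0, 0.0
for t, grad in enumerate([0.5, 0.5, -0.5], start=1):  # hypothetical gradients
    param, m, v = adam_step(param, grad, m, v, t)
    print(t, param, m)
```

With `beta_1=0.5` the running mean forgets old gradients quickly, so the sign flip on the third step is reflected in `m` almost immediately; a value of 0.9 would react more slowly.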


In [None]:

gan = GAN(generator, discriminator, LATENT_DIM)
gan.compile(
    g_optimizer=tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5),
    d_optimizer=tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5),
    d_loss_fn=tf.keras.losses.BinaryCrossentropy(from_logits=False)
)

history = gan.fit(
    train_dataset,
    epochs=EPOCHS,
    callbacks=[ShowImage(LATENT_DIM), checkpoint_cb],
    verbose=1
)



## 9) Loss Curves


In [None]:

import matplotlib.pyplot as plt

plt.figure()
plt.plot(history.history['g_loss'], label="Generator Loss")
plt.plot(history.history['d_loss'], label="Discriminator Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()



## 10) Inference: Generate Fresh Samples
Use the trained generator to sample new faces.


In [None]:

num_samples = 25
z = tf.random.normal([num_samples, LATENT_DIM])
imgs = gan.generator(z, training=False)
imgs = (imgs + 1.0) / 2.0

import matplotlib.pyplot as plt
plt.figure(figsize=(6,6))
for i in range(num_samples):
    plt.subplot(5,5,i+1)
    plt.imshow(imgs[i].numpy())
    plt.axis("off")
plt.tight_layout()
plt.show()



## 11) (Optional) Save Final Generator
Save the generator weights for later inference, then reload them into a fresh generator and sample from it to confirm the round trip.


In [None]:

# Save final generator weights
os.makedirs("artifacts", exist_ok=True)
gen_w_path = "artifacts/generator_final.weights.h5"
gan.generator.save_weights(gen_w_path)
print("✅ Saved:", gen_w_path)

# Example: reload and generate
gen2 = build_generator(LATENT_DIM)
gen2.load_weights(gen_w_path)
z = tf.random.normal([4, LATENT_DIM])
out = gen2(z, training=False)
print("Reloaded generator output shape:", out.shape)



## 12) Next Steps
- Add **FID** evaluation (even on a subset) for quantitative tracking.
- Try **WGAN-GP** for improved stability.
- Try **progressive growing** (32×32 → 64×64 → …).
- Add light data augmentations (e.g., flips, jitter).
- Run for more epochs with mixed precision if supported.
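
As a taste of the FID idea from the first bullet, the Fréchet distance between two Gaussians has a simple closed form in one dimension: the squared difference of means plus the squared difference of standard deviations. The sketch below is only that scalar toy case on made-up numbers. Real FID fits multivariate Gaussians to Inception-v3 features of real and generated images, which this does not attempt.

```python
import statistics

def frechet_distance_1d(xs, ys):
    """Squared Fréchet distance between 1-D Gaussians fitted to two samples."""
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2

real_feature = [0.0, 1.0, 2.0, 3.0]  # hypothetical feature values
fake_feature = [2.0, 3.0, 4.0, 5.0]  # same spread, shifted by 2

print(frechet_distance_1d(real_feature, real_feature))
print(frechet_distance_1d(real_feature, fake_feature))
```

Lower is better: identical distributions score zero, and the score grows as the generated statistics drift from the real ones.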
