![Banner](https://i.imgur.com/a3uAqnb.png)

# Image Translation using Conditional GAN (Pix2Pix) - Homework Assignment

In this homework, you will implement a **conditional image-to-image GAN** that translates edge images into shoe images. It is based on the **Pix2Pix** architecture, which learns a mapping from one image domain to another from paired examples.

## 📌 Project Overview
- **Task**: Edge-to-Shoe image translation
- **Architecture**: Conditional GAN with U-Net Generator and PatchGAN Discriminator
- **Dataset**: Edge2Shoes dataset (provided)
- **Goal**: Generate realistic shoe images from edge sketches

## 📚 Learning Objectives
By completing this assignment, you will:
- Understand conditional GANs and image-to-image translation
- Implement U-Net architecture with skip connections
- Build a PatchGAN discriminator
- Learn about combined loss functions (adversarial + L1)
- Practice training GANs with proper loss balancing

## 1️⃣ Dataset Setup (PROVIDED)

The Edge2Shoes dataset has been downloaded and prepared for you. The dataset structure is as follows:
- `train/` folder contains training images
- `val/` folder contains validation images
- Each image contains an edge sketch (left half) and the corresponding shoe photo (right half)


In [None]:
import kagglehub
import os
from dotenv import load_dotenv

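# Load environment variables from a local .env file, if one exists
load_dotenv()
# Download the dataset (or reuse the locally cached copy) and get its local path
path = kagglehub.dataset_download("balraj98/edges2shoes-dataset")
print("Path to dataset files:", path)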

## 2️⃣ Import Libraries and Configuration

**Task**: Import all necessary libraries and set up configuration parameters.

**Requirements**:
- Import PyTorch, torchvision, and related libraries
- Import matplotlib, PIL, numpy, and other utilities
- Set random seeds for reproducibility
- Configure hyperparameters with reasonable values
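
As a rough illustration of the seeding and device requirements, the sketch below seeds the Python, NumPy, and PyTorch generators and picks a device. The helper name `set_seed` is hypothetical (it does not come from the assignment), and seeding alone does not guarantee bit-identical results on every GPU.

```python
import random

import numpy as np
import torch

SEED = 42  # seed value requested by the assignment


def set_seed(seed: int) -> None:
    """Seed the Python, NumPy, and PyTorch random number generators.

    `set_seed` is a hypothetical helper name used only for this sketch.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Seeds all GPUs; silently ignored when CUDA is not available.
    torch.cuda.manual_seed_all(seed)


set_seed(SEED)

# Prefer the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)
```

Calling `set_seed` again with the same value before an experiment should reproduce the same sequence of random draws on the same hardware.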

In [None]:
# TODO: Import all necessary libraries

# TODO: Set random seeds for reproducibility (use seed=42)

# TODO: Check device availability and print

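# TODO: Define configuration parameters (starting values provided; tune as needed):
IMG_SIZE = 128  # Height and width of the resized images, in pixels
BATCH_SIZE = 16  # Number of image pairs per batch
LEARNING_RATE = 0.0002  # Adam learning rate
BETA1 = 0.5  # Adam beta1 (first-moment decay rate)
BETA2 = 0.999  # Adam beta2 (second-moment decay rate)
LAMBDA_L1 = 100  # Weight of the L1 term in the generator loss
NUM_EPOCHS = 5  # Number of training epochs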

## 3️⃣ Custom Dataset Class

**Task**: Create a custom dataset class that handles the Edge2Shoes data format.

**Requirements**:
- Split each image into left half (edge) and right half (shoe)
- Apply transformations to both images
- Return edge image as input and shoe image as target
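
One possible shape for such a dataset class is sketched below. It assumes each file stores the edge map and the shoe photo side by side at equal width, as described in the dataset section. Sorting the file list and matching `.jpg` case-insensitively are choices made for this sketch, not requirements of the assignment.

```python
import os

from PIL import Image
from torch.utils.data import Dataset


class EdgeShoeDataset(Dataset):
    """Paired dataset: left half of each file is the edge map, right half the shoe."""

    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        # Sorted so that the sample order is deterministic (a choice of this sketch).
        self.files = sorted(
            f for f in os.listdir(root_dir) if f.lower().endswith(".jpg")
        )

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        path = os.path.join(self.root_dir, self.files[idx])
        image = Image.open(path).convert("RGB")
        width, height = image.size
        half = width // 2
        # PIL crop boxes are (left, upper, right, lower).
        edge_img = image.crop((0, 0, half, height))
        shoe_img = image.crop((half, 0, width, height))
        if self.transform is not None:
            edge_img = self.transform(edge_img)
            shoe_img = self.transform(shoe_img)
        return edge_img, shoe_img
```

Without a transform the class returns PIL images. With the `ToTensor`-based transform from the next section it returns a pair of tensors.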


In [None]:
# TODO: Create EdgeShoeDataset class inheriting from torch.utils.data.Dataset
# TODO: In __init__:
#       - Store root_dir and transform
#       - Get list of all .jpg files
# TODO: Implement __len__ to return number of images
# TODO: Implement __getitem__ to:
#       - Load image and convert to RGB
#       - Split into left half (edge) and right half (shoe)
#       - Apply transforms if provided
#       - Return (edge_img, shoe_img) tuple

## 4️⃣ Data Preprocessing and Loading

**Task**: Set up data transformations and create data loaders.

**Requirements**:
- Resize images to target size (128x128)
- Convert to tensors and normalize to [-1, 1] range
- Create train and validation datasets and loaders

In [None]:
# TODO: Define transforms using transforms.Compose:
#       - Resize to (IMG_SIZE, IMG_SIZE)
#       - ToTensor() 
#       - Normalize with mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]

# TODO: Create train_dataset using EdgeShoeDataset with train folder
# TODO: Create val_dataset using EdgeShoeDataset with val folder

# TODO: Create train_loader and val_loader with DataLoader
#       - Use appropriate batch_size and shuffle settings

# TODO: Print dataset sizes
# TODO: Test by loading one sample and printing shapes


## 5️⃣ Generator Network (U-Net Architecture)

**Task**: Implement a U-Net generator with encoder-decoder structure and skip connections.

**Requirements**:
- Encoder: Progressive downsampling using Conv2d layers
- Decoder: Progressive upsampling using ConvTranspose2d layers  
- Skip connections between corresponding encoder-decoder layers
- The final layer uses a Tanh activation so that outputs lie in [-1, 1], matching the normalized target images
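
The encoder-decoder structure with skip connections can be sketched as follows, using the channel sizes from the layer list in the cell below. Several details are assumptions of this sketch rather than part of the assignment: the helper names `down` and `up`, the `base` width parameter, the LeakyReLU slope of 0.2, the dropout rate of 0.5, and dropping the convolution bias before BatchNorm.

```python
import torch
import torch.nn as nn


def down(in_c, out_c, norm=True):
    """Encoder block: strided conv halves H and W. (Hypothetical helper.)"""
    layers = [nn.Conv2d(in_c, out_c, 4, 2, 1, bias=not norm)]
    if norm:
        layers.append(nn.BatchNorm2d(out_c))
    layers.append(nn.LeakyReLU(0.2))  # slope 0.2 is an assumption
    return nn.Sequential(*layers)


def up(in_c, out_c, dropout=False):
    """Decoder block: transposed conv doubles H and W. (Hypothetical helper.)"""
    layers = [nn.ConvTranspose2d(in_c, out_c, 4, 2, 1, bias=False),
              nn.BatchNorm2d(out_c)]
    if dropout:
        layers.append(nn.Dropout(0.5))  # rate 0.5 is an assumption
    layers.append(nn.ReLU())
    return nn.Sequential(*layers)


class Generator(nn.Module):
    def __init__(self, in_channels=3, out_channels=3, base=64):
        super().__init__()
        b = base  # `base` is a sketch-only knob; base=64 matches the layer list
        self.down1 = down(in_channels, b, norm=False)
        self.down2 = down(b, 2 * b)
        self.down3 = down(2 * b, 4 * b)
        self.down4 = down(4 * b, 8 * b)
        self.down5 = down(8 * b, 8 * b)
        self.down6 = down(8 * b, 8 * b)  # bottleneck
        # Decoder inputs double in width wherever a skip tensor is concatenated.
        self.up1 = up(8 * b, 8 * b, dropout=True)
        self.up2 = up(16 * b, 8 * b, dropout=True)
        self.up3 = up(16 * b, 4 * b)
        self.up4 = up(8 * b, 2 * b)
        self.up5 = up(4 * b, b)
        self.final = nn.Sequential(
            nn.ConvTranspose2d(2 * b, out_channels, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(d1)
        d3 = self.down3(d2)
        d4 = self.down4(d3)
        d5 = self.down5(d4)
        d6 = self.down6(d5)
        u1 = self.up1(d6)
        # Each skip connection concatenates along the channel dimension.
        u2 = self.up2(torch.cat([u1, d5], dim=1))
        u3 = self.up3(torch.cat([u2, d4], dim=1))
        u4 = self.up4(torch.cat([u3, d3], dim=1))
        u5 = self.up5(torch.cat([u4, d2], dim=1))
        return self.final(torch.cat([u5, d1], dim=1))


generator = Generator()
print("Generator parameters:", sum(p.numel() for p in generator.parameters()))
with torch.no_grad():
    print(generator(torch.randn(2, 3, 128, 128)).shape)
```

For a 128x128 input the six encoder blocks reduce the spatial size to 2x2 at the bottleneck, and the six decoder steps restore it to 128x128.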

In [None]:
# TODO: Create Generator class inheriting from nn.Module
# TODO: In __init__(self, in_channels=3, out_channels=3):
#       
#       Build Encoder (downsampling):
#       - Layer 1: Conv2d(3, 64, 4, 2, 1) + LeakyReLU (no BatchNorm)
#       - Layer 2: Conv2d(64, 128, 4, 2, 1) + BatchNorm2d + LeakyReLU  
#       - Layer 3: Conv2d(128, 256, 4, 2, 1) + BatchNorm2d + LeakyReLU
#       - Layer 4: Conv2d(256, 512, 4, 2, 1) + BatchNorm2d + LeakyReLU
#       - Layer 5: Conv2d(512, 512, 4, 2, 1) + BatchNorm2d + LeakyReLU
#       - Layer 6: Conv2d(512, 512, 4, 2, 1) + BatchNorm2d + LeakyReLU (bottleneck)
#
#       Build Decoder (upsampling with skip connections):
#       - Layer 1: ConvTranspose2d(512, 512, 4, 2, 1) + BatchNorm2d + Dropout + ReLU
#       - Layer 2: ConvTranspose2d(1024, 512, 4, 2, 1) + BatchNorm2d + Dropout + ReLU
#       - Layer 3: ConvTranspose2d(1024, 256, 4, 2, 1) + BatchNorm2d + ReLU
#       - Layer 4: ConvTranspose2d(512, 128, 4, 2, 1) + BatchNorm2d + ReLU
#       - Layer 5: ConvTranspose2d(256, 64, 4, 2, 1) + BatchNorm2d + ReLU
#       - Final: ConvTranspose2d(128, 3, 4, 2, 1) + Tanh
#
# TODO: In forward(self, x):
#       - Pass through encoder, save intermediate outputs
#       - Pass through decoder, concatenating skip connections
#       - Return final output
#
# TODO: Initialize generator and print parameter count
# TODO: Test with random input to verify output shape


## 6️⃣ Discriminator Network (PatchGAN)

**Task**: Implement a PatchGAN discriminator that classifies image patches as real/fake.

**Requirements**:
- Accept concatenated input (edge + shoe = 6 channels)
- Use strided convolutions for downsampling
- Output a patch-wise classification matrix (not a single value)
- Use LeakyReLU activations
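
A minimal sketch of a PatchGAN discriminator that follows the layer list in the cell below is given here. The LeakyReLU slope of 0.2 and the omission of the convolution bias before BatchNorm are assumptions of this sketch.

```python
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    def __init__(self, in_channels=6):  # 3 edge channels + 3 shoe channels
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, 2, 1),
            nn.LeakyReLU(0.2),  # slope 0.2 is an assumption
            nn.Conv2d(64, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2),
            nn.Conv2d(256, 512, 4, 1, 1, bias=False),  # stride 1 from here on
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2),
            # No activation: raw logits, intended for nn.BCEWithLogitsLoss.
            nn.Conv2d(512, 1, 4, 1, 1),
        )

    def forward(self, img_A, img_B):
        # Condition on the input by stacking both images along the channel axis.
        return self.model(torch.cat([img_A, img_B], dim=1))


discriminator = Discriminator()
print("Discriminator parameters:",
      sum(p.numel() for p in discriminator.parameters()))
with torch.no_grad():
    patches = discriminator(torch.randn(2, 3, 128, 128), torch.randn(2, 3, 128, 128))
print(patches.shape)
```

With 128x128 inputs, the three stride-2 layers reduce the size to 16x16 and the two stride-1 layers with kernel 4 and padding 1 each remove one more pixel per side, giving a 14x14 map of logits, one per overlapping image patch.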

In [None]:
# TODO: Create Discriminator class inheriting from nn.Module
# TODO: In __init__(self, in_channels=6):  # 3 for edge + 3 for shoe
#       Build discriminator layers:
#       - Layer 1: Conv2d(6, 64, 4, 2, 1) + LeakyReLU (no BatchNorm)
#       - Layer 2: Conv2d(64, 128, 4, 2, 1) + BatchNorm2d + LeakyReLU
#       - Layer 3: Conv2d(128, 256, 4, 2, 1) + BatchNorm2d + LeakyReLU  
#       - Layer 4: Conv2d(256, 512, 4, 1, 1) + BatchNorm2d + LeakyReLU
#       - Final: Conv2d(512, 1, 4, 1, 1) (no activation)
#
# TODO: In forward(self, img_A, img_B):
#       - Concatenate img_A and img_B along channel dimension
#       - Pass through discriminator layers
#       - Return patch predictions
#
# TODO: Initialize discriminator and print parameter count  
# TODO: Test with random inputs to verify output shape

## 7️⃣ Loss Functions and Optimizers

**Task**: Set up loss functions and optimizers for GAN training.

**Requirements**:
- Use appropriate loss functions for adversarial and reconstruction objectives
- Initialize optimizers with given hyperparameters
- Implement weight initialization for stable training
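
The setup described above might look like the sketch below. A tiny `toy` network stands in for the generator and discriminator so the block runs on its own. Using `isinstance` checks to select layers is a choice made here, and only the convolution weights (not biases) are re-initialized, following the TODO list.

```python
import torch
import torch.nn as nn

LEARNING_RATE, BETA1, BETA2 = 0.0002, 0.5, 0.999

# Adversarial loss on raw logits, and L1 reconstruction loss.
criterion_GAN = nn.BCEWithLogitsLoss()
criterion_L1 = nn.L1Loss()


def weights_init(m):
    """Initialize conv weights from N(0, 0.02) and BatchNorm weights from N(1, 0.02)."""
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, mean=1.0, std=0.02)
        nn.init.constant_(m.bias, 0.0)


# Hypothetical stand-in model; the notebook would use the generator and discriminator.
toy = nn.Sequential(nn.Conv2d(3, 8, 4, 2, 1), nn.BatchNorm2d(8), nn.LeakyReLU(0.2))
toy.apply(weights_init)  # .apply visits every submodule recursively

optimizer = torch.optim.Adam(
    toy.parameters(), lr=LEARNING_RATE, betas=(BETA1, BETA2)
)
print(optimizer)
```

Each network gets its own optimizer built the same way, so the generator and discriminator can be updated independently.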

In [None]:
# TODO: Define loss functions:
#       - criterion_GAN = nn.BCEWithLogitsLoss() for adversarial loss
#       - criterion_L1 = nn.L1Loss() for reconstruction loss

# TODO: Create optimizers:
#       - optimizer_G for generator with Adam(lr=LEARNING_RATE, betas=(BETA1, BETA2))
#       - optimizer_D for discriminator with Adam(lr=LEARNING_RATE, betas=(BETA1, BETA2))

# TODO: Implement weights_init(m) function:
#       - For Conv layers: init with normal_(mean=0.0, std=0.02)
#       - For BatchNorm layers: weight normal_(1.0, 0.02), bias constant_(0)

# TODO: Apply weights_init to both generator and discriminator

## 8️⃣ Training Loop

**Task**: Implement the main GAN training loop with alternating updates.

**Requirements**:
- Train generator to fool discriminator and match target images
- Train discriminator to distinguish real from generated images
- Balance adversarial loss with L1 reconstruction loss
- Track and display training progress
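
A single alternating update could be organized as in the sketch below, which follows the generator-then-discriminator order used in the cell that follows. The function name `train_step` and the tiny stand-in networks `tiny_G` and `TinyD` are hypothetical and exist only to make the block runnable.

```python
import torch
import torch.nn as nn

LAMBDA_L1 = 100
criterion_GAN = nn.BCEWithLogitsLoss()
criterion_L1 = nn.L1Loss()


def train_step(generator, discriminator, optimizer_G, optimizer_D,
               edge_imgs, real_shoes):
    """Run one generator update followed by one discriminator update."""
    # ---- Generator: fool the discriminator and stay close to the target ----
    optimizer_G.zero_grad()
    fake_shoes = generator(edge_imgs)
    pred_fake = discriminator(edge_imgs, fake_shoes)
    # Labels must match the patch-shaped output, hence ones_like.
    loss_adv = criterion_GAN(pred_fake, torch.ones_like(pred_fake))
    loss_l1 = criterion_L1(fake_shoes, real_shoes)
    loss_G = loss_adv + LAMBDA_L1 * loss_l1
    loss_G.backward()
    optimizer_G.step()

    # ---- Discriminator: real pairs -> 1, generated pairs -> 0 ----
    # zero_grad also clears the gradients the generator step left on D.
    optimizer_D.zero_grad()
    pred_real = discriminator(edge_imgs, real_shoes)
    # detach() stops gradients from flowing back into the generator.
    pred_fake = discriminator(edge_imgs, fake_shoes.detach())
    loss_real = criterion_GAN(pred_real, torch.ones_like(pred_real))
    loss_fake = criterion_GAN(pred_fake, torch.zeros_like(pred_fake))
    loss_D = (loss_real + loss_fake) / 2
    loss_D.backward()
    optimizer_D.step()
    return loss_G.item(), loss_D.item()


class TinyD(nn.Module):
    """Hypothetical one-layer stand-in for the PatchGAN discriminator."""

    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 1, 4, 2, 1)

    def forward(self, img_A, img_B):
        return self.net(torch.cat([img_A, img_B], dim=1))


# Hypothetical stand-in generator; outputs lie in [-1, 1] like the real one.
tiny_G = nn.Sequential(nn.Conv2d(3, 3, 3, 1, 1), nn.Tanh())
tiny_D = TinyD()
opt_G = torch.optim.Adam(tiny_G.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(tiny_D.parameters(), lr=0.0002, betas=(0.5, 0.999))

edges = torch.rand(4, 3, 16, 16) * 2 - 1
shoes = torch.rand(4, 3, 16, 16) * 2 - 1
g_loss, d_loss = train_step(tiny_G, tiny_D, opt_G, opt_D, edges, shoes)
print(f"G loss: {g_loss:.4f}  D loss: {d_loss:.4f}")
```

In a full loop, this function would be called once per batch inside the epoch loop, with the returned losses appended to lists for later plotting.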

In [None]:
# TODO: Create training loop for NUM_EPOCHS:
#       
#       For each batch in train_loader:
#       - Move edge_imgs and real_shoes to device
#       - Get batch_size
#       
#       Train Generator:
#       - Generate fake_shoes from edge_imgs
#       - Get discriminator prediction on (edge_imgs, fake_shoes)
#       - Calculate adversarial loss (try to fool discriminator)
#       - Calculate L1 loss between fake_shoes and real_shoes  
#       - Total loss = adversarial_loss + LAMBDA_L1 * L1_loss
#       - Zero generator gradients, backpropagate, and update generator
#       
#       Train Discriminator:
#       - Get prediction on real pair (edge_imgs, real_shoes)
#       - Get prediction on fake pair (edge_imgs, fake_shoes.detach())
#       - Calculate loss for real (should predict 1) and fake (should predict 0)
#       - Total loss = (real_loss + fake_loss) / 2
#       - Zero discriminator gradients (the generator step also left gradients on D),
#         backpropagate, and update discriminator
#       
# TODO: Create real (1) and fake (0) label tensors that match the shape of the discriminator's patch output
# TODO: Track losses and display training progress


## 9️⃣ Evaluation and Visualization

**Task**: Evaluate your trained model and visualize results.

**Requirements**:
- Generate shoes from validation edge images
- Compare with ground truth shoes
- Create side-by-side visualizations


In [None]:
# TODO: Set models to evaluation mode
# TODO: Create function to denormalize images from [-1,1] to [0,1]
# TODO: Create visualization function that:
#       - Takes several validation samples
#       - Generates fake shoes using trained generator
#       - Displays edge input, generated output, and real target
#       - Shows results in a grid format (3 columns: edge, generated, real)
# TODO: Display results for 10 validation samples
# TODO: Plot training loss curves for generator and discriminator

## 🔟 Analysis

**Task**: Analyze your results.

**Requirements**:
- Evaluate the quality of generated images
- Discuss strengths and limitations of your model
- Test the effect of different hyperparameters (optional)


In [None]:
# TODO: Analyze the quality of generated shoes, for example: 
#       - Are edges properly converted to realistic shoes?
#       - Do shoes maintain the shape from edge inputs?
#       - How realistic do the textures and colors look?
# TODO: Document any interesting observations or failure cases


## 📝 Evaluation Criteria

Your homework will be evaluated based on:

1. **Implementation Correctness (40%)**
   - Proper U-Net generator implementation
   - Correct PatchGAN discriminator
   - Working training loop with appropriate losses

2. **Training and Results (30%)**
   - Model trains without errors
   - Reasonable loss convergence
   - Generated images show edge-to-shoe translation

3. **Code Quality (20%)**
   - Clean, readable code with comments
   - Proper tensor shapes and data flow
   - Efficient implementation

4. **Analysis (10%)**
   - Discussion of results
   - Understanding of model behavior
   - Insights about GAN training