***Rhea Chainani | 22070126086 | AIML B1***

## Installing Dependencies

In [15]:
pip install torch torchvision torchsummary medmnist tensorboard tqdm --quiet

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m87.2/87.2 kB[0m [31m2.7 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
  Building wheel for fire (setup.py) ... [?25l[?25hdone
Note: you may need to restart the kernel to use updated packages.


In [16]:
pip install torch-fidelity --quiet

Note: you may need to restart the kernel to use updated packages.


In [17]:
pip install torchmetrics[image] --quiet

Note: you may need to restart the kernel to use updated packages.


## Import Libraries

In [18]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.utils as vutils
import medmnist
from medmnist import BreastMNIST
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import numpy as np
import os
from tqdm import tqdm

In [19]:
from torchmetrics.image.inception import InceptionScore
from torchmetrics.image.fid import FrechetInceptionDistance

In [20]:
from torch.cuda.amp import autocast, GradScaler

In [21]:
# Checking if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


## Load MedMNIST Dataset

**BreastMNIST** is a medical imaging dataset from the **MedMNIST** collection, consisting of **28×28 grayscale images** derived from breast ultrasound scans. It includes a total of **780 images**, categorized into **two classes**—benign and malignant—making it suitable for binary classification tasks. The dataset is split into training (546 images), validation (78 images), and test (156 images) sets. BreastMNIST serves as a benchmark for developing and evaluating deep learning models in breast cancer detection, aiding in automated diagnostics and medical image analysis research.

In [22]:
# Define dataset transformation
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # Normalize images to [-1, 1] for better GAN performance
])

In [23]:
# Load the MedMNIST dataset (BreastMNIST)
root_dir = "/kaggle/working/medmnist_data"
os.makedirs(root_dir, exist_ok=True)

dataset = BreastMNIST(root=root_dir, split="train", transform=transform, download=True, as_rgb=False)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

print("Dataset loaded successfully!")

Downloading https://zenodo.org/records/10519652/files/breastmnist.npz?download=1 to /kaggle/working/medmnist_data/breastmnist.npz


100%|██████████| 560k/560k [00:00<00:00, 719kB/s] 

Dataset loaded successfully!





## Define Generator & Discriminator

In [24]:
class Generator(nn.Module):
    """
    Generator Network for GANs.
    Generates synthetic images from random noise (latent vector).
    """
    def __init__(self, latent_dim=100, img_shape=(1, 28, 28)):
        super(Generator, self).__init__()
        self.img_shape = img_shape
        
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(),
            nn.Linear(512, int(np.prod(img_shape))),  # Output flattened image
            nn.Tanh()  # Normalize output to [-1, 1]
        )

    def forward(self, z):
        """
        Forward pass: Generates an image from latent vector z.
        """
        img = self.model(z)
        img = img.view(img.size(0), *self.img_shape)  # Reshape to image format
        return img

In [25]:
class Discriminator(nn.Module):
    """
    Discriminator Network for GANs.
    Determines whether an image is real or fake.
    """
    def __init__(self, img_shape=(1, 28, 28)):
        super(Discriminator, self).__init__()
        
        self.model = nn.Sequential(
            nn.Linear(int(np.prod(img_shape)), 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1)  # Output single score (real or fake)
        )

    def forward(self, img):
        """
        Forward pass: Classifies an image as real or fake.
        """
        img_flat = img.view(img.size(0), -1)  # Flatten image
        return self.model(img_flat)

In [26]:
def save_generated_images(generator, epoch, latent_dim=100, num_images=16, gan_type="WGAN", base_folder="generated_images"):
    folder = os.path.join(base_folder, gan_type)  # Separate folders per GAN type
    os.makedirs(folder, exist_ok=True)  

    z = torch.randn(num_images, latent_dim).to(device)
    with torch.no_grad():
        fake_imgs = generator(z)

    image_path = f"{folder}/epoch_{epoch}.png"
    vutils.save_image(fake_imgs, image_path, normalize=True)
    
    print(f"Saved: {image_path}")

In [27]:
def evaluate_performance(generator, inception, fid, latent_dim, num_images=100):
    generator.eval()
    with torch.no_grad():
        z = torch.randn(num_images, latent_dim).to(next(generator.parameters()).device)
        fake_imgs = generator(z)
        
    fake_imgs = (fake_imgs + 1) * 127.5  # Convert from [-1,1] to [0,255]
    fake_imgs = fake_imgs.clamp(0, 255).to(torch.uint8)  # Ensure values are within 0-255 and convert to uint8
    
    inception.update(fake_imgs.repeat(1, 3, 1, 1))
    fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
    generator.train()

## Define Training Functions

### LS-GAN Training

LSGAN replaces the binary cross-entropy loss used in standard GANs with a **least squares loss function** to stabilize training. By minimizing the **mean squared error** between real and fake labels, LSGAN reduces mode collapse and produces smoother gradients, leading to more realistic image generation.
But it is still susceptible to mode collapse and vanishing gradients.

In [14]:
def train_ls_gan(generator, discriminator, dataloader, epochs=50, latent_dim=100):
    """
    Trains a Least Squares GAN (LS-GAN).
    """
    writer = SummaryWriter(log_dir="./logs/LSGAN")
    criterion = nn.MSELoss()  # Least Squares loss
    optimizer_G = optim.Adam(generator.parameters(), lr=0.0002)
    optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0002)

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)

    for epoch in range(epochs):
        fid.reset()  # Reset FID at the beginning of each epoch

        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)
            
            # Train Discriminator
            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            real_preds = discriminator(imgs)
            fake_preds = discriminator(fake_imgs)

            real_loss = criterion(real_preds, torch.ones_like(real_preds))
            fake_loss = criterion(fake_preds, torch.zeros_like(fake_preds))
            
            d_loss = 0.5 * (real_loss + fake_loss)
            d_loss.backward()
            optimizer_D.step()

            # Train Generator
            optimizer_G.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            gen_imgs = generator(z)
            
            g_loss = criterion(discriminator(gen_imgs), torch.ones_like(discriminator(gen_imgs)))
            g_loss.backward()
            optimizer_G.step()

            # Update FID with real & fake images
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)  # Convert grayscale to 3-channel
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)

        # Compute FID only if enough samples exist
        if fid.real_features_num_samples > 1 and fid.fake_features_num_samples > 1:
            fid_score = fid.compute().item()
        else:
            fid_score = float("inf")  # Placeholder until enough samples accumulate

        if epoch % 5 == 0:
            save_generated_images(generator, epoch, latent_dim, gan_type="LSGAN")
            evaluate_performance(generator, inception, fid, latent_dim)

            # Generate sample images for TensorBoard
            z = torch.randn(16, latent_dim).to(device)  
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)

            inception_score = inception.compute()[0].item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)

        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)

        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f} | FID: {fid_score:.3f}")

### WGAN Training

WGAN improves training stability by using the **Wasserstein distance (Earth Mover’s distance)** instead of the Jensen-Shannon divergence. It replaces the discriminator with a **critic**, which does not classify but rather **scores** real and fake samples. **Weight clipping** is applied to enforce the Lipschitz constraint. But one of its disadvantages is that weight clipping can limit learning capacity.



In [28]:
def train_wgan(generator, discriminator, dataloader, epochs=50, latent_dim=100, c=0.01):
    """
    Trains a Wasserstein GAN (WGAN) with weight clipping.
    """
    writer = SummaryWriter(log_dir="./logs/WGAN")
    optimizer_G = optim.RMSprop(generator.parameters(), lr=0.00005)
    optimizer_D = optim.RMSprop(discriminator.parameters(), lr=0.00005)

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)
    
    for epoch in range(epochs):
        fid.reset()
        
        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)

            # Train Discriminator
            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            d_loss = -(discriminator(imgs).mean() - discriminator(fake_imgs).mean())
            d_loss.backward()
            optimizer_D.step()

            # Apply weight clipping
            for p in discriminator.parameters():
                p.data.clamp_(-c, c)

            # Update FID
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
        
            # Train Generator (every 5 steps)
            if i % 5 == 0:
                optimizer_G.zero_grad()
                z = torch.randn(imgs.size(0), latent_dim).to(device)
                g_loss = -discriminator(generator(z)).mean()
                g_loss.backward()
                optimizer_G.step()

        if epoch % 5 == 0:
            save_generated_images(generator, epoch, latent_dim)
            evaluate_performance(generator, inception, fid, latent_dim)
            inception_score = inception.compute()[0].item()
            fid_score = fid.compute().item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)
            # Generate sample images for TensorBoard
            z = torch.randn(16, latent_dim).to(device)  # Generate 16 images
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)
        
        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)
        
        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f} | FID: {fid_score:.3f}")

### WGAN-GP Training

WGAN-GP enhances WGAN by replacing weight clipping with a **gradient penalty**, enforcing the Lipschitz constraint more effectively. This eliminates the need for manual weight clipping and results in smoother and more realistic generated samples. But it incurrs a higher computational cost due to gradient penalty computation.

In [29]:
def compute_gradient_penalty(D, real_samples, fake_samples):
    """
    Computes the gradient penalty for WGAN-GP.
    """
    alpha = torch.rand(real_samples.size(0), 1, 1, 1).to(real_samples.device)
    interpolates = (alpha * real_samples + (1 - alpha) * fake_samples).requires_grad_(True)
    d_interpolates = D(interpolates)
    
    gradients = torch.autograd.grad(
        outputs=d_interpolates,
        inputs=interpolates,
        grad_outputs=torch.ones_like(d_interpolates),
        create_graph=True,
        retain_graph=True
    )[0]
    gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean()
    
    return gradient_penalty

In [30]:
def train_wgan_gp(generator, discriminator, dataloader, epochs=50, latent_dim=100, lambda_gp=10):
    """
    Trains a Wasserstein GAN with Gradient Penalty (WGAN-GP).
    """
    writer = SummaryWriter(log_dir="./logs/WGAN-GP")
    optimizer_G = optim.Adam(generator.parameters(), lr=0.0001, betas=(0.5, 0.9))
    optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0001, betas=(0.5, 0.9))

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)

    for epoch in range(epochs):
        fid.reset()
        
        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)

            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            gp = compute_gradient_penalty(discriminator, imgs, fake_imgs)
            d_loss = -(discriminator(imgs).mean() - discriminator(fake_imgs).mean()) + lambda_gp * gp
            d_loss.backward()
            optimizer_D.step()

            # Update FID
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
        
            optimizer_G.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            g_loss = -discriminator(generator(z)).mean()
            g_loss.backward()
            optimizer_G.step()

        if epoch % 3 == 0:
            save_generated_images(generator, epoch, latent_dim, gan_type="WGAN-GP")
            evaluate_performance(generator, inception, fid, latent_dim)
            inception_score = inception.compute()[0].item()
            fid_score = fid.compute().item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)
            # Generate sample images for TensorBoard
            z = torch.randn(16, latent_dim).to(device)  # Generate 16 images
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)
    
        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)
        
        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f}  | FID: {fid_score:.3f}")

#### Train all three GANs

In [31]:
generator = Generator().to(device)
discriminator = Discriminator().to(device)

In [32]:
import shutil

def reset_generated_images(folder="generated_images"):
    """Deletes the folder if it exists, then recreates it."""
    if os.path.exists(folder):
        shutil.rmtree(folder)  # Delete the folder
    os.makedirs(folder, exist_ok=True)  # Recreate it
    print(f"Reset {folder} before training.")

reset_generated_images()

Reset generated_images before training.


In [20]:
train_ls_gan(generator, discriminator, dataloader)

Downloading: "https://github.com/toshas/torch-fidelity/releases/download/v0.2.0/weights-inception-2015-12-05-6726825d.pth" to /root/.cache/torch/hub/checkpoints/weights-inception-2015-12-05-6726825d.pth
100%|██████████| 91.2M/91.2M [00:03<00:00, 24.9MB/s]


Saved: generated_images/LSGAN/epoch_0.png
Epoch 1/50 | D Loss: 0.2505 | G Loss: 0.3445 | FID: 423.612
Epoch 2/50 | D Loss: 0.0885 | G Loss: 1.0771 | FID: 440.188
Epoch 3/50 | D Loss: 0.0761 | G Loss: 0.6992 | FID: 442.460
Epoch 4/50 | D Loss: 0.1582 | G Loss: 0.7380 | FID: 420.051
Epoch 5/50 | D Loss: 0.1090 | G Loss: 1.0785 | FID: 398.543
Saved: generated_images/LSGAN/epoch_5.png
Epoch 6/50 | D Loss: 0.0638 | G Loss: 1.7692 | FID: 387.249
Epoch 7/50 | D Loss: 0.0534 | G Loss: 2.1980 | FID: 383.999
Epoch 8/50 | D Loss: 0.0645 | G Loss: 2.2805 | FID: 390.629
Epoch 9/50 | D Loss: 0.0371 | G Loss: 2.0461 | FID: 389.741
Epoch 10/50 | D Loss: 0.0303 | G Loss: 1.8922 | FID: 392.449
Saved: generated_images/LSGAN/epoch_10.png
Epoch 11/50 | D Loss: 0.0356 | G Loss: 1.7393 | FID: 384.502
Epoch 12/50 | D Loss: 0.0445 | G Loss: 1.7926 | FID: 398.521
Epoch 13/50 | D Loss: 0.0162 | G Loss: 1.7767 | FID: 416.562
Epoch 14/50 | D Loss: 0.0226 | G Loss: 1.6892 | FID: 416.684
Epoch 15/50 | D Loss: 0.0253

In [22]:
folder_name = "logs/LSGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [24]:
folder_name = "generated_images/LSGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [26]:
log_dir = '/kaggle/working/logs/LSGAN'
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743242983.efab702df5ab.31.0']


**Generated Images**

![image.png](attachment:f4cd22aa-71c2-41d5-96d0-9ec3a1f37ef0.png)

**Tensorboard Charts**

![image.png](attachment:5211ce54-ca30-48dc-8a30-455ec1b29125.png)

![image.png](attachment:d520f4b3-48a5-42d3-8a13-cb471c55c7ea.png)

![image.png](attachment:107aa543-3d3e-4961-9636-e2699d5f4bbb.png)

![image.png](attachment:51b835f0-760a-4c1f-a1df-f775d26f3f64.png)

In [27]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
import os

# Path to logs folder
log_dir = "/kaggle/working/logs/LSGAN"

# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 430.6221008300781
Final IS: 1.8236175775527954


A lower FID is better (ideal: ≤10 for high-quality images). Here, we have 430.62, which is quite high, meaning the generated images are **significantly different from real images** in terms of distribution.
This suggests that the LSGAN may still be generating **low-quality or unrealistic** images, and there might be **mode collapse or instability**.

A higher IS (ideal: >8 for high-quality images) means images are more diverse and contain meaningful objects.
1.82 is very low, indicating that the images **lack diversity** and do **not have clearly recognizable features**. This suggests that the generator is struggling to produce distinct, high-quality outputs.

Possible reasons include training instability, mode collapse, insufficient training or architecture issues.

Steps to improve include training for more epochs, tweaking the learning rate or using a better architecture (such as WGAN-GP or StyleGAN).

In [34]:
train_wgan(generator, discriminator, dataloader)

Saved: generated_images/WGAN/epoch_0.png
Epoch 1/50 | D Loss: -0.3487 | G Loss: -0.2400 | FID: 382.468
Epoch 2/50 | D Loss: -0.1594 | G Loss: -0.2389 | FID: 382.468
Epoch 3/50 | D Loss: -0.2809 | G Loss: 0.2251 | FID: 382.468
Epoch 4/50 | D Loss: -0.2920 | G Loss: 0.2242 | FID: 382.468
Epoch 5/50 | D Loss: -0.1272 | G Loss: 0.0196 | FID: 382.468
Saved: generated_images/WGAN/epoch_5.png
Epoch 6/50 | D Loss: -0.2335 | G Loss: -0.2006 | FID: 375.974
Epoch 7/50 | D Loss: -0.2158 | G Loss: -0.5091 | FID: 375.974
Epoch 8/50 | D Loss: -0.0822 | G Loss: -0.6787 | FID: 375.974
Epoch 9/50 | D Loss: -0.1001 | G Loss: -0.5647 | FID: 375.974
Epoch 10/50 | D Loss: -0.1226 | G Loss: -0.3179 | FID: 375.974
Saved: generated_images/WGAN/epoch_10.png
Epoch 11/50 | D Loss: -0.0276 | G Loss: -0.2159 | FID: 376.241
Epoch 12/50 | D Loss: 0.0159 | G Loss: -0.0896 | FID: 376.241
Epoch 13/50 | D Loss: 0.0111 | G Loss: -0.0880 | FID: 376.241
Epoch 14/50 | D Loss: -0.0742 | G Loss: -0.2719 | FID: 376.241
Epoch 15

In [35]:
folder_name = "logs/WGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [36]:
folder_name = "generated_images/WGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [37]:
log_dir = "/kaggle/working/logs/WGAN"
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743245149.bae3b12f71b8.31.0', 'events.out.tfevents.1743245276.bae3b12f71b8.31.1']


**Generated Images**

![image.png](attachment:b0632f99-3db0-4549-be27-c8c0c2e56ebc.png)

**Tensorboard Charts**

![image.png](attachment:4bd726d6-f078-4296-b4d7-08fcd5fc214f.png)

![image.png](attachment:91089c61-f9ae-4319-a5f4-76583646f889.png)

![image.png](attachment:95b3341a-0f5f-4d91-8718-0c845e6f959e.png)

![image.png](attachment:c423436a-fa1b-4ad6-acac-459d42fd87e0.png)

In [40]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

In [41]:
# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 370.5666809082031
Final IS: 1.4049781560897827


The **FID score of 370.56** suggests that the generated images are far from real, while the **Inception Score of 1.4** indicates low diversity and quality. But we could notice the **reducing FID** after each epoch, indicating the model's improvement. WGAN may need more training, better hyperparameters, or a stronger generator to improve results. Issues like weight clipping limitations or an overly strong discriminator could be affecting performance. Switching to **WGAN-GP**, increasing training epochs, or adjusting model architecture may help enhance image quality.

In [38]:
torch.save(generator.state_dict(), "generator_wgan.pth")
torch.save(discriminator.state_dict(), "discriminator_wgan.pth")

In [42]:
train_wgan_gp(generator, discriminator, dataloader)

Saved: generated_images/WGAN-GP/epoch_0.png
Epoch 1/50 | D Loss: 9.9720 | G Loss: 0.0531  | FID: 371.776
Epoch 2/50 | D Loss: 9.8634 | G Loss: -0.2506  | FID: 371.776
Epoch 3/50 | D Loss: 9.8475 | G Loss: -0.0250  | FID: 371.776
Saved: generated_images/WGAN-GP/epoch_3.png
Epoch 4/50 | D Loss: 9.9479 | G Loss: 0.1988  | FID: 378.573
Epoch 5/50 | D Loss: 9.8578 | G Loss: -0.4671  | FID: 378.573
Epoch 6/50 | D Loss: 9.6163 | G Loss: -1.3916  | FID: 378.573
Saved: generated_images/WGAN-GP/epoch_6.png
Epoch 7/50 | D Loss: 10.0737 | G Loss: -2.1481  | FID: 359.113
Epoch 8/50 | D Loss: 10.1458 | G Loss: -1.3162  | FID: 359.113
Epoch 9/50 | D Loss: 9.7417 | G Loss: 0.4104  | FID: 359.113
Saved: generated_images/WGAN-GP/epoch_9.png
Epoch 10/50 | D Loss: 9.5206 | G Loss: 0.9009  | FID: 351.906
Epoch 11/50 | D Loss: 9.6341 | G Loss: 1.1553  | FID: 351.906
Epoch 12/50 | D Loss: 10.0676 | G Loss: 0.3786  | FID: 351.906
Saved: generated_images/WGAN-GP/epoch_12.png
Epoch 13/50 | D Loss: 9.5248 | G Lo

In [43]:
import shutil
folder_name = "logs/WGAN-GP"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [44]:
folder_name = "generated_images/WGAN-GP"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [45]:
log_dir = "/kaggle/working/logs/WGAN-GP"
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743245544.bae3b12f71b8.31.2']


**Generated Images**

![image.png](attachment:bfb918bc-a963-40cd-9b37-bcde37ae4948.png)

**Tensorboard Charts**

![image.png](attachment:9a30071a-3438-4433-b8cf-55a3282aff7c.png)

![image.png](attachment:83473b14-48eb-4092-b758-f5eb67bf483f.png)

![image.png](attachment:4be703df-0229-41bd-826d-069bc6573efd.png)

![image.png](attachment:81641b28-25a8-49de-8aeb-a32892679dff.png)

In [46]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

In [47]:
# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 379.4053039550781
Final IS: 1.5786017179489136


The final **FID of 379.41** and **Inception Score (IS) of 1.57** suggest that the generated images exhibit **moderate realism** but still **lack diversity and structural fidelity**. The relatively high FID indicates a noticeable gap between generated and real distributions, implying that the model struggles with fine-grained details. The low IS reflects limited sample diversity, suggesting mode collapse or insufficient feature variation. Refining training stability, adjusting hyperparameters, or employing architectural improvements like spectral normalization or self-attention could enhance generation quality.

In [48]:
torch.save(generator.state_dict(), "generator_wgan_gp.pth")
torch.save(discriminator.state_dict(), "discriminator_wgan_gp.pth")

**Based on the FID and IS, we can conclude that, compared to LSGAN, WGAN and WGAN-GP generated more realistic and diverse images, while having the same architecture and being trained for the same number of epochs.**