***Rhea Chainani | 22070126086 | AIML B1***

## Installing Dependencies

In [1]:
pip install torch torchvision torchsummary medmnist tensorboard tqdm --quiet

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m87.2/87.2 kB[0m [31m2.4 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
  Building wheel for fire (setup.py) ... [?25l[?25hdone
Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install torch-fidelity --quiet

Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install torchmetrics[image] --quiet

Note: you may need to restart the kernel to use updated packages.


## Import Libraries

In [4]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.utils as vutils
import medmnist
from medmnist import OCTMNIST
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
import numpy as np
import os
from tqdm import tqdm

In [5]:
from torchmetrics.image.inception import InceptionScore
from torchmetrics.image.fid import FrechetInceptionDistance

In [6]:
from torch.cuda.amp import autocast, GradScaler

In [7]:
# Checking if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


## Load MedMNIST Dataset

**OCTMNIST** is a large-scale medical imaging dataset containing **109,309** Optical Coherence Tomography (OCT) scans, categorized into **four classes**: Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), Drusen, and Normal. The dataset, derived from real-world ophthalmic scans, is split into training (97,477 samples), validation 10,832 samples), and test (1,000 samples), with grayscale images standardized to **28×28 pixels** for efficient processing. Designed for medical AI research, OCTMNIST serves as a benchmark for deep learning models in **disease classification, anomaly detection, and automated diagnostics**, playing a crucial role in advancing ophthalmic healthcare.

In [8]:
# Define dataset transformation
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # Normalize images to [-1, 1] for better GAN performance
])

In [9]:
# Load the MedMNIST dataset (BreastMNIST)
root_dir = "/kaggle/working/medmnist_data"
os.makedirs(root_dir, exist_ok=True)

dataset = OCTMNIST(root=root_dir, split="train", transform=transform, download=True, as_rgb=False)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

print("Dataset loaded successfully!")

Downloading https://zenodo.org/records/10519652/files/octmnist.npz?download=1 to /kaggle/working/medmnist_data/octmnist.npz


100%|██████████| 54.9M/54.9M [00:03<00:00, 13.9MB/s]


Dataset loaded successfully!


## Define Generator & Discriminator

In [11]:
class Generator(nn.Module):
    """
    Generator Network for GANs.
    Generates synthetic images from random noise (latent vector).
    """
    def __init__(self, latent_dim=100, img_shape=(1, 28, 28)):
        super(Generator, self).__init__()
        self.img_shape = img_shape
        
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(),
            nn.Linear(512, int(np.prod(img_shape))),  # Output flattened image
            nn.Tanh()  # Normalize output to [-1, 1]
        )

    def forward(self, z):
        """
        Forward pass: Generates an image from latent vector z.
        """
        img = self.model(z)
        img = img.view(img.size(0), *self.img_shape)  # Reshape to image format
        return img

In [12]:
class Discriminator(nn.Module):
    """
    Discriminator Network for GANs.
    Determines whether an image is real or fake.
    """
    def __init__(self, img_shape=(1, 28, 28)):
        super(Discriminator, self).__init__()
        
        self.model = nn.Sequential(
            nn.Linear(int(np.prod(img_shape)), 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1)  # Output single score (real or fake)
        )

    def forward(self, img):
        """
        Forward pass: Classifies an image as real or fake.
        """
        img_flat = img.view(img.size(0), -1)  # Flatten image
        return self.model(img_flat)

In [13]:
def save_generated_images(generator, epoch, latent_dim=100, num_images=16, gan_type="WGAN", base_folder="generated_images"):
    folder = os.path.join(base_folder, gan_type)  # Separate folders per GAN type
    os.makedirs(folder, exist_ok=True)  

    z = torch.randn(num_images, latent_dim).to(device)
    with torch.no_grad():
        fake_imgs = generator(z)

    image_path = f"{folder}/epoch_{epoch}.png"
    vutils.save_image(fake_imgs, image_path, normalize=True)
    
    print(f"Saved: {image_path}")

In [14]:
def evaluate_performance(generator, inception, fid, latent_dim, num_images=100):
    generator.eval()
    with torch.no_grad():
        z = torch.randn(num_images, latent_dim).to(next(generator.parameters()).device)
        fake_imgs = generator(z)
        
    fake_imgs = (fake_imgs + 1) * 127.5  # Convert from [-1,1] to [0,255]
    fake_imgs = fake_imgs.clamp(0, 255).to(torch.uint8)  # Ensure values are within 0-255 and convert to uint8
    
    inception.update(fake_imgs.repeat(1, 3, 1, 1))
    fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
    generator.train()

## Define Training Functions

### LS-GAN Training

LSGAN replaces the binary cross-entropy loss used in standard GANs with a **least squares loss function** to stabilize training. By minimizing the **mean squared error** between real and fake labels, LSGAN reduces mode collapse and produces smoother gradients, leading to more realistic image generation.
But it is still susceptible to mode collapse and vanishing gradients.

In [15]:
def train_ls_gan(generator, discriminator, dataloader, epochs=50, latent_dim=100):
    """
    Trains a Least Squares GAN (LS-GAN).
    """
    writer = SummaryWriter(log_dir="./logs/LSGAN")
    criterion = nn.MSELoss()  # Least Squares loss
    optimizer_G = optim.Adam(generator.parameters(), lr=0.0002)
    optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0002)

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)

    for epoch in range(epochs):
        fid.reset()  # Reset FID at the beginning of each epoch

        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)
            
            # Train Discriminator
            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            real_preds = discriminator(imgs)
            fake_preds = discriminator(fake_imgs)

            real_loss = criterion(real_preds, torch.ones_like(real_preds))
            fake_loss = criterion(fake_preds, torch.zeros_like(fake_preds))
            
            d_loss = 0.5 * (real_loss + fake_loss)
            d_loss.backward()
            optimizer_D.step()

            # Train Generator
            optimizer_G.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            gen_imgs = generator(z)
            
            g_loss = criterion(discriminator(gen_imgs), torch.ones_like(discriminator(gen_imgs)))
            g_loss.backward()
            optimizer_G.step()

            # Update FID with real & fake images
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)  # Convert grayscale to 3-channel
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)

        # Compute FID only if enough samples exist
        if fid.real_features_num_samples > 1 and fid.fake_features_num_samples > 1:
            fid_score = fid.compute().item()
        else:
            fid_score = float("inf")  # Placeholder until enough samples accumulate

        if epoch % 5 == 0 or epoch == epochs-1:
            save_generated_images(generator, epoch, latent_dim, gan_type="LSGAN")
            evaluate_performance(generator, inception, fid, latent_dim)

            # Generate sample images for TensorBoard
            z = torch.randn(25, latent_dim).to(device)  
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)

            inception_score = inception.compute()[0].item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)

        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)

        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f} | FID: {fid_score:.3f}")

### WGAN Training

WGAN improves training stability by using the **Wasserstein distance (Earth Mover’s distance)** instead of the Jensen-Shannon divergence. It replaces the discriminator with a **critic**, which does not classify but rather **scores** real and fake samples. **Weight clipping** is applied to enforce the Lipschitz constraint. But one of its disadvantages is that weight clipping can limit learning capacity.



In [16]:
def train_wgan(generator, discriminator, dataloader, epochs=50, latent_dim=100, c=0.01):
    """
    Trains a Wasserstein GAN (WGAN) with weight clipping.
    """
    writer = SummaryWriter(log_dir="./logs/WGAN")
    optimizer_G = optim.RMSprop(generator.parameters(), lr=0.00005)
    optimizer_D = optim.RMSprop(discriminator.parameters(), lr=0.00005)

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)
    
    for epoch in range(epochs):
        fid.reset()
        
        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)

            # Train Discriminator
            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            d_loss = -(discriminator(imgs).mean() - discriminator(fake_imgs).mean())
            d_loss.backward()
            optimizer_D.step()

            # Apply weight clipping
            for p in discriminator.parameters():
                p.data.clamp_(-c, c)

            # Update FID
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
        
            # Train Generator (every 5 steps)
            if i % 5 == 0:
                optimizer_G.zero_grad()
                z = torch.randn(imgs.size(0), latent_dim).to(device)
                g_loss = -discriminator(generator(z)).mean()
                g_loss.backward()
                optimizer_G.step()

        if epoch % 5 == 0 or epoch == epochs-1:
            save_generated_images(generator, epoch, latent_dim)
            evaluate_performance(generator, inception, fid, latent_dim)
            inception_score = inception.compute()[0].item()
            fid_score = fid.compute().item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)
            # Generate sample images for TensorBoard
            z = torch.randn(25, latent_dim).to(device)
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)
        
        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)
        
        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f} | FID: {fid_score:.3f}")

### WGAN-GP Training

WGAN-GP enhances WGAN by replacing weight clipping with a **gradient penalty**, enforcing the Lipschitz constraint more effectively. This eliminates the need for manual weight clipping and results in smoother and more realistic generated samples. But it incurrs a higher computational cost due to gradient penalty computation.

In [17]:
def compute_gradient_penalty(D, real_samples, fake_samples):
    """
    Computes the gradient penalty for WGAN-GP.
    """
    alpha = torch.rand(real_samples.size(0), 1, 1, 1).to(real_samples.device)
    interpolates = (alpha * real_samples + (1 - alpha) * fake_samples).requires_grad_(True)
    d_interpolates = D(interpolates)
    
    gradients = torch.autograd.grad(
        outputs=d_interpolates,
        inputs=interpolates,
        grad_outputs=torch.ones_like(d_interpolates),
        create_graph=True,
        retain_graph=True
    )[0]
    gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean()
    
    return gradient_penalty

In [18]:
def train_wgan_gp(generator, discriminator, dataloader, epochs=50, latent_dim=100, lambda_gp=10):
    """
    Trains a Wasserstein GAN with Gradient Penalty (WGAN-GP).
    """
    writer = SummaryWriter(log_dir="./logs/WGAN-GP")
    optimizer_G = optim.Adam(generator.parameters(), lr=0.0001, betas=(0.5, 0.9))
    optimizer_D = optim.Adam(discriminator.parameters(), lr=0.0001, betas=(0.5, 0.9))

    inception = InceptionScore().to(device)
    fid = FrechetInceptionDistance().to(device)

    for epoch in range(epochs):
        fid.reset()
        
        for i, (imgs, _) in enumerate(dataloader):
            imgs = imgs.to(device)

            optimizer_D.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            fake_imgs = generator(z).detach()
            
            gp = compute_gradient_penalty(discriminator, imgs, fake_imgs)
            d_loss = -(discriminator(imgs).mean() - discriminator(fake_imgs).mean()) + lambda_gp * gp
            d_loss.backward()
            optimizer_D.step()

            # Update FID
            fid.update((imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=True)
            fid.update((fake_imgs.repeat(1, 3, 1, 1) * 255).to(torch.uint8), real=False)
        
            optimizer_G.zero_grad()
            z = torch.randn(imgs.size(0), latent_dim).to(device)
            g_loss = -discriminator(generator(z)).mean()
            g_loss.backward()
            optimizer_G.step()

        if epoch % 5 == 0 or epoch == epochs-1:
            save_generated_images(generator, epoch, latent_dim, gan_type="WGAN-GP")
            evaluate_performance(generator, inception, fid, latent_dim)
            inception_score = inception.compute()[0].item()
            fid_score = fid.compute().item()
            writer.add_scalar("Inception Score", inception_score, epoch)
            writer.add_scalar("FID Score", fid_score, epoch)
            # Generate sample images for TensorBoard
            z = torch.randn(25, latent_dim).to(device) 
            sample_imgs = generator(z).detach().cpu()
            writer.add_images(f"Generated Images", sample_imgs.repeat(1, 3, 1, 1), epoch)
    
        writer.add_scalar("D Loss", d_loss.item(), epoch)
        writer.add_scalar("G Loss", g_loss.item(), epoch)
        
        print(f"Epoch {epoch+1}/{epochs} | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f}  | FID: {fid_score:.3f}")

#### Train all three GANs

In [19]:
generator = Generator().to(device)
discriminator = Discriminator().to(device)

In [35]:
import shutil

def reset_generated_images(folder="generated_images"):
    """Deletes the folder if it exists, then recreates it."""
    if os.path.exists(folder):
        shutil.rmtree(folder)  # Delete the folder
    os.makedirs(folder, exist_ok=True)  # Recreate it
    print(f"Reset {folder} before training.")

reset_generated_images()

Reset generated_images before training.


In [36]:
train_ls_gan(generator, discriminator, dataloader)

Saved: generated_images/LSGAN/epoch_0.png
Epoch 1/50 | D Loss: 0.0018 | G Loss: 1.0274 | FID: 429.284
Epoch 2/50 | D Loss: 0.0017 | G Loss: 1.0360 | FID: 438.066
Epoch 3/50 | D Loss: 0.0010 | G Loss: 1.0093 | FID: 430.696
Epoch 4/50 | D Loss: 0.0026 | G Loss: 1.0710 | FID: 412.544
Epoch 5/50 | D Loss: 0.0140 | G Loss: 1.0661 | FID: 405.591
Saved: generated_images/LSGAN/epoch_5.png
Epoch 6/50 | D Loss: 0.0032 | G Loss: 1.0479 | FID: 401.628
Epoch 7/50 | D Loss: 0.0019 | G Loss: 0.9890 | FID: 384.843
Epoch 8/50 | D Loss: 0.0026 | G Loss: 0.9834 | FID: 376.907
Epoch 9/50 | D Loss: 0.0024 | G Loss: 1.0366 | FID: 355.945
Epoch 10/50 | D Loss: 0.0043 | G Loss: 1.0529 | FID: 372.588
Saved: generated_images/LSGAN/epoch_10.png
Epoch 11/50 | D Loss: 0.0136 | G Loss: 1.1006 | FID: 352.131
Epoch 12/50 | D Loss: 0.0147 | G Loss: 0.9495 | FID: 337.498
Epoch 13/50 | D Loss: 0.0081 | G Loss: 1.0099 | FID: 335.126
Epoch 14/50 | D Loss: 0.0233 | G Loss: 0.9610 | FID: 322.940
Epoch 15/50 | D Loss: 0.0085

In [37]:
folder_name = "logs/LSGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [38]:
folder_name = "generated_images/LSGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [39]:
log_dir = '/kaggle/working/logs/LSGAN'
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743312967.f942ef1cea5a.31.0', 'events.out.tfevents.1743313170.f942ef1cea5a.31.1', 'events.out.tfevents.1743313367.f942ef1cea5a.31.2']


**Generated Images**

![image.png](attachment:c2b306df-eb05-4085-adcb-ff2071ebd41c.png)

**Tensorboard Charts**

![image.png](attachment:cc267a49-3786-45ed-9d4f-822ccc450787.png)

![image.png](attachment:336be6af-7207-4997-ba02-51091b6ebbbd.png)

![image.png](attachment:a0faac3d-e3f5-42b7-91a1-479358ff5b4f.png)

![image.png](attachment:a457c7d4-df3d-44c3-926c-7084fcaa4b14.png)

In [40]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
import os

# Path to logs folder
log_dir = "/kaggle/working/logs/LSGAN"

# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 322.7936096191406
Final IS: 2.030306339263916


A lower FID is better (ideal: ≤10 for high-quality images). Here, we have 322.79, which is quite high, meaning the generated images are **significantly different from real images** in terms of distribution.
This suggests that the LSGAN may still be generating **low-quality or unrealistic** images, and there might be **mode collapse or instability**.

A higher IS (ideal: >8 for high-quality images) means images are more diverse and contain meaningful objects.
2.03 is very low, indicating that the images **lack diversity** and do **not have clearly recognizable features**. This suggests that the generator is struggling to produce distinct, high-quality outputs.

Possible reasons include training instability, mode collapse, insufficient training or architecture issues.

Steps to improve include training for more epochs, tweaking the learning rate or using a better architecture (such as WGAN-GP or StyleGAN).

In [41]:
torch.save(generator.state_dict(), "generator_lsgan_oct.pth")
torch.save(discriminator.state_dict(), "discriminator_lsgan_oct.pth")

In [20]:
train_wgan(generator, discriminator, dataloader)

Downloading: "https://github.com/toshas/torch-fidelity/releases/download/v0.2.0/weights-inception-2015-12-05-6726825d.pth" to /root/.cache/torch/hub/checkpoints/weights-inception-2015-12-05-6726825d.pth
100%|██████████| 91.2M/91.2M [00:00<00:00, 271MB/s]


Saved: generated_images/WGAN/epoch_0.png
Epoch 1/50 | D Loss: -0.0144 | G Loss: -0.3791 | FID: 339.232
Epoch 2/50 | D Loss: -0.4078 | G Loss: 0.2648 | FID: 339.232
Epoch 3/50 | D Loss: -0.8080 | G Loss: 0.1110 | FID: 339.232
Epoch 4/50 | D Loss: -0.4253 | G Loss: -0.2184 | FID: 339.232
Epoch 5/50 | D Loss: -0.1454 | G Loss: -0.9751 | FID: 339.232
Saved: generated_images/WGAN/epoch_5.png
Epoch 6/50 | D Loss: -0.1890 | G Loss: -1.0349 | FID: 289.139
Epoch 7/50 | D Loss: -0.1310 | G Loss: -0.1320 | FID: 289.139
Epoch 8/50 | D Loss: -0.6477 | G Loss: -0.2450 | FID: 289.139
Epoch 9/50 | D Loss: -0.4273 | G Loss: -0.1183 | FID: 289.139
Epoch 10/50 | D Loss: -0.4824 | G Loss: -0.3299 | FID: 289.139
Saved: generated_images/WGAN/epoch_10.png
Epoch 11/50 | D Loss: -0.3240 | G Loss: -0.2695 | FID: 268.945
Epoch 12/50 | D Loss: -0.3473 | G Loss: -0.2722 | FID: 268.945
Epoch 13/50 | D Loss: -0.2455 | G Loss: -0.5886 | FID: 268.945
Epoch 14/50 | D Loss: -0.0294 | G Loss: -0.5596 | FID: 268.945
Epoch

In [22]:
import shutil

In [23]:
folder_name = "logs/WGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [24]:
folder_name = "generated_images/WGAN"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [25]:
log_dir = "/kaggle/working/logs/WGAN"
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743322151.37549b4a5133.31.0']


**Generated Images**

![image.png](attachment:9030616e-f2ef-4917-b979-f96a92afab14.png)

**Tensorboard Charts**

![image.png](attachment:52d6c8b5-83a2-4a73-a0ce-509a6a517f8a.png)

![image.png](attachment:7bc8e735-61a8-43d4-9dea-3127267c7b2c.png)

![image.png](attachment:ce98db96-3dd7-421b-acb6-c4d5f0bb6376.png)

![image.png](attachment:f7f233ce-eea0-4cb5-bfd9-01235afac4e7.png)

In [26]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

In [27]:
# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 231.15093994140625
Final IS: 1.9110716581344604


The **FID score of 231.15** suggests that the generated images are far from real, while the **Inception Score of 1.91** indicates low diversity and quality. But we could notice the **reducing FID** after each epoch, indicating the model's improvement. WGAN may need more training, better hyperparameters, or a stronger generator to improve results. Issues like weight clipping limitations or an overly strong discriminator could be affecting performance. Switching to **WGAN-GP**, increasing training epochs, or adjusting model architecture may help enhance image quality.

In [28]:
torch.save(generator.state_dict(), "generator_wgan_oct.pth")
torch.save(discriminator.state_dict(), "discriminator_wgan_oct.pth")

In [29]:
train_wgan_gp(generator, discriminator, dataloader)

Saved: generated_images/WGAN-GP/epoch_0.png
Epoch 1/50 | D Loss: 9.6384 | G Loss: 6.9591  | FID: 233.323
Epoch 2/50 | D Loss: 6.7523 | G Loss: 10.1603  | FID: 233.323
Epoch 3/50 | D Loss: 12.3391 | G Loss: -7.0281  | FID: 233.323
Epoch 4/50 | D Loss: 7.5336 | G Loss: -12.4987  | FID: 233.323
Epoch 5/50 | D Loss: 10.6248 | G Loss: -1.1303  | FID: 233.323
Saved: generated_images/WGAN-GP/epoch_5.png
Epoch 6/50 | D Loss: 9.3160 | G Loss: -1.2006  | FID: 230.585
Epoch 7/50 | D Loss: 9.4463 | G Loss: -0.5962  | FID: 230.585
Epoch 8/50 | D Loss: 10.0628 | G Loss: -3.9170  | FID: 230.585
Epoch 9/50 | D Loss: 8.5192 | G Loss: 2.6030  | FID: 230.585
Epoch 10/50 | D Loss: 8.5411 | G Loss: 4.1475  | FID: 230.585
Saved: generated_images/WGAN-GP/epoch_10.png
Epoch 11/50 | D Loss: 6.9804 | G Loss: -3.0808  | FID: 237.705
Epoch 12/50 | D Loss: 7.9945 | G Loss: -7.5394  | FID: 237.705
Epoch 13/50 | D Loss: 8.6302 | G Loss: -11.0686  | FID: 237.705
Epoch 14/50 | D Loss: 6.6470 | G Loss: -20.0143  | FID:

In [30]:
import shutil
folder_name = "logs/WGAN-GP"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [31]:
folder_name = "generated_images/WGAN-GP"
zip_name = f"{folder_name}.zip"

shutil.make_archive(zip_name.replace(".zip", ""), 'zip', f"/kaggle/working/{folder_name}")

from IPython.display import FileLink
FileLink(zip_name)

In [32]:
log_dir = "/kaggle/working/logs/WGAN-GP"
print("Log Directory Exists:", os.path.exists(log_dir))
print("Files in Log Directory:", os.listdir(log_dir) if os.path.exists(log_dir) else "No logs found")

Log Directory Exists: True
Files in Log Directory: ['events.out.tfevents.1743325659.37549b4a5133.31.1']


**Generated Images**

![image.png](attachment:9d598534-5857-4e61-95e3-4d317529db82.png)

**Tensorboard Charts**

![image.png](attachment:f12bd854-50ad-45b4-9d22-f633fa63df68.png)

![image.png](attachment:55065f79-522e-4154-93fd-81e602ec5615.png)

![image.png](attachment:3cc895bb-beb9-48eb-a246-c43ad5ccbff4.png)

![image.png](attachment:a5b954b8-9085-4cf9-90ee-84669438405c.png)

In [33]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

In [34]:
# Get the latest event file
event_files = [f for f in os.listdir(log_dir) if "events.out.tfevents" in f]
event_files.sort()
event_file = os.path.join(log_dir, event_files[-1])  # Last event file

# Load TensorBoard logs
event_acc = EventAccumulator(event_file, size_guidance={'scalars': 0})
event_acc.Reload()

# Extract FID and IS values
fid_values = event_acc.Scalars("FID Score") if "FID Score" in event_acc.Tags()['scalars'] else []
is_values = event_acc.Scalars("Inception Score") if "Inception Score" in event_acc.Tags()['scalars'] else []

# Get the final recorded values
final_fid = fid_values[-1].value if fid_values else "Not Found"
final_is = is_values[-1].value if is_values else "Not Found"

print(f"Final FID: {final_fid}")
print(f"Final IS: {final_is}")

Final FID: 227.2685089111328
Final IS: 1.9778584241867065


The final **FID of 227.27** and **Inception Score (IS) of 1.98** suggest that the generated images exhibit **moderate realism** but still **lack diversity and structural fidelity**. The relatively high FID indicates a noticeable gap between generated and real distributions, implying that the model struggles with fine-grained details. The low IS reflects limited sample diversity, suggesting mode collapse or insufficient feature variation. Refining training stability, adjusting hyperparameters, or employing architectural improvements like spectral normalization or self-attention could enhance generation quality.

In [36]:
torch.save(generator.state_dict(), "generator_wgan_gp_oct.pth")
torch.save(discriminator.state_dict(), "discriminator_wgan_gp_oct.pth")

**As WGAN-GP had the lowest FID and highest IS, we can conclude that, compared to LSGAN and WGAN with the same architecture and same number of epochs, it had the most realistic and diverse generated images.**