# **Denoising and Augmenting Medical Images for Better Machine Learning Models**

## Minds behind this notebook
* [Ikjot Singh](https://www.kaggle.com/ikjotsingh221) 
* [Prisha Sawhney](https://www.kaggle.com/prishasawhney)

## **Overview**
Medical imaging is a cornerstone of modern healthcare, aiding in accurate diagnoses and treatment planning. However, one of the biggest challenges in leveraging machine learning for medical applications is the **limited availability of high-quality labeled datasets**. The sensitive nature of patient data and the high cost of annotations make large-scale data collection a formidable task.  

This notebook aims to address this challenge by employing **advanced image processing techniques and Generative Adversarial Networks (GANs)** to:
1. **Denoise medical images** — enhancing their quality for better interpretability and training outcomes.
2. **Augment datasets** — generating synthetic but realistic medical images to overcome the issue of insufficient data.  

## **Key Objectives**
- **Denoising**: Medical images, especially those from modalities like X-rays and MRIs, are often subject to noise due to equipment limitations or environmental factors. This notebook demonstrates how to preprocess these images to improve their usability.  
- **Augmentation**: Using GANs, we generate high-quality synthetic images that mimic the properties of real medical data, providing an effective solution for data scarcity.  

## **Why This Matters**
- **Improved Model Performance**: Denoised and augmented datasets improve the robustness and accuracy of machine learning models, especially in critical tasks like disease diagnosis and severity assessment.  
- **Bridging the Data Gap**: By leveraging GANs for synthetic data generation, this notebook contributes to bridging the gap between the growing demand for AI in medicine and the lack of sufficient data to train these systems.  
- **Scalable Solutions**: The techniques showcased here can be adapted to various types of medical imaging data, making it a versatile approach to data enhancement.  

## **Highlights**
- Use of **autoencoders** for denoising noisy medical images.
- Implementation of **state-of-the-art GANs** to generate synthetic medical images.
- Evaluation of generated data quality using metrics like **PSNR (Peak Signal-to-Noise Ratio)** and **SSIM (Structural Similarity Index)**.
- Seamless integration of denoising and augmentation pipelines for medical datasets.

## **Target Audience**
This notebook is designed for:
- **AI researchers** working in the healthcare domain.
- **Data scientists and ML practitioners** facing challenges with small medical datasets.
- **Healthcare professionals and radiologists** interested in understanding how AI can enhance medical imaging analysis.

---

Follow along as we explore how cutting-edge techniques can transform noisy, limited medical datasets into robust resources for building impactful machine learning models.  


# Phase 1: Denoising the Images

## Step 1: Import all the necessary libraries

In [None]:
# General-purpose libraries
import os
import random
import warnings
import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt
from tqdm import tqdm
import seaborn as sns
from PIL import Image

# Suppress TensorFlow warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
warnings.filterwarnings('ignore')

# scikit-learn for data splitting; scikit-image for PSNR and SSIM metrics
from sklearn.model_selection import train_test_split
from skimage.metrics import peak_signal_noise_ratio as psnr, structural_similarity as ssim

# TensorFlow and Keras for deep learning
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)

from tensorflow.keras.layers import (
    Input, Conv2D, BatchNormalization, MaxPooling2D, UpSampling2D, Add, Dense, Flatten, 
    Reshape, Conv2DTranspose, Activation, LeakyReLU, Dropout, Resizing
)
from tensorflow.keras.models import Model
from tensorflow.keras.applications import VGG16
try:
    from tensorflow.keras.optimizers import Adam
except ImportError:
    from keras.optimizers import Adam

# Keras (standalone) for additional deep learning layers and models
from keras.models import Sequential
from keras.layers import concatenate

# PyTorch for custom neural network models and training
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


## Step 2: Loading the dataset

In [None]:
image_paths = [os.path.join(r'/kaggle/input/medical-image-dataset/Dataset', fname) for fname in os.listdir(r'/kaggle/input/medical-image-dataset/Dataset')]

## Step 3: Preparing the dataset
This section prepares paired clean and noisy images for training and evaluation. The `prepare_dataset` function reads each image path as grayscale, resizes the image to a specified dimension (128x128 by default), and passes it to `add_gaussian_noise` to create its noisy counterpart. Both versions are then normalized to the range [0, 1].

The `add_gaussian_noise` function draws random noise from a Gaussian distribution, scales it by an `alpha` factor, and adds it to the original image. The noisy values are clipped to [0, 255] so they remain within the valid pixel intensity range.

The result is two sets of images:

- Clean images: Original images normalized to [0, 1].
- Noisy images: Corresponding images with added Gaussian noise, also normalized to [0, 1].

Both datasets are returned with an added channel dimension to be compatible with deep learning models.
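
To get a feel for how strong the noise is, the sketch below applies the same recipe (Gaussian noise scaled by `alpha`, then clipped) to a synthetic mid-gray image. The seeded generator `rng` and the constant test image are assumptions made only for this illustration; the notebook itself uses `np.random.normal` on real images.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, an assumption for reproducibility
image = np.full((128, 128), 128.0, dtype=np.float32)  # synthetic mid-gray image

stddev, alpha = 1.0, 5.0  # same values the notebook passes to its noise function
noise = rng.normal(0.0, stddev, image.shape).astype(np.float32)
noisy = np.clip(image + alpha * noise, 0, 255)

# The effective noise standard deviation is alpha * stddev on the 0-255 scale
effective_std = float((noisy - image).std())
print(f"effective std: {effective_std:.2f} gray levels")
print(f"after dividing by 255: {effective_std / 255:.4f}")
```

With these settings the noise has a standard deviation of roughly 5 gray levels, or about 0.02 once the images are normalized to [0, 1], which is fairly mild.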

In [None]:
# Function to add Gaussian noise
def add_gaussian_noise(image, mean=0, stddev=1, alpha=1):
    noise = np.random.normal(mean, stddev, image.shape).astype(np.float32)
    noisy_image = image + noise*alpha
    noisy_image = np.clip(noisy_image, 0, 255)  # Clip values to valid range
    return noisy_image

In [None]:
# Trying out the noise function on the first image of the dataset
image = cv2.imread(r'/kaggle/input/medical-image-dataset/Dataset/1.jpg', cv2.IMREAD_GRAYSCALE)  # Replace with your image path
image = cv2.resize(image, (128, 128))  # Resize for model input compatibility

# Generate noisy images
gaussian_noisy_image = add_gaussian_noise(image,0,1,5)


# Display original and noisy images side by side
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.title("Original")
plt.imshow(image, cmap='gray')
plt.subplot(1, 2, 2)
plt.title("Gaussian Noise")
plt.imshow(gaussian_noisy_image, cmap='gray')
plt.show()


In [None]:
# Load dataset and add noise
def prepare_dataset(image_paths, noise_function, resize_dim=(128,128)):
    clean_images = []
    noisy_images = []
    for path in image_paths:
        image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        image = cv2.resize(image, resize_dim)
        clean_images.append(image)
        noisy_images.append(noise_function(image, 0,1,5))
    clean_images = np.array(clean_images).astype('float32') / 255.0  # Normalize to [0, 1]
    noisy_images = np.array(noisy_images).astype('float32') / 255.0  # Normalize to [0, 1]
    return clean_images[..., np.newaxis], noisy_images[..., np.newaxis]  # Add channel dimension

In [None]:
clean_images, noisy_images = prepare_dataset(image_paths, add_gaussian_noise)

In [None]:
# Split into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(noisy_images, clean_images, test_size=0.2, random_state=42)

In [None]:
print(f"Shape of X_train: {X_train.shape}")
print(f"Shape of X_val: {X_val.shape}")

In [None]:
model_list =[]
rows = []
Model_analysis = pd.DataFrame(columns=['Model Name', 'PSNR', 'SSIM'])

## Step 4: Creating a few helper functions


### PSNR and SSIM: Evaluation Metrics for Image Quality

When evaluating image quality, especially in tasks like image denoising or compression, **Peak Signal-to-Noise Ratio (PSNR)** and **Structural Similarity Index (SSIM)** are commonly used metrics.  


#### **1. Peak Signal-to-Noise Ratio (PSNR)**

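PSNR measures the ratio between the maximum possible signal power (the square of the maximum pixel value) and the power of the distortion, given by the mean squared error between the two images. It is expressed in decibels (dB) and quantifies image reconstruction quality, with higher values indicating better quality.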


#### **2. Structural Similarity Index (SSIM)**
SSIM measures the perceptual similarity between two images based on luminance, contrast, and structural information. It accounts for human visual perception, making it more aligned with perceived image quality.

#### **Key Differences**:
- **PSNR**: Focuses purely on numerical differences between pixel intensities.
- **SSIM**: Emphasizes structural and perceptual similarity, providing a better approximation of human visual judgment. 

Higher values of PSNR and SSIM indicate better image quality and closer similarity to the original image.
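
As a rough illustration of what PSNR computes, here is a minimal NumPy sketch of the standard formula, PSNR = 10 · log10(data_range² / MSE). The helper name `psnr_from_mse` and the toy arrays are hypothetical; the notebook itself relies on `skimage.metrics.peak_signal_noise_ratio`.

```python
import numpy as np

def psnr_from_mse(reference, estimate, data_range=1.0):
    """PSNR in dB: 10 * log10(data_range^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images have no distortion
    return float(10.0 * np.log10((data_range ** 2) / mse))

# Toy example: a constant error of 0.1 on a [0, 1] image gives MSE = 0.01
reference = np.zeros((4, 4))
estimate = reference + 0.1
example_psnr = psnr_from_mse(reference, estimate)
print(f"{example_psnr:.2f} dB")  # 20.00 dB
```

Because the scale is logarithmic, halving the root-mean-square error raises PSNR by about 6 dB.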

In [None]:
# Function for evaluating PSNR and SSIM
def evaluate_psnr_ssim(original_images, denoised_images):
    psnr_values = []
    ssim_values = []
    
    for original, denoised in zip(original_images, denoised_images):
        # Compute PSNR
        psnr_value = psnr(original, denoised, data_range=1.0)  # Normalized images, range [0, 1]
        psnr_values.append(psnr_value)
        
        # Compute SSIM
        ssim_value = ssim(original.squeeze(), denoised.squeeze(), data_range=1.0)  # Grayscale (squeeze for 2D)
        ssim_values.append(ssim_value)
    
    # Print average metrics
    print(f"Average PSNR: {np.mean(psnr_values):.2f} dB")
    print(f"Average SSIM: {np.mean(ssim_values):.4f}")
    return (np.mean(psnr_values), np.mean(ssim_values))

In [None]:
# Visualize predictions on the first 5 validation set images
def visualize_results(model, X_val, y_val, num_images=5):
    fig, axes = plt.subplots(num_images, 4, figsize=(20, 6 * num_images))
    fig.suptitle("Model Results: Original vs. Noisy vs. Denoised", fontsize=16)

    for i in range(num_images):
        # Original, Noisy, and Denoised images
        original = y_val[i]
        noisy = X_val[i]
        denoised = model.predict(noisy[np.newaxis, ...])[0]

        # Metrics
        psnr_score = psnr(original, denoised, data_range=1)
        ssim_score = ssim(original, denoised, data_range=1, win_size=5, channel_axis=-1)  # Adjust win_size

        # Plotting
        axes[i, 0].imshow(original, cmap='gray')
        axes[i, 0].set_title(f"Original")
        axes[i, 0].axis('off')

        axes[i, 1].imshow(noisy, cmap='gray')
        axes[i, 1].set_title(f"Noisy")
        axes[i, 1].axis('off')

        axes[i, 2].imshow(denoised, cmap='gray')
        axes[i, 2].set_title(f"Denoised\nPSNR: {psnr_score:.2f}, SSIM: {ssim_score:.3f}")
        axes[i, 2].axis('off')

        # Difference (optional for error visualization)
        diff = abs(original - denoised)
        axes[i, 3].imshow(diff, cmap='hot')
        axes[i, 3].set_title(f"Difference")
        axes[i, 3].axis('off')

    plt.tight_layout(rect=[0, 0, 1, 0.96])  # Adjust layout to fit the title
    plt.show()

In [None]:
# Function to evaluate and visualize results for PyTorch models
def visualize_test_results(model, image_path, transform, mean, stddev, alpha, device):
    # Load and preprocess the clean image
    clean_img = Image.open(image_path).convert("L")
    clean_img = np.array(clean_img, dtype=np.float32)
    noisy_img = add_gaussian_noise(clean_img, mean=mean, stddev=stddev, alpha=alpha)

    # Convert images to PyTorch tensors
    clean_img_tensor = transform(Image.fromarray(clean_img.astype(np.uint8))).unsqueeze(0).to(device)
    noisy_img_tensor = transform(Image.fromarray(noisy_img.astype(np.uint8))).unsqueeze(0).to(device)

    # Denoise with the model
    model.eval()
    with torch.no_grad():
        denoised_img_tensor = model(noisy_img_tensor)

    # Convert tensors to numpy arrays
    denoised_img = denoised_img_tensor.squeeze(0).squeeze(0).cpu().numpy()
    clean_img = clean_img_tensor.squeeze(0).squeeze(0).cpu().numpy()

    # Compute metrics
    psnr_score = psnr(clean_img, denoised_img, data_range=1.0)
    ssim_score = ssim(clean_img, denoised_img, data_range=1.0)

    # Plot results
    fig, axes = plt.subplots(1, 4, figsize=(20, 5))
    axes[0].imshow(clean_img, cmap="gray")
    axes[0].set_title("Original")
    axes[0].axis("off")

    axes[1].imshow(noisy_img, cmap="gray")
    axes[1].set_title("Noisy")
    axes[1].axis("off")

    axes[2].imshow(denoised_img, cmap="gray")
    axes[2].set_title(f"Denoised\nPSNR: {psnr_score:.2f} dB, SSIM: {ssim_score:.4f}")
    axes[2].axis("off")

    axes[3].imshow(np.abs(clean_img - denoised_img), cmap="hot")
    axes[3].set_title("Difference")
    axes[3].axis("off")

    plt.tight_layout()
    plt.show()


## Step 5: Let's dive into analyzing a few architectures



### 1. Deep Autoencoder Architecture for 128x128 Grayscale Images

This code defines a **convolutional autoencoder** designed for denoising grayscale images of size 128x128. The architecture consists of two main components:

1. **Encoder**:
   - Extracts features while progressively reducing the spatial dimensions using **Conv2D**, **BatchNormalization**, and **MaxPooling2D** layers.

2. **Decoder**:
   - Reconstructs the image from the encoded representation, upsampling the spatial dimensions using **Conv2D**, **BatchNormalization**, and **UpSampling2D** layers.

---

#### **Architecture Details**
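- **Total Layers**: 19 layers, excluding the input layer (7 Conv2D layers, 6 BatchNormalization layers, 3 MaxPooling2D layers, and 3 UpSampling2D layers).
- **Padding**: `same` padding is used across all convolutional and pooling layers.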

The model is compiled with the **Adam optimizer** and uses **Mean Squared Error (MSE)** as the loss function for training.
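
The way the spatial size shrinks in the encoder and grows back in the decoder can be traced with a small helper. This is a sketch only: `trace_spatial_size` is a hypothetical function, and it assumes 2x2 pooling with `same` padding (which rounds up) and 2x upsampling, as in the model defined below.

```python
def trace_spatial_size(size, ops):
    """Return the spatial size after each 'down' (2x2 pooling) or 'up' (2x upsampling) step."""
    sizes = [size]
    for op in ops:
        if op == "down":
            size = -(-size // 2)  # ceiling division, matching pooling with padding='same'
        elif op == "up":
            size = size * 2
        else:
            raise ValueError(f"unknown op: {op}")
        sizes.append(size)
    return sizes

ops = ["down"] * 3 + ["up"] * 3  # three pooling steps, then three upsampling steps
sizes = trace_spatial_size(128, ops)
print(sizes)  # [128, 64, 32, 16, 32, 64, 128]

odd_sizes = trace_spatial_size(100, ops)
print(odd_sizes)  # [100, 50, 25, 13, 26, 52, 104]
```

The second trace shows why the input side length should be divisible by 8 for this design: otherwise the output no longer matches the input size and the MSE loss cannot be computed against the clean image.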

In [None]:
model_name = 'Deep Autoencoder'

In [None]:
def autoencoder_deep_128():
    input_img = Input(shape=(128, 128, 1), name='image_input')

    # Encoder
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv1')(input_img)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), padding='same', name='pool1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='Conv2')(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), padding='same', name='pool2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='Conv3')(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), padding='same', name='pool3')(x)

    # Decoder
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='Conv4')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D((2, 2), name='upsample1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='Conv5')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D((2, 2), name='upsample2')(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv6')(x)
    x = BatchNormalization()(x)
    x = UpSampling2D((2, 2), name='upsample3')(x)
    output_img = Conv2D(1, (3, 3), activation='sigmoid', padding='same', name='Conv7')(x)

    # Model
    autoencoder = Model(inputs=input_img, outputs=output_img)
    autoencoder.compile(optimizer='adam', loss='mean_squared_error')
    
    return autoencoder

In [None]:
autoencoder_deep = autoencoder_deep_128()
autoencoder_deep.summary()

In [None]:
history = autoencoder_deep.fit(
    X_train, y_train, batch_size=1,
    epochs=100,
    validation_data=(X_val, y_val),
    verbose=1
)
model_list.append(autoencoder_deep)

In [None]:
# Generate denoised images
denoised_images = autoencoder_deep.predict(X_val)

# Evaluate PSNR and SSIM
s1, s2 = evaluate_psnr_ssim(y_val, denoised_images)
rows.append({'Model Name': model_name, 'PSNR': s1, 'SSIM': s2})

In [None]:
visualize_results(autoencoder_deep, X_val, y_val, num_images=5)

### 2. CNN-Based Denoising Model with PyTorch

This code implements a convolutional neural network (CNN) for denoising grayscale medical images. The pipeline includes the following components:

1. **Model Architecture**: 
   - The CNN is divided into:
     - **Encoder**: Two convolutional layers (64 filters, 3x3 kernel) with ReLU activation.
     - **Decoder**: Two convolutional layers (64 filters with ReLU activation for the intermediate layer, 1 filter with no activation for the output).

2. **Dataset Preparation**:
   - Images are loaded from a specified folder and split into training and test datasets.
   - Gaussian noise is added to create noisy images for training.

3. **Training**:
   - The model is trained for **20 epochs** using the **Adam optimizer** and **Mean Squared Error (MSE)** loss function.
   - Training is performed on batches of size 16.

4. **Visualization**:
   - The results of denoising are visualized on a sample of test images.

In [None]:
model_name = "CNN-Denoising"

In [None]:
def create_denoising_cnn():
    encoder = nn.Sequential(
        nn.Conv2d(1, 64, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(64, 64, kernel_size=3, padding=1),
        nn.ReLU(),
    )
    decoder = nn.Sequential(
        nn.Conv2d(64, 64, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv2d(64, 1, kernel_size=3, padding=1),
    )
    return nn.Sequential(encoder, decoder)

# Training function
def train_model(model, dataloader, optimizer, criterion, device, epochs=10):
    model.train()
    for epoch in range(epochs):
        epoch_loss = 0
        for noisy_imgs, clean_imgs in dataloader:
            noisy_imgs, clean_imgs = noisy_imgs.to(device), clean_imgs.to(device)

            optimizer.zero_grad()
            outputs = model(noisy_imgs)
            loss = criterion(outputs, clean_imgs)
            loss.backward()
            optimizer.step()

            epoch_loss += loss.item()

        print(f"Epoch {epoch + 1}/{epochs}, Loss: {epoch_loss / len(dataloader):.4f}")


# Paths and configuration
image_folder = "/kaggle/input/medical-image-dataset/Dataset"  # Replace with the folder path
all_image_paths = [os.path.join(image_folder, fname) for fname in os.listdir(image_folder) if fname.endswith(".jpg")]

# Split dataset
train_image_paths, test_image_paths = train_test_split(all_image_paths, test_size=0.2, random_state=42)

# Device and transformation setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
transform = transforms.Compose([transforms.ToTensor()])

# Dataset preparation
train_clean, train_noisy = prepare_dataset(train_image_paths, add_gaussian_noise)
train_dataset = torch.utils.data.TensorDataset(torch.tensor(train_noisy).permute(0, 3, 1, 2), torch.tensor(train_clean).permute(0, 3, 1, 2))
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)

# Initialize model, optimizer, and loss function
model = create_denoising_cnn().to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

# Train the model
train_model(model, train_loader, optimizer, criterion, device, epochs=20)

# Test and visualize results
for i, img_path in enumerate(test_image_paths[:5]):  # Visualize first 5 test results
    visualize_test_results(model, img_path, transform, mean=0, stddev=1, alpha=5, device=device)

In [None]:
model_list.append(model)
model.eval()

# Ensure input data is in the correct format
X_val_tensor = torch.tensor(X_val).permute(0, 3, 1, 2).to(device)

# Disable gradient computation for inference
with torch.no_grad():
    denoised_images = model(X_val_tensor)

# Move denoised images back to CPU and convert to NumPy for evaluation
denoised_images = denoised_images.permute(0, 2, 3, 1).cpu().numpy()
s1, s2 = evaluate_psnr_ssim(y_val, denoised_images)
rows.append({'Model Name': model_name, 'PSNR': s1, 'SSIM': s2})

### 3. Multiscale Autoencoder Architecture Overview

This code defines a **multiscale autoencoder** for image processing, designed to reconstruct input images with improved feature extraction through the use of skip connections. Here's a breakdown of the architecture:

- **Input Layer**: Takes grayscale images of size 128x128 pixels with a single channel.
- **Encoder**:
  - **First block**: Convolution with 64 filters and 3x3 kernel size, followed by a max-pooling layer, reducing the spatial dimensions to 64x64.
  - **Second block**: Convolution with 128 filters, followed by another max-pooling, further reducing dimensions to 32x32.
  - **Third block**: Convolution with 256 filters and max-pooling, reducing dimensions to 16x16.
- **Decoder**:
  - **First block**: Up-sampling followed by convolution with 128 filters, reconstructing the feature map to 32x32.
  - **Second block**: A skip connection concatenates the output with the corresponding encoder layer (32x32), followed by up-sampling and a convolution with 64 filters to reach 64x64.
  - **Third block**: Another skip connection connects the output with the first encoder block (64x64), followed by up-sampling and convolution to reconstruct the final output size of 128x128.
- **Output Layer**: A convolutional layer with 1 filter and a sigmoid activation function, producing the final denoised image with the shape (128x128, 1).

### Total Number of Layers:
- **Convolutional layers**: 7 layers (3 in the encoder, 3 in the decoder, and 1 output layer)
- **Pooling layers**: 3 max-pooling layers
- **Up-sampling layers**: 3 layers
- **Skip connections**: 2 concatenations

This multiscale architecture effectively captures both local and global features from the input image, allowing for enhanced reconstruction quality and detail retention.
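
A concatenating skip connection stacks feature maps along the channel axis, so the spatial sizes must match while the channel counts add up. The NumPy sketch below illustrates this with placeholder arrays; the shapes are chosen to mirror the first skip connection (32x32 maps with 128 channels each) and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical feature maps in (height, width, channels) layout
decoder_features = np.zeros((32, 32, 128), dtype=np.float32)
encoder_features = np.ones((32, 32, 128), dtype=np.float32)

# Concatenation keeps both sets of features side by side along the channel axis
merged = np.concatenate([decoder_features, encoder_features], axis=-1)
print(merged.shape)  # (32, 32, 256)
```

The convolution that follows the concatenation therefore sees 256 input channels and can mix coarse decoder features with finer encoder features.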

In [None]:
model_name = "Multiscale Autoencoder"

In [None]:
def autoencoder_multiscale_128():
    input_img = Input(shape=(128, 128, 1))
    
    # Encoder
    x1 = Conv2D(64, (3, 3), activation='relu', padding='same')(input_img)
    x1 = MaxPooling2D((2, 2), padding='same')(x1)  # (64x64)
    
    x2 = Conv2D(128, (3, 3), activation='relu', padding='same')(x1)
    x2 = MaxPooling2D((2, 2), padding='same')(x2)  # (32x32)
    
    x3 = Conv2D(256, (3, 3), activation='relu', padding='same')(x2)
    x3 = MaxPooling2D((2, 2), padding='same')(x3)  # (16x16)
    
    # Decoder
    y3 = UpSampling2D((2, 2))(x3)  # (32x32)
    y3 = Conv2D(128, (3, 3), activation='relu', padding='same')(y3)
    
    y2 = concatenate([y3, x2])  # Skip connection
    y2 = UpSampling2D((2, 2))(y2)  # (64x64)
    y2 = Conv2D(64, (3, 3), activation='relu', padding='same')(y2)
    
    y1 = concatenate([y2, x1])  # Skip connection
    y1 = UpSampling2D((2, 2))(y1)  # (128x128)
    y1 = Conv2D(64, (3, 3), activation='relu', padding='same')(y1)
    
    # Output Layer
    output_img = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(y1)  # Match (128x128, 1)
    
    # Model
    autoencoder = Model(inputs=input_img, outputs=output_img)
    autoencoder.compile(optimizer='adam', loss='mean_squared_error')
    return autoencoder

In [None]:
autoencoder_multiscale = autoencoder_multiscale_128()
autoencoder_multiscale.summary()

In [None]:
history = autoencoder_multiscale.fit(
    X_train, y_train, batch_size=1,
    epochs=50,
    validation_data=(X_val, y_val),
    verbose=1
)

In [None]:
model_list.append(autoencoder_multiscale)

In [None]:
# Generate denoised images
denoised_images = autoencoder_multiscale.predict(X_val)

# Evaluate PSNR and SSIM
s1, s2 = evaluate_psnr_ssim(y_val, denoised_images)
rows.append({'Model Name': model_name, 'PSNR': s1, 'SSIM': s2})

In [None]:
visualize_results(autoencoder_multiscale, X_val, y_val, num_images=5)

### 4. Residual Autoencoder Architecture Overview

This code defines a **residual autoencoder** for image processing, which incorporates skip (residual) connections within the decoder to enhance feature propagation and gradient flow. Here's a breakdown of the architecture:

- **Input Layer**: Accepts grayscale images of size 128x128 pixels with a single channel.
- **Encoder**:
  - **First block**: A convolutional layer with 64 filters, a 3x3 kernel, and ReLU activation, followed by a max-pooling layer that reduces the spatial dimensions to 64x64.
  - **Second block**: Another convolutional layer with 64 filters, a 3x3 kernel, and ReLU activation, followed by another max-pooling layer, further reducing the dimensions to 32x32.
- **Decoder with Residual Connections**:
  - **First block**: A convolutional layer with 64 filters, a 3x3 kernel, and ReLU activation, followed by up-sampling to 64x64.
  - **Skip connection**: Adds the up-sampled feature map to the output of the second encoder convolution, taken before pooling (64x64).
  - **Second block**: A convolutional layer with 64 filters and a 3x3 kernel, followed by up-sampling to 128x128.
  - **Skip connection**: Adds the up-sampled feature map to the output of the first encoder convolution, taken before pooling (128x128).
- **Output Layer**: A final convolutional layer with 1 filter and a sigmoid activation function, generating the reconstructed image (128x128, 1).

### Residual Connections:
- **Purpose**: These connections allow the model to learn the residual mapping, improving the model's ability to reconstruct complex features and alleviating the vanishing gradient problem.
- **Structure**: Two main skip connections are added in the decoder section, linking the up-sampled outputs with their corresponding encoder outputs.

### Total Number of Layers:
- **Convolutional layers**: 5 layers (including both encoder and decoder)
- **Pooling layers**: 2 max-pooling layers
- **Up-sampling layers**: 2 layers
- **Skip connections**: 2 residual connections

This **residual autoencoder** architecture helps achieve better reconstruction performance by leveraging deep learning principles that improve information flow through the network, making it more effective for tasks like image denoising and enhancement.
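
Unlike concatenation, an additive skip connection requires both feature maps to have identical shapes, channels included. The sketch below mimics one such step in NumPy, with shapes chosen to match the first `Add` in the model code that follows. The helper `upsample_nearest_2x` is a hypothetical stand-in for nearest-neighbour 2x upsampling, and the random arrays are placeholders for real activations.

```python
import numpy as np

def upsample_nearest_2x(feature_map):
    """Nearest-neighbour 2x upsampling of a (height, width, channels) array."""
    return feature_map.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
bottleneck = rng.random((32, 32, 64), dtype=np.float32)    # decoder features at 32x32
encoder_skip = rng.random((64, 64, 64), dtype=np.float32)  # encoder features at 64x64

upsampled = upsample_nearest_2x(bottleneck)  # now 64x64x64
fused = upsampled + encoder_skip             # element-wise addition, shape unchanged
print(upsampled.shape, fused.shape)
```

Because addition keeps the channel count at 64, every convolution in this model can use the same number of filters.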

In [None]:
model_name = "Residual Autoencoder"

In [None]:
def autoencoder_residual():
    input_img = Input(shape=(128, 128, 1), name='image_input')

    # Encoder
    x1 = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv1')(input_img)
    x1_pool = MaxPooling2D((2, 2), padding='same', name='pool1')(x1)
    x2 = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv2')(x1_pool)
    x2_pool = MaxPooling2D((2, 2), padding='same', name='pool2')(x2)

    # Decoder with Residual Connections
    x3 = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv3')(x2_pool)
    x3_upsample = UpSampling2D((2, 2), name='upsample1')(x3)
    x3_add = Add()([x3_upsample, x2])  # Skip connection
    x4 = Conv2D(64, (3, 3), activation='relu', padding='same', name='Conv4')(x3_add)
    x4_upsample = UpSampling2D((2, 2), name='upsample2')(x4)
    x4_add = Add()([x4_upsample, x1])  # Skip connection
    output_img = Conv2D(1, (3, 3), activation='sigmoid', padding='same', name='Conv5')(x4_add)

    # Model
    autoencoder = Model(inputs=input_img, outputs=output_img)
    autoencoder.compile(optimizer='adam', loss='mean_squared_error')
    
    return autoencoder

In [None]:
# Build the model
resdae_model = autoencoder_residual()

# Train the model
history = resdae_model.fit(
    X_train, y_train, batch_size=1,
    epochs=50,
    validation_data=(X_val, y_val),
    verbose=1
)

In [None]:
# Generate denoised images

model_list.append(resdae_model)
denoised_images = resdae_model.predict(X_val)

# Evaluate PSNR and SSIM
s1, s2 = evaluate_psnr_ssim(y_val, denoised_images)
rows.append({'Model Name': model_name, 'PSNR': s1, 'SSIM': s2})

In [None]:
# Call the visualization function
visualize_results(resdae_model, X_val, y_val, num_images=5)

In [None]:
Model_analysis = pd.concat([Model_analysis, pd.DataFrame(rows)], ignore_index=True)

In [None]:
Model_analysis.head()

## Step 6: Model Selection for Denoising

After evaluating multiple autoencoder architectures, we will use the model that achieved the best performance, as measured by PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index). Higher scores indicate superior image quality and better reconstruction capability compared to the other models tested.

In [None]:
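# Select the model with the highest PSNR and use it to denoise the entire dataset.
# In the authors' runs the best results came from the residual autoencoder.
# Note: the denoising loop below calls .predict, so it assumes a Keras model is selected.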

# Define output directories
output_dir = "/kaggle/working/denoised_dataset"

# Create directories if they do not exist
os.makedirs(output_dir, exist_ok=True)

max_psnr_index = Model_analysis['PSNR'].idxmax()

# Retrieve the corresponding model
best_model = model_list[max_psnr_index]

# Output
print("The model with the highest PSNR is:", Model_analysis.loc[max_psnr_index, 'Model Name'])

# Function to denoise and save images one by one
def denoise_and_save_one_by_one(model, dataset, output_dir, prefix="img"):
    total_samples = dataset.shape[0]
    for idx in range(total_samples):
        # Get the current image and expand dimensions for prediction
        image = np.expand_dims(dataset[idx], axis=0)
        denoised_image = model.predict(image)  # Denoise the image
        
        # Normalize the denoised image to [0, 255]
        denoised_image = (denoised_image.squeeze() * 255).astype(np.uint8)
        
        # Save the image
        img_path = os.path.join(output_dir, f"{prefix}_{idx + 1}.png")
        cv2.imwrite(img_path, denoised_image)

# Denoise and save training images (no trailing space in the prefix, so filenames stay clean)
denoise_and_save_one_by_one(best_model, X_train, output_dir, prefix="set1")

# Denoise and save validation images
denoise_and_save_one_by_one(best_model, X_val, output_dir, prefix="set2")

print(f"Denoised dataset saved in {output_dir}")


## Freeing up space (Optional)

In [None]:
torch.cuda.empty_cache()
del image_paths, add_gaussian_noise, image, gaussian_noisy_image
del prepare_dataset, clean_images, noisy_images, X_train, X_val, y_train, y_val
del evaluate_psnr_ssim, visualize_results, visualize_test_results
del autoencoder_deep_128, autoencoder_deep, history, denoised_images
del create_denoising_cnn, train_model, image_folder, all_image_paths
del train_image_paths, test_image_paths, device, transform
del train_clean, train_noisy, train_dataset, train_loader
del model, optimizer, criterion, X_val_tensor
del autoencoder_multiscale_128, autoencoder_multiscale, autoencoder_residual, best_model, max_psnr_index
del resdae_model, denoise_and_save_one_by_one, model_list, Model_analysis, s1, s2, model_name, rows
import gc
gc.collect()


# Phase 2: Generating Synthetic Dental X-Ray Images with DCGANs to Augment the Dataset

## Step 1: Initializing necessary constants

In [None]:
NOISE_DIM = 100  
BATCH_SIZE = 4 
STEPS_PER_EPOCH = 3750
EPOCHS = 10
SEED = 40
WIDTH, HEIGHT, CHANNELS = 128, 128, 1

OPTIMIZER = Adam(0.0002, 0.5)

In [None]:
MAIN_DIR = "/kaggle/working/denoised_dataset"

## Step 2: Loading the denoised Dataset

In [None]:
def load_images(folder):
    
    imgs = []
    target = 1
    labels = []
    for i in os.listdir(folder):
        img_dir = os.path.join(folder,i)
        try:
            img = cv2.imread(img_dir)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            img = cv2.resize(img, (128,128))
            imgs.append(img)
            labels.append(target)
        except Exception:  # skip files that cannot be read as images
            continue
        
    imgs = np.array(imgs)
    labels = np.array(labels)
    
    return imgs, labels

In [None]:
data, labels = load_images(MAIN_DIR)
data.shape, labels.shape

In [None]:
np.random.seed(SEED)
idxs = np.random.randint(0, 120, 20)

In [None]:
X_train = data[idxs]
X_train.shape

## Step 3: Preparing the dataset for further use in GANs

In [None]:
# Normalize the Images
X_train = (X_train.astype(np.float32) - 127.5) / 127.5

# Reshape images 
X_train = X_train.reshape(-1, WIDTH,HEIGHT,CHANNELS)

# Check shape
X_train.shape

In [None]:
plt.figure(figsize=(20,8))
for i in range(10):
    axs = plt.subplot(2,5,i+1)
    plt.imshow(X_train[i], cmap="gray")
    plt.axis('off')
    axs.set_xticklabels([])
    axs.set_yticklabels([])
    plt.subplots_adjust(wspace=None, hspace=None)
plt.tight_layout()

## Step 4: Building the DCGANs Architecture

### DCGAN Architecture Overview

The **DCGAN (Deep Convolutional Generative Adversarial Network)** consists of two main components: the **Generator** and the **Discriminator**, each playing a critical role in the training process to generate realistic images from random noise.

#### Generator
The **Generator** takes a random latent vector (sampled from a normal distribution) as input and transforms it into an image through a series of transposed convolutional layers. This architecture learns to create increasingly detailed images as the model trains.

- **Input**: Random noise (latent vector)
- **Output**: Generated image
- **Architecture**:
  - Dense layer that projects the noise vector to 32 × 32 × 256 units, followed by a `Reshape` into a (32, 32, 256) tensor.
  - Two `Conv2DTranspose` layers with stride 2, each doubling the spatial resolution (32 → 64 → 128), with LeakyReLU activations for non-linearity.
  - Final `Conv2D` layer with `tanh` activation to produce the output image, with pixel values in [-1, 1].
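
The spatial sizes in the generator can be traced with a few lines of arithmetic. This sketch assumes the Keras behaviour that a `Conv2DTranspose` layer with `padding='same'` multiplies the spatial size by its stride. The helper `transposed_conv_same` is hypothetical and only illustrates the bookkeeping.

```python
def transposed_conv_same(size, stride):
    """Spatial size after a transposed convolution with 'same' padding."""
    return size * stride

# Number of units the Dense layer must emit to fill a (32, 32, 256) tensor
dense_units = 32 * 32 * 256

# Follow the feature-map side length through the two stride-2 upsampling layers
size = 32
trace = [size]
for _ in range(2):
    size = transposed_conv_same(size, stride=2)
    trace.append(size)

print(dense_units, trace)
```

The final side length equals the 128-pixel image size the discriminator expects.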

#### Discriminator
The **Discriminator** is a binary classifier that distinguishes between real images and those generated by the Generator. Its aim is to learn how to identify authentic images effectively.

- **Input**: Real or generated image
- **Output**: Probability indicating whether the image is real or fake
- **Architecture**:
  - Four `Conv2D` layers (the last three with stride 2 for downsampling) with LeakyReLU activations for feature extraction.
  - Dropout layer for regularization.
  - Final `Dense` layer with a sigmoid activation function for binary classification.
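
The downsampling path of the discriminator can be checked the same way. This sketch assumes the strides used in the `build_discriminator` cell (1, 2, 2, 2) and the rule that a `Conv2D` layer with `padding='same'` yields `ceil(size / stride)`. The helper `conv_same` is hypothetical.

```python
import math

def conv_same(size, stride):
    """Spatial size after a convolution with 'same' padding."""
    return math.ceil(size / stride)

strides = [1, 2, 2, 2]  # one entry per Conv2D layer
size = 128
trace = [size]
for stride in strides:
    size = conv_same(size, stride)
    trace.append(size)

# The last Conv2D layer has 256 filters, so Flatten yields size * size * 256 features
flattened_features = size * size * 256
print(trace, flattened_features)
```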

#### Training Process
The Generator and Discriminator are trained together in a competitive process: the Generator tries to produce realistic images to fool the Discriminator, while the Discriminator aims to become better at telling real images from fakes. This adversarial process improves both models, ideally until the Generator creates high-quality images that are indistinguishable from real ones.
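
To make the two objectives concrete, the sketch below evaluates binary cross-entropy by hand for one tiny batch. The discriminator scores are made-up illustrative numbers, and `binary_cross_entropy` is a hypothetical helper, not the Keras loss implementation. The discriminator is scored against the true labels (1 for real, 0 for fake), while the generator is scored against all-ones labels on the same fake images.

```python
import math

def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy over a batch of probabilities."""
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)

# Hypothetical discriminator outputs (probability that an image is real)
d_on_real = [0.9, 0.8]
d_on_fake = [0.2, 0.1]

# Discriminator step: real images labelled 1, generated images labelled 0
d_loss = binary_cross_entropy([1, 1, 0, 0], d_on_real + d_on_fake)

# Generator step: the same fake images, but labelled 1 to reward fooling the discriminator
g_loss = binary_cross_entropy([1, 1], d_on_fake)
```

With these numbers the discriminator is winning, so its loss is low and the generator's loss is high.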

In [None]:
def build_generator():

    """
        The Generator "generates" images from random noise. The noise, also called the latent
        vector, is sampled from a Normal distribution and given as input to the Generator. Using
        transposed convolution, the latent vector is transformed to produce an image.
        We use 2 Conv2DTranspose layers (which upsample feature maps into an image; the opposite
        of ordinary strided convolution), followed by a Conv2D output layer.

        Input: Random Noise / Latent Vector
        Output: Image
    """

    model = Sequential([

        Dense(32*32*256, input_dim=NOISE_DIM),
        LeakyReLU(alpha=0.2),
        Reshape((32,32,256)),
        
        Conv2DTranspose(128, (4, 4), strides=2, padding='same'),
        LeakyReLU(alpha=0.2),

        Conv2DTranspose(128, (4, 4), strides=2, padding='same'),
        LeakyReLU(alpha=0.2),

        Conv2D(CHANNELS, (4, 4), padding='same', activation='tanh')
    ], 
    name="generator")
    model.summary()
    model.compile(loss="binary_crossentropy", optimizer=OPTIMIZER)

    return model

In [None]:
def build_discriminator():
    
    """
        The Discriminator is the model responsible for classifying images as fake or real.
        Our end goal is a Generator so powerful that the Discriminator is unable to tell
        real and fake images apart.
        It is a simple Convolutional Neural Network with 4 Conv2D layers connected to a Dense output layer.
        The output layer activation is Sigmoid since this is a binary classifier.

        Input: Generated / Real Image
        Output: Validity of Image (Fake or Real)

    """

    model = Sequential([

        Conv2D(64, (3, 3), padding='same', input_shape=(WIDTH, HEIGHT, CHANNELS)),
        LeakyReLU(alpha=0.2),

        Conv2D(128, (3, 3), strides=2, padding='same'),
        LeakyReLU(alpha=0.2),

        Conv2D(128, (3, 3), strides=2, padding='same'),
        LeakyReLU(alpha=0.2),
        
        Conv2D(256, (3, 3), strides=2, padding='same'),
        LeakyReLU(alpha=0.2),
        
        Flatten(),
        Dropout(0.4),
        Dense(1, activation="sigmoid")
    ], name="discriminator")
    model.summary()
    model.compile(loss="binary_crossentropy",
                        optimizer=OPTIMIZER)

    return model

In [None]:
print('\n')
discriminator = build_discriminator()
print('\n')
generator = build_generator()

discriminator.trainable = False 

gan_input = Input(shape=(NOISE_DIM,))
fake_image = generator(gan_input)

gan_output = discriminator(fake_image)

gan = Model(gan_input, gan_output, name="gan_model")
gan.compile(loss="binary_crossentropy", optimizer=OPTIMIZER)

print("The Combined Network:\n")
gan.summary()

In [None]:
def sample_images(noise, subplots, figsize=(22, 8), save=False):
    # Create the output directory if it doesn't exist
    output_dir = "augmented-dataset"
    os.makedirs(output_dir, exist_ok=True)
    
    generated_images = generator.predict(noise)
    plt.figure(figsize=figsize)
    
    for i, image in enumerate(generated_images):
        if save:
            img_name = f"{output_dir}/gen_{i}.png"
            # Map the tanh output from [-1, 1] back to [0, 255] before saving
            pixels = np.clip(image * 127.5 + 127.5, 0, 255).astype(np.uint8)
            if CHANNELS == 1:
                cv2.imwrite(img_name, pixels.reshape((WIDTH, HEIGHT)))
            else:
                cv2.imwrite(img_name, pixels.reshape((WIDTH, HEIGHT, CHANNELS)))
        
        # For plotting
        plt.subplot(subplots[0], subplots[1], i + 1)
        if CHANNELS == 1:
            plt.imshow(image.reshape((WIDTH, HEIGHT)), cmap='gray')    
        else:
            plt.imshow(image.reshape((WIDTH, HEIGHT, CHANNELS)))
        plt.subplots_adjust(wspace=None, hspace=None)
        plt.axis('off')
    
    plt.tight_layout()
    plt.show()


## Step 5: Training the DCGAN

In [None]:
np.random.seed(SEED)
for epoch in range(EPOCHS):
    for batch in tqdm(range(STEPS_PER_EPOCH)):

        noise = np.random.normal(0,1, size=(BATCH_SIZE, NOISE_DIM))
        fake_X = generator.predict(noise, verbose=0)  # silence per-call progress output inside the tqdm loop
        
        idx = np.random.randint(0, X_train.shape[0], size=BATCH_SIZE)
        real_X = X_train[idx]

        X = np.concatenate((real_X, fake_X))

        disc_y = np.zeros(2*BATCH_SIZE)
        disc_y[:BATCH_SIZE] = 1

        d_loss = discriminator.train_on_batch(X, disc_y)
        
        y_gen = np.ones(BATCH_SIZE)
        g_loss = gan.train_on_batch(noise, y_gen)

    print(f"EPOCH: {epoch + 1} Generator Loss: {g_loss:.4f} Discriminator Loss: {d_loss:.4f}")
    noise = np.random.normal(0, 1, size=(10,NOISE_DIM))
    sample_images(noise, (2,5))

## Step 6: Generating Images

In [None]:
noise = np.random.normal(0, 1, size=(120, NOISE_DIM))
sample_images(noise, (10, 12), (24, 20), save=True)

## Step 7: Testing the Generated Samples: Plotting the Distributions

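In this test, we compare the generated images with the real samples by plotting the distributions of their pixel intensities. If the two distributions overlap closely, the generated images have pixel statistics similar to the real ones. This is a coarse check: it does not by itself show that individual images look realistic.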

In [None]:
generated_images = generator.predict(noise)
generated_images.shape

In [None]:
fig, axs = plt.subplots(ncols=1, nrows=1, figsize=(18,10))

sns.distplot(X_train, label='Real Images', hist=True, color='#fc0328', ax=axs)
sns.distplot(generated_images, label='Generated Images', hist=True, color='#0c06c7', ax=axs)

axs.legend(loc='upper right', prop={'size': 12})

plt.show()
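
The visual comparison can be complemented with a single number. The sketch below computes a histogram-intersection score between two sets of pixel values, assuming both are scaled to [-1, 1] as in this notebook. The function `histogram_overlap` is a hypothetical helper, and the arrays are random stand-ins for `X_train` and `generated_images`.

```python
import numpy as np

def histogram_overlap(a, b, bins=50, value_range=(-1.0, 1.0)):
    """Overlap of two normalized histograms: 0 (disjoint) to 1 (identical)."""
    hist_a, _ = np.histogram(np.ravel(a), bins=bins, range=value_range)
    hist_b, _ = np.histogram(np.ravel(b), bins=bins, range=value_range)
    p = hist_a / hist_a.sum()  # assumes at least one value falls inside value_range
    q = hist_b / hist_b.sum()
    return float(np.minimum(p, q).sum())

# Stand-in data: random values shaped like a small batch of images
rng = np.random.default_rng(0)
real_like = rng.uniform(-1, 1, size=(4, 8, 8, 1))

same_score = histogram_overlap(real_like, real_like)
shifted_score = histogram_overlap(real_like, np.full_like(real_like, 0.99))
```

In the notebook, the call would presumably be `histogram_overlap(X_train, generated_images)`. Like the plot, this score only compares pixel statistics.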