# Image Quality Quantitative Metrics - Structural Similarity Index Measure (SSIM)

## Implementation & Results

The purpose of this notebook is to implement and collect the results of the SSIM metric for quantitative evaluation of the synthetic image quality as outlined in sections 3.5.1 and 4.1.1 of the bachelor thesis.

The code provided in this notebook was developed using the Kaggle platform.

Since SSIM is not directly related to synthetic data evaluation, this notebook focuses on implementing multiple runs in order to reduce the bias introduced by randomly sampling the datasets.
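The idea of repeating a sampled estimate and reporting its spread can be sketched with the standard library alone. The snippet below is a minimal illustration, not the notebook's actual pipeline: `sample_mean`, `repeated_runs`, the fixed `seed`, and the toy score values are all hypothetical names and numbers introduced here for the example.

```python
import random
import statistics

def sample_mean(scores, steps, rng):
    """Average of `steps` values drawn with replacement from `scores`."""
    return statistics.fmean(rng.choice(scores) for _ in range(steps))

def repeated_runs(scores, steps, runs, seed=0):
    """Repeat the sampling `runs` times and summarise the run-level averages."""
    # Assumption: a fixed seed is used here only to make the sketch reproducible.
    rng = random.Random(seed)
    run_means = [sample_mean(scores, steps, rng) for _ in range(runs)]
    # pstdev is the population standard deviation, the same default as np.std.
    return statistics.fmean(run_means), statistics.pstdev(run_means)

# Toy stand-in for per-pair SSIM values; these numbers are made up.
toy_scores = [0.10, 0.15, 0.20, 0.25, 0.30]
mean, std = repeated_runs(toy_scores, steps=350, runs=5)
```

A small standard deviation across runs suggests that the reported mean does not depend strongly on which pairs happened to be drawn.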

## Step 1 - Importing Dependencies

- Importing the necessary libraries to execute the code.

In [None]:
from torchvision.datasets import ImageFolder
import random
from skimage.metrics import structural_similarity as ssim
import numpy as np
import pandas as pd

## Step 2 - Dataset Loading

- Loading the subsets as PyTorch datasets using the `ImageFolder` class.

In [None]:
real_dataset = ImageFolder(root='path/to/subset')
dcgan_dataset = ImageFolder(root='path/to/subset')
vae_dataset = ImageFolder(root='path/to/subset')
ddpm_uncon_dataset = ImageFolder(root='path/to/subset')
ddpm_cond_dataset = ImageFolder(root='path/to/subset')

## Step 3 - Computing the SSIM Metric

- ``calculate_ssim``: Computes the SSIM between two images.
- ``get_ssim_class``: Returns a list with the average SSIM of each of the 10 classes, computed over ``steps`` randomly sampled pairs of real and synthetic images per class.
- ``get_mean_std``: Computes the per-class mean and standard deviation over 5 runs of the function above.
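For reference, the SSIM between two image windows $x$ and $y$ is commonly defined as shown below, where $\mu$ denotes a window mean, $\sigma^2$ a variance, $\sigma_{xy}$ the covariance, and $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$ are stabilising constants for a dynamic range $L$ (255 for 8-bit images, typically with $K_1 = 0.01$ and $K_2 = 0.03$). This is the standard formulation rather than anything specific to this notebook; `skimage.metrics.structural_similarity` evaluates it over local windows and returns the mean over the image.

$$
\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
$$

The index equals 1 only for identical windows and decreases as luminance, contrast, or structure diverge.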

In [None]:
def calculate_ssim(image1, image2):
    array1 = np.asarray(image1)
    array2 = np.asarray(image2)
    ssim_score = ssim(array1, array2, channel_axis=2)
    return ssim_score

def get_ssim_class(real_dataset, synthetic_dataset, steps):
    avg_ssim = []
    for class_idx in range(10):

        # Get indices of real and synthetic images for this class.
        # `targets` lists the label of every sample, so no image is loaded here.
        real_indices = [idx for idx, label in enumerate(real_dataset.targets) if label == class_idx]
        synthetic_indices = [idx for idx, label in enumerate(synthetic_dataset.targets) if label == class_idx]

        # Randomly select pairs of images (with replacement) and calculate SSIM
        ssim_scores = []
        for _ in range(steps):
            real_idx = random.choice(real_indices)
            synthetic_idx = random.choice(synthetic_indices)
            real_image, _ = real_dataset[real_idx]
            synthetic_image, _ = synthetic_dataset[synthetic_idx]
            ssim_score = calculate_ssim(real_image, synthetic_image)
            ssim_scores.append(ssim_score)

        # Calculate average SSIM for this class
        avg_ssim_class = sum(ssim_scores) / len(ssim_scores)
        avg_ssim.append(avg_ssim_class)

    return avg_ssim

def get_mean_std(real_dataset, synthetic_dataset, steps):
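    # Repeat the per-class evaluation 5 times; each run draws new random pairs
    subset_ssim = []
    for _ in range(5):
        subset_ssim.append(get_ssim_class(real_dataset, synthetic_dataset, steps))

    # Rows are runs and columns are classes, so axis=0 aggregates across runs
    np_array = np.array(subset_ssim)
    means = np.mean(np_array, axis=0)
    stds = np.std(np_array, axis=0)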

    data = {
        "Class": range(len(means)),
        "Mean": means,
        "Std": stds
    }
    df = pd.DataFrame(data)
    
    return df
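As a sanity check on what the score means, the SSIM formula can be evaluated once over two whole pixel lists. This is a deliberately simplified sketch using a single global window and population statistics, not what scikit-image computes (it averages the index over many local windows); `global_ssim` and the pixel values are hypothetical and exist only for this illustration.

```python
from statistics import fmean

def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM formula evaluated once over two equal-length pixel lists.

    Simplification: one global window and population statistics, whereas
    scikit-image averages the index over many local windows.
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same number of pixels")
    # Stabilising constants derived from the dynamic range
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean((a - mu_x) ** 2 for a in x)
    var_y = fmean((b - mu_y) ** 2 for b in y)
    cov_xy = fmean((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

# Made-up 8-bit pixel values, for illustration only.
patch = [12, 40, 200, 180, 90, 33, 250, 7]
inverted = [255 - p for p in patch]

same = global_ssim(patch, patch)          # identical inputs
different = global_ssim(patch, inverted)  # anti-correlated inputs
```

Identical inputs score 1 up to floating-point rounding, while the inverted patch scores far lower. This is why randomly paired real and synthetic images, which are not pixel-aligned, tend to give values well below 1.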

## Step 4 - Evaluation of SSIM Metric on the Subsets 

- Evaluation of the SSIM metric per subset; there is one subset for each model type.
- The result is a pandas DataFrame and a list with the per-class SSIM means for this subset.

In [None]:
# DCGAN
dcgan_df = get_mean_std(real_dataset, dcgan_dataset, steps=350)
dcgan_df['model type'] = 'DCGAN'
dcgan_list = list(dcgan_df["Mean"])

In [None]:
# VAE
vae_df = get_mean_std(real_dataset, vae_dataset, steps=350)
vae_df['model type'] = 'VAE'
vae_list = list(vae_df["Mean"])

In [None]:
# DDPM uncon
ddpm_uncon = get_mean_std(real_dataset, ddpm_uncon_dataset, steps=350)
ddpm_uncon['model type'] = 'DDPM Uncon'
ddpm_uncon_list = list(ddpm_uncon["Mean"])

In [None]:
# DDPM cond
ddpm_cond = get_mean_std(real_dataset, ddpm_cond_dataset, steps=350)
ddpm_cond["model type"] = "DDPM Cond"
ddpm_cond_list = list(ddpm_cond["Mean"])

## Step 5 - Download the Results

- Concatenate the previously obtained pandas DataFrames with the SSIM results for each deep generative model type.
- Download the SSIM results for future analysis.

In [None]:
SSIM_results = pd.concat([dcgan_df, vae_df, ddpm_uncon, ddpm_cond], ignore_index=True)
SSIM_results.to_excel("save_path.xlsx", index=False)

## Step 6 - Quick Python Analysis

- Provide a quick insight into the SSIM metric results, which can be further investigated using the .xlsx file downloaded above.

In [None]:
final_SSIM = pd.DataFrame({'DCGAN': dcgan_list, 'VAE': vae_list, 'DDPM Uncon': ddpm_uncon_list, 'DDPM Cond': ddpm_cond_list})
final_SSIM.describe()