
### Challenges With Evaluating GANs



*   **Loss is Uninformative of Performance**: Unlike with classifiers, where a low loss on a test set indicates superior performance, the loss of a GAN tells us little about the quality of its samples. A low loss for the generator or the discriminator may only suggest that learning has stopped.
*   **No Clear Non-human Metric**: There is no "perfect" discriminator that can differentiate reals from fakes.


Fréchet Inception Distance (FID) is one method that aims to address these issues.

### Imports

In [5]:
import torch
import numpy as np
from torch import nn
from tqdm.auto import tqdm
from torchvision import transforms
from torchvision.datasets import CelebA
from torchvision.utils import make_grid
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

### Generator

In [6]:
class Generator(nn.Module):
  '''
  Generator Class
  Values:
      z_dim: the dimension of the noise vector, a scalar
      im_chan: the number of channels in the images, fitted for the dataset used, a scalar
            (CelebA is rgb, so 3 is the default)
      hidden_dim: the inner dimension, a scalar 
  '''
  def __init__(self, z_dim=10, im_chan=3, hidden_dim=64):
    super(Generator, self).__init__()
    self.z_dim = z_dim
    # Build the neural network
    self.gen = nn.Sequential(
        self.make_gen_block(z_dim, hidden_dim * 8),
        self.make_gen_block(hidden_dim * 8, hidden_dim * 4),
        self.make_gen_block(hidden_dim * 4, hidden_dim * 2),
        self.make_gen_block(hidden_dim * 2, hidden_dim),
        self.make_gen_block(hidden_dim, im_chan, kernel_size=4, final_layer=True),
    )
  
  def make_gen_block(self, input_channels, output_channels, kernel_size=3, stride=2, final_layer=False):
    '''
    Function to return a sequence of operations corresponding to a generator block of DCGAN;
    a transposed convolution, a batchnorm (except in the final layer), and an activation.
    Parameters:
        input_channels: how many channels the input feature representation has
        output_channels: how many channels the output feature representation should have
        kernel_size: the size of each convolutional filter, equivalent to (kernel_size, kernel_size)
        stride: the stride of the convolution
        final_layer: a boolean, true if it is the final layer and false otherwise (affects activation and batchnorm)
    '''
    if not final_layer:
      return nn.Sequential(
          nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
          nn.BatchNorm2d(output_channels),
          nn.ReLU(inplace=True)
      )
    else:
      return nn.Sequential(
          nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
          nn.Tanh()
      )
  def forward(self, noise):
    '''
    Function for completing a forward pass of the generator: Given a noise tensor, returns generated images.
    Parameters:
        noise: a noise tensor with dimensions (n_samples, z_dim)
    '''
    x = noise.view(len(noise), self.z_dim, 1, 1)
    return self.gen(x)
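
To see what image size this generator produces, the following sketch traces the spatial size through the five blocks. It assumes the `ConvTranspose2d` defaults used above (no padding, no output padding, dilation 1); the helper name `conv_transpose_out` is made up for this illustration.

```python
def conv_transpose_out(size, kernel_size, stride, padding=0, output_padding=0, dilation=1):
    # Output size of a transposed convolution along one spatial dimension
    return (size - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

# (kernel_size, stride) of each generator block: four regular blocks, then the final one
blocks = [(3, 2)] * 4 + [(4, 2)]

size = 1  # The noise vector is reshaped to (z_dim, 1, 1)
sizes = [size]
for kernel_size, stride in blocks:
    size = conv_transpose_out(size, kernel_size, stride)
    sizes.append(size)

print(sizes)  # [1, 3, 7, 15, 31, 64]
```

Under these assumptions the generator outputs 64 x 64 images, which is why the generated samples have to be resized before they can be fed to a network that expects 299 x 299 inputs.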

### Noise

In [7]:
def get_noise(n_samples, z_dim, device='cpu'):
  '''
  Function for creating noise vectors: Given the dimensions (n_samples, z_dim)
  creates a tensor of that shape filled with random numbers from the normal distribution.
  Parameters:
      n_samples: the number of samples to generate, a scalar
      z_dim: the dimension of the noise vector, a scalar
      device: the device type
  '''
  return torch.randn(n_samples, z_dim, device=device)

### Loading the Pre-trained Model

In [8]:
z_dim = 64
image_size = 299
device = 'cuda'

transform = transforms.Compose([
    transforms.Resize(image_size),
    transforms.CenterCrop(image_size),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

dataset = CelebA(".", download=True, transform=transform)
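
The `Normalize` step above, with mean 0.5 and standard deviation 0.5 per channel, maps pixel values from [0, 1] to [-1, 1], the same range as the generator's `Tanh` output. A minimal sketch of that arithmetic on single values (the helper names are made up for this illustration):

```python
def normalize(x, mean=0.5, std=0.5):
    # Same per-value arithmetic as a mean/std normalization: (x - mean) / std
    return (x - mean) / std

def denormalize(y, mean=0.5, std=0.5):
    # Inverse mapping, useful for displaying images again
    return y * std + mean

print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
```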


In [None]:
gen = Generator(z_dim).to(device)
gen.load_state_dict(torch.load("pretrained_celeba.pth", map_location=torch.device(device))["gen"])
gen = gen.eval()

### Inception-v3 Network

Inception-v3 does a good job of detecting features and classifying images, so it is used here as the feature extractor.

In [None]:
from torchvision.models import inception_v3
inception_model = inception_v3(pretrained=False)
inception_model.load_state_dict(torch.load("inception_v3_google-1a9a5a14.pth"))
inception_model.to(device)
inception_model = inception_model.eval() # Evaluation mode

### Fréchet Inception Distance

FID was proposed as an improvement over Inception Score and still uses the Inception-v3 network as part of its calculation. However, instead of using the classification labels of the Inception-v3 network, it uses the output of the layer right before the classification layer, that is, the feature layer.

In [None]:
# Replace the final fully-connected (fc) layer with an identity function layer to cut off the classification layer
inception_model.fc = nn.Identity()

#### Fréchet Distance
**Univariate Fréchet Distance**

The Fréchet distance between two univariate normal distributions $X$ and $Y$ with means $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$ is:

$$d(X,Y) = (\mu_X-\mu_Y)^2 + (\sigma_X-\sigma_Y)^2 $$
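
As a small illustration, the univariate formula translates directly into code. This is only a sketch; the function name `univariate_frechet_distance` is made up here.

```python
def univariate_frechet_distance(mu_x, sigma_x, mu_y, sigma_y):
    # Squared difference of the means plus squared difference of the standard deviations
    return (mu_x - mu_y) ** 2 + (sigma_x - sigma_y) ** 2

# Identical distributions are at distance zero
print(univariate_frechet_distance(0.0, 1.0, 0.0, 1.0))  # 0.0
# Means differ by 2 and standard deviations by 3: 2**2 + 3**2
print(univariate_frechet_distance(1.0, 2.0, 3.0, 5.0))  # 13.0
```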



**Multivariate Fréchet Distance**

**Covariance**

To find the Fréchet distance between two multivariate normal distributions, you first need to find the covariance instead of the standard deviation. The covariance, which is the multivariate version of variance (the square of standard deviation), is represented using a square matrix whose side length is equal to the number of dimensions. Since the feature vectors you will be using have 2048 values, the covariance matrix will be 2048 x 2048. But for the sake of an example, this is a covariance matrix in a two-dimensional space:

$\Sigma = \left(\begin{array}{cc} 
1 & 0\\ 
0 & 1
\end{array}\right)
$

The value at location $(i, j)$ is the covariance of dimension $i$ with dimension $j$. Since the covariance of $i$ with $j$ equals that of $j$ with $i$, the matrix is always symmetric about the diagonal. Each diagonal entry is the covariance of a dimension with itself, that is, its variance. In this example, every off-diagonal entry is zero, which means the two dimensions are uncorrelated; for a multivariate normal distribution, that also makes them independent.

The following code cell will visualize this matrix.

In [None]:
from torch.distributions import MultivariateNormal
import seaborn as sns # This is for visualization
mean = torch.Tensor([0, 0]) # Center the mean at the origin
covariance = torch.Tensor( # This matrix shows independence - there are only non-zero values on the diagonal
    [[1, 0],
     [0, 1]]
)
independent_dist = MultivariateNormal(mean, covariance)
samples = independent_dist.sample((10000,))
res = sns.jointplot(x=samples[:, 0], y=samples[:, 1], kind="kde")
plt.show()

Now, here's an example of a multivariate normal distribution with nonzero covariance between its two dimensions:

$\Sigma = \left(\begin{array}{cc} 
2 & -1\\ 
-1 & 2
\end{array}\right)
$

And see how it looks:


In [None]:
mean = torch.Tensor([0, 0])
covariance = torch.Tensor(
    [[2, -1],
     [-1, 2]]
)
covariant_dist = MultivariateNormal(mean, covariance)
samples = covariant_dist.sample((10000,))
res = sns.jointplot(x=samples[:, 0], y=samples[:, 1], kind="kde")
plt.show()
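
To put a number on how strongly the two dimensions in the second example are related, the covariance matrix can be converted to a correlation matrix by dividing each entry by the product of the two standard deviations. The sketch below uses NumPy rather than PyTorch for brevity and also checks that the matrix is a valid covariance (all eigenvalues positive).

```python
import numpy as np

cov = np.array([[2.0, -1.0],
                [-1.0, 2.0]])

# Standard deviations are the square roots of the diagonal (the variances)
std = np.sqrt(np.diag(cov))
# Correlation: cov[i, j] / (std[i] * std[j])
corr = cov / np.outer(std, std)
print(corr)  # Off-diagonal entries are -0.5

# A symmetric matrix with positive eigenvalues is positive definite
eigenvalues = np.linalg.eigvalsh(cov)
print(eigenvalues)  # Approximately [1. 3.]
```

A correlation of -0.5 means that when one dimension is above its mean, the other tends to be below its mean.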

**Formula**

The Fréchet distance between two multivariate normal distributions $X$ and $Y$ is:

$d(X, Y) = \Vert\mu_X-\mu_Y\Vert^2 + \mathrm{Tr}\left(\Sigma_X+\Sigma_Y - 2 \sqrt{\Sigma_X \Sigma_Y}\right)$
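
As a worked check of this formula, the NumPy sketch below evaluates it on two small Gaussians. Note that it does not compute a full matrix square root: it relies on the fact that, for positive semi-definite covariances, the trace of $\sqrt{\Sigma_X \Sigma_Y}$ equals the sum of the square roots of the eigenvalues of $\Sigma_X \Sigma_Y$. That substitution, and the name `frechet_distance_np`, are choices made for this illustration only.

```python
import numpy as np

def frechet_distance_np(mu_x, sigma_x, mu_y, sigma_y):
    diff = mu_x - mu_y
    # Eigenvalues of sigma_x @ sigma_y are real and non-negative for PSD inputs;
    # clip tiny negative values caused by floating-point error
    eigvals = np.linalg.eigvals(sigma_x @ sigma_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma_x) + np.trace(sigma_y) - 2.0 * trace_sqrt)

# With diagonal covariances the result is the sum of per-dimension univariate distances:
# dimension 0: (0 - 1)^2 + (1 - 3)^2 = 5, dimension 1: (0 - 2)^2 + (2 - 1)^2 = 5
mu_x, sigma_x = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
mu_y, sigma_y = np.array([1.0, 2.0]), np.diag([9.0, 1.0])
d_diag = frechet_distance_np(mu_x, sigma_x, mu_y, sigma_y)
print(d_diag)  # Approximately 10.0

# A distribution is at distance (approximately) zero from itself
sigma = np.array([[2.0, -1.0], [-1.0, 2.0]])
d_same = frechet_distance_np(mu_x, sigma, mu_x, sigma)
print(d_same)  # Approximately 0.0
```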


In [None]:
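import scipy.linalg # Import the submodule explicitly so scipy.linalg.sqrtm is available
def matrix_sqrt(x):
  '''
  Function that takes in a matrix and returns the square root of that matrix.
  Parameters:
      x: a matrix
  '''
  y = x.cpu().detach().numpy()
  y = scipy.linalg.sqrtm(y)
  # sqrtm can return a complex result with a negligible imaginary part; keep the real part
  return torch.as_tensor(y.real, dtype=x.dtype, device=x.device)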

In [None]:
def frechet_distance(mu_x, mu_y, sigma_x, sigma_y):
  '''
  Function for returning the Fréchet distance between multivariate Gaussians,
  parameterized by their means and covariance matrices.
  Parameters:
      mu_x: the mean of the first Gaussian, (n_features)
      mu_y: the mean of the second Gaussian, (n_features)
      sigma_x: the covariance matrix of the first Gaussian, (n_features, n_features)
      sigma_y: the covariance matrix of the second Gaussian, (n_features, n_features)
  '''
  return (mu_x - mu_y).dot(mu_x - mu_y) + torch.trace(sigma_x) + torch.trace(sigma_y) - 2*torch.trace(matrix_sqrt(sigma_x @ sigma_y))

### Putting it all together

In [None]:
def preprocess(img):
  img = torch.nn.functional.interpolate(img, size=(299, 299), mode='bilinear', align_corners=False)
  return img

In [None]:
import numpy as np
def get_covariance(features):
  return torch.Tensor(np.cov(features.detach().numpy(), rowvar=False))
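
The `rowvar=False` argument matters here: the feature tensor has one row per image and one column per feature, while `np.cov` by default treats each row as a variable. The following sketch, on a small made-up array, shows the resulting shapes and checks the result against the definition of the sample covariance.

```python
import numpy as np

# 4 samples (rows) with 3 features (columns); values are arbitrary
features = np.array([[1.0, 2.0, 0.0],
                     [2.0, 1.0, 1.0],
                     [3.0, 4.0, 0.0],
                     [4.0, 3.0, 1.0]])

cov = np.cov(features, rowvar=False)  # Columns are the variables
print(cov.shape)  # (3, 3)

# Same result from the definition: centered.T @ centered / (n - 1)
centered = features - features.mean(axis=0)
manual = centered.T @ centered / (len(features) - 1)
print(np.allclose(cov, manual))  # True

# Without rowvar=False, each row would be treated as a variable instead
print(np.cov(features).shape)  # (4, 4)
```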

In [None]:
# Get the features of the real and fake images using the Inception-v3 model:
fake_features_list = []
real_features_list = []

gen.eval()
n_samples = 512 # The total number of samples
batch_size = 4 # Samples per iteration

dataloader = DataLoader(
    dataset,
    batch_size=batch_size,
    shuffle=True)

cur_samples = 0
with torch.no_grad(): # You don't need to calculate gradients here, so you do this to save memory
    for real_example, _ in tqdm(dataloader, total=n_samples // batch_size): # Go by batch
        real_samples = real_example
        real_features = inception_model(real_samples.to(device)).detach().to('cpu') # Move features to CPU
        real_features_list.append(real_features)

        fake_samples = get_noise(len(real_example), z_dim, device=device) # Sample noise directly on the device
        fake_samples = preprocess(gen(fake_samples)) # Resize generated images to 299x299 for Inception-v3
        fake_features = inception_model(fake_samples.to(device)).detach().to('cpu')
        fake_features_list.append(fake_features)
        cur_samples += len(real_samples)
        if cur_samples >= n_samples: # Stop once n_samples examples have been processed
            break

In [None]:
# Combine all of the values into large tensors
fake_features_all = torch.cat(fake_features_list)
real_features_all = torch.cat(real_features_list)

In [None]:
# Calculate the Covariance and means of these real and fake features:
mu_fake = fake_features_all.mean(0)
mu_real = real_features_all.mean(0)
sigma_fake = get_covariance(fake_features_all)
sigma_real = get_covariance(real_features_all)

In [None]:
# Visualize what the pairwise multivariate distributions of the inception features look like
indices = [2, 4, 5]
fake_dist = MultivariateNormal(mu_fake[indices], sigma_fake[indices][:, indices])
fake_samples = fake_dist.sample((5000,))
real_dist = MultivariateNormal(mu_real[indices], sigma_real[indices][:, indices])
real_samples = real_dist.sample((5000,))

import pandas as pd
df_fake = pd.DataFrame(fake_samples.numpy(), columns=indices)
df_real = pd.DataFrame(real_samples.numpy(), columns=indices)
df_fake["is_real"] = "no"
df_real["is_real"] = "yes"
df = pd.concat([df_fake, df_real])
sns.pairplot(data = df, plot_kws={'alpha': 0.1}, hue='is_real')
plt.show()

In [None]:
# Calculate the FID and evaluate your GAN
with torch.no_grad():
    print(frechet_distance(mu_real, mu_fake, sigma_real, sigma_fake).item())