# Homework 3: Generative Adversarial Networks
---
Here, you will study one of the most popular approaches to generative modeling. We will consider a toy problem and a simplified model since state-of-the-art generative models take weeks to converge on multiple GPUs. This will make the final results not that impressive, but still instructive.

For this assignment, it is advised to use a **GPU** accelerator.

In [37]:
# Uncomment and run if in Colab
!mkdir datasets
!gdown --id 1LPYTu85QYYe_d1IS0l0v3fKzG3gjXrwC -O datasets/flowers-17.tar.gz
!tar -xzf datasets/flowers-17.tar.gz -C datasets
!rm datasets/flowers-17.tar.gz
!gdown --id 1kpL8fGK2AkgCJmMcWklIuP8A8B8xCAC2
!tar -xzf part2_gans.tar.gz
!rm part2_gans.tar.gz

!pip install pytorch_lightning
!pip install scipy

mkdir: cannot create directory ‘datasets’: File exists
Downloading...
From: https://drive.google.com/uc?id=1LPYTu85QYYe_d1IS0l0v3fKzG3gjXrwC
To: /content/datasets/flowers-17.tar.gz
100% 1.71M/1.71M [00:00<00:00, 134MB/s]
Downloading...
From: https://drive.google.com/uc?id=1kpL8fGK2AkgCJmMcWklIuP8A8B8xCAC2
To: /content/part2_gans.tar.gz
100% 410k/410k [00:00<00:00, 109MB/s]


In [38]:
# Determine the locations of auxiliary libraries and datasets.
# `AUX_DATA_ROOT` is where 'tiny-imagenet-2022.zip' is.

# Detect if we are in Google Colaboratory
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

from pathlib import Path
if IN_COLAB:
    google.colab.drive.mount("/content/drive")
    
    # Change this if you created the shortcut in a different location
    AUX_DATA_ROOT = Path("/content/drive/MyDrive/DL_HW3")
    
    assert AUX_DATA_ROOT.is_dir(), "Did you forget to 'Add a shortcut to Drive'?"
    
    import sys
    sys.path.append(str(AUX_DATA_ROOT))
    sys.path.append('/content/drive/MyDrive/DL_HW3/part2_gans')
else:
    AUX_DATA_ROOT = Path(".")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Datasets

We will use a pre-processed [17 category Flowers dataset](https://www.robots.ox.ac.uk/~vgg/data/flowers/17/). Below are a few samples of the original images. We will use a processed version: images that are center-cropped to a square and resized to 64×64 pixels.

<img src="https://i.imgur.com/OYQd8JY.jpg"/>

## Assignments and grading


- **Part 1. Code**: fill in the empty gaps (marked with `#TODO`) in the code of the assignment (36 points):
    - `model.py` -- 26 points
    - `loss.py` -- 8 points
    - `train.py` -- 2 points
- **Part 2. Train and benchmark** the performance of the required models (7 points):
    - All 3 checkpoints are provided -- 3 points
    - All 4 variants are evaluated -- 4 points
- **Part 3. Report** your findings (9 points)
    - Each task -- 3 points

- **Total score**: 50 points.

The grading policy is the same as in the semantic segmentation task. It is provided below:

For detailed grading of each coding assignment, please refer to the comments inside the files. Please use the materials provided during the seminar and the lecture for the coding part, as this will help you further familiarize yourself with PyTorch. Copy-pasting code from Google Search will be penalized.

In part 2 of this task, you should upload all your pre-trained checkpoints to your personal Google Drive, grant public access and provide a file ID, following the instructions in the notebook.

Note that for each task in part 3 to count towards your final grade, you should complete the corresponding tasks in part 2.

For example, if you are asked to compare Model X and Model Y, you should provide the checkpoints for these models in your submission trained for the required number of epochs.

## Part 1. Code

### `model.py`
**TODO: implement generator and discriminator models.**

We will use the DCGAN architecture as a base, but with a few key modifications which significantly improve the quality of the results.

<img src="https://i.imgur.com/h4ubSt9.png"/>

#### 1. Each block is a pre-activation residual block.

<img src="https://i.imgur.com/CqQM9mO.jpg"/>

It has a much better gradient flow compared to the standard residual block and is now fairly common in generative models. If needed, upsampling is performed at the start of the block (before branching), and downsampling is performed at the end (after residual sum).
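The block described above can be sketched roughly as follows. This is a minimal illustration, not the reference solution: the class name `PreActResBlock`, the use of plain `BatchNorm2d`, nearest-neighbor upsampling, average-pool downsampling and the 1x1 skip convolution are all assumptions made for the example.

```python
import torch
from torch import nn
import torch.nn.functional as F


class PreActResBlock(nn.Module):
    """Pre-activation residual block: (BN -> ReLU -> Conv) twice, plus a skip path."""

    def __init__(self, in_channels, out_channels, upsample=False, downsample=False):
        super().__init__()
        self.upsample = upsample
        self.downsample = downsample
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        # A 1x1 convolution matches channel counts on the skip path when needed.
        if in_channels == out_channels:
            self.skip = nn.Identity()
        else:
            self.skip = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        if self.upsample:
            # Upsampling happens before branching, so both paths see the larger map.
            x = F.interpolate(x, scale_factor=2, mode="nearest")
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        out = out + self.skip(x)
        if self.downsample:
            # Downsampling happens after the residual sum.
            out = F.avg_pool2d(out, kernel_size=2)
        return out
```

Normalization and activation come before each convolution, so the skip path carries the input to the sum without passing through any nonlinearity.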

#### 2. We condition on the noise vector multiple times throughout the network.

One of the most popular ways of doing that is via adaptive batch normalization:

$$
    x = \frac{ x - \mu }{ \sigma } \gamma + \beta,\quad \gamma = f(z),\ \beta = g(z)
$$

The first part of this operation is a standard batch normalization, but instead of optimizing $\gamma$ and $\beta$ as a vector, we optimize functions $f$ and $g$, which predict affine parameters from a noise vector $z$. Typically these functions are simple linear mappings.
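A minimal sketch of such a layer, assuming $f$ and $g$ are single linear layers and the conditioning vector has size `embed_dim` (the class name and argument names are hypothetical, not taken from `model.py`):

```python
import torch
from torch import nn


class AdaptiveBatchNorm2d(nn.Module):
    """Batch norm whose affine parameters are predicted from a conditioning vector."""

    def __init__(self, num_features, embed_dim):
        super().__init__()
        # affine=False: gamma and beta are predicted, not stored as parameters.
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.gamma = nn.Linear(embed_dim, num_features)  # plays the role of f(z)
        self.beta = nn.Linear(embed_dim, num_features)   # plays the role of g(z)

    def forward(self, x, z):
        # Reshape (batch, channels) -> (batch, channels, 1, 1) to broadcast over H and W.
        gamma = self.gamma(z)[:, :, None, None]
        beta = self.beta(z)[:, :, None, None]
        return self.bn(x) * gamma + beta
```

Each sample in the batch gets its own scale and shift, which is what lets the noise vector influence every block it is fed to.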

#### 3. We condition both generation and discrimination on classes.

If our data is labeled with classes, we can use these to boost the performance of our GANs. The conditioning of the generator is straightforward: we simply train embeddings for each available class, and concatenate them with noise to use as inputs to the network and its adaptive batch normalization layers:

<img src="https://i.imgur.com/VFiaU6N.jpg"/>

Therefore, our model produces its outputs using not only the noise vector $z$ but also a class $k$; for each class we train an embedding vector $c_k$.
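As a small illustration of this conditioning input, the snippet below builds the concatenated vector for a batch. The sizes `noise_dim` and `embed_dim` are arbitrary values chosen for the example; only the 17 classes come from the dataset.

```python
import torch
from torch import nn

num_classes, noise_dim, embed_dim = 17, 128, 64  # noise_dim and embed_dim are illustrative
class_embeddings = nn.Embedding(num_classes, embed_dim)  # row k is the embedding c_k

z = torch.randn(8, noise_dim)                # one noise vector per sample
k = torch.randint(0, num_classes, (8,))      # one class index per sample
# The concatenation [z, c_k] feeds the network and its adaptive batch norm layers.
cond = torch.cat([z, class_embeddings(k)], dim=1)
```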

For the discriminator, one of the most popular ways of conditioning on a class label is by using the so-called "projection":

<img src="https://i.imgur.com/jCwkb5R.png"/>

In this scheme, $\phi$ denotes the convolutional part of the discriminator, which outputs a vector; $\psi$ is a linear layer which maps a vector to a single scalar; $y$ is a trainable class embedding. The output of this projection layer is fed into an adversarial loss. This layer allows the discriminator to learn whether or not a synthesized image belongs to the class which we input into the generator.
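A minimal sketch of a projection head, assuming the scheme follows the common formulation in which the inner product of the class embedding $y$ and the feature vector $\phi(x)$ is added to $\psi(\phi(x))$. The class name `ProjectionHead` is hypothetical, and the embedding size is tied to the feature size here so that the inner product is defined.

```python
import torch
from torch import nn


class ProjectionHead(nn.Module):
    """Computes psi(phi(x)) + <y, phi(x)> for a batch of feature vectors."""

    def __init__(self, feature_dim, num_classes):
        super().__init__()
        self.psi = nn.Linear(feature_dim, 1)
        self.embed = nn.Embedding(num_classes, feature_dim)  # rows are the embeddings y

    def forward(self, features, labels):
        # features: phi(x) with shape (batch, feature_dim); labels: (batch,)
        unconditional = self.psi(features).squeeze(1)
        projection = (self.embed(labels) * features).sum(dim=1)
        return unconditional + projection
```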

To sum up, generator class embeddings $c$ and discriminator embeddings $y$ are vectors from trainable matrices of the shape $\text{number of classes} \times \text{dimensionality of the embeddings}$, corresponding to the class $k$ which we condition our sample on. In our case, these matrices will be different for generator and discriminator and will have different embedding dimensionalities.

### `loss.py`
**TODO: implement train and validation losses.**

#### Training

There are multiple ways to train generative adversarial networks. We will try out two of them, in the order in which they were historically introduced.

#### 1. Non-saturating GAN

$$
    \mathcal{L}_D = - \mathbb{E}_{x\sim p_\text{real}} [ \log D(x) ] - \mathbb{E}_{z\sim \mathcal{N}(0, \mathbb{I})} [ \log(1 - D(G(z))) ]
$$

$$
    \mathcal{L}_G = - \mathbb{E}_{z\sim \mathcal{N}(0, \mathbb{I})} [\log D(G(z))]
$$

It corresponds to using a standard binary cross-entropy loss for $D$, and BCE with fake data treated as real data for $G$.
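The two losses above can be sketched with PyTorch's built-in BCE, assuming the discriminator returns raw logits so that $D(x)$ is the sigmoid of its output (the function names are hypothetical):

```python
import torch
import torch.nn.functional as F


def non_saturating_d_loss(real_logits, fake_logits):
    # Real samples get target 1, generated samples get target 0.
    real_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    fake_loss = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    return real_loss + fake_loss


def non_saturating_g_loss(fake_logits):
    # The generator treats its own samples as real: target 1 gives -log D(G(z)).
    return F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
```

Working on logits rather than probabilities avoids taking the logarithm of values that have already been rounded to 0 or 1.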

#### 2. Hinge Loss GAN

$$
    \mathcal{L}_D = -\mathbb{E}_{x\sim p_\text{real}} \big[ \min\big(0, -1 + D(x) \big) \big] - \mathbb{E}_{z\sim \mathcal{N}(0, \mathbb{I})} \big[ \min\big(0, -1 - D(G(z)) \big) \big]
$$

$$
    \mathcal{L}_G = - \mathbb{E}_{z\sim \mathcal{N}(0, \mathbb{I})} D(G(z))
$$

This objective is derived from the hinge loss (used, for example, as an objective in SVMs). Arguably, it has the best gradient flow, and it is now a go-to objective for GAN training.
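Since $-\min(0, -1 + a) = \max(0, 1 - a)$, the hinge objective can be written with a ReLU. The sketch below assumes the discriminator outputs unbounded real-valued scores; the function names are hypothetical.

```python
import torch
import torch.nn.functional as F


def hinge_d_loss(real_scores, fake_scores):
    # Penalize real scores below +1 and fake scores above -1.
    return F.relu(1.0 - real_scores).mean() + F.relu(1.0 + fake_scores).mean()


def hinge_g_loss(fake_scores):
    # The generator simply pushes the scores of its samples up.
    return -fake_scores.mean()
```

For example, `hinge_d_loss(torch.tensor([2.0, 0.0]), torch.tensor([-2.0, 0.0]))` evaluates to 1.0: the confidently classified samples contribute nothing, and each sample scored 0 contributes a margin violation of 1, averaged over two samples per term.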

#### Validation

For validation, we will use two main metrics: **Frechet Inception Distance (FID)** and **Inception Score (IS)**.

They are both calculated using the outputs of an **Inception v3** network (hence "inception" in their names), although any other pre-trained classification network can also be used in the same way to obtain similar metrics.

#### 1. Frechet Inception Distance

This metric is calculated using a feature vector right after global average pooling before the final classification head. The feature vector can be treated as a multi-dimensional random variable with some distribution. This distribution will be different, if we evaluate these features using real images from the dataset, or images generated using our generative models. The general idea behind **FID** is to try and approximate the difference between these two distributions and use it as a quality metric (the lower it is, the better).

To do that, we approximate each of these two distributions with a multivariate Gaussian. This requires calculating the mean vector $\mu$ and the covariance matrix $\Sigma$ from either samples from the dataset ($\mu_r$, $\Sigma_r$) or generated samples ($\mu_g$, $\Sigma_g$). Note that these are full covariance matrices.

Then, **FID** is calculated as the squared Frechet distance (also known as the 2-Wasserstein distance) between these two Gaussians:

$$
    \text{FID} = ||\mu_r - \mu_g||^2 + \text{tr}\,\big(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2}\big)
$$
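The formula above can be sketched in NumPy as follows, assuming the Inception features have already been extracted into arrays of shape `(num_samples, feature_dim)`. The function name is hypothetical, and the trace of the matrix square root is computed from eigenvalues here rather than with an explicit matrix square root routine.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Squared Frechet distance between Gaussians fitted to two feature sets."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    # tr((Sigma_r Sigma_g)^{1/2}) is the sum of the square roots of the eigenvalues
    # of Sigma_r Sigma_g, which are real and non-negative in exact arithmetic.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)
```

As a sanity check, two identical feature sets give a distance of (numerically) zero, and shifting every feature by a constant vector $c$ leaves the covariances unchanged and gives $\|c\|^2$.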

#### 2. Inception Score

For this metric, we will need the outputs of the classification head, which we should convert to class probabilities via a softmax. 

To calculate it, we will only use generated data, and try to evaluate two qualities: their "objectiveness", and the diversity.

For the "objectiveness" metric, we can look at the distribution of the class probabilities and check whether or not it has a peak. Here, we assume that the outputs of our generative models should represent objects with a structure similar to those in the ImageNet dataset. This would be a bad assumption if we generate X-ray or other medical images, but it's actually fairly true for natural images, thanks to the diversity of ImageNet. If our model generates a smeared blob of artifacts, it is unlikely to be classified as some object by an ImageNet classifier.

A good measure to determine whether the distribution is "peaky" is entropy. It is lowest if the predicted probability vector is one-hot, and highest if it is uniform across all classes.

For "diversity", we are going to use the same idea: our samples are diverse, if their averaged class probability distribution is uniform.

Combining these two measurements, we can come up with the following objective:

$$
    \text{IS} = \exp \Bigg[ \mathbb{E}_{\hat{x}\sim p(\hat{x})}\ \text{KL} \big( p(y \mid \hat{x})\ \big\|\ p(y) \big) \Bigg] = \exp \Bigg[ \mathbb{E}_{\hat{x}\sim p(\hat{x})} \sum_{k=1}^{K}\ p(y_k \mid \hat{x}) \log \bigg[ \frac{ p(y_k \mid \hat{x} ) }{ p(y_k) } \bigg] \Bigg]
$$

where $K = 1000$ for ImageNet-pretrained networks.
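A minimal NumPy sketch of this formula, assuming `probs` already holds the softmax outputs for the generated images with shape `(num_samples, K)`. The function name and the small `eps` added inside the logarithms are choices made for the example.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """exp of the mean KL divergence between p(y | x) and the marginal p(y)."""
    marginal = probs.mean(axis=0, keepdims=True)  # p(y), averaged over samples
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))
```

The two extremes match the intuition above: if every prediction is uniform, the score is 1; if predictions are one-hot and spread evenly over all $K$ classes, the score is $K$.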

For more details about derivation and applicability, you can refer to this [link](https://medium.com/octavian-ai/a-simple-explanation-of-the-inception-score-372dff6a8c7a).

### `train.py`

Here you will need to write a training step for GANs (alternating gradient descent, where we first update the generator, and then the discriminator), and also implement a neat feature called the "truncation trick".
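The alternating scheme can be illustrated on a toy problem in plain PyTorch. Everything here is an assumption made for the sketch: the linear "generator" and "discriminator", the hinge loss, and the optimizer settings. The Lightning module in `train.py` organizes its optimizers differently.

```python
import torch
from torch import nn

torch.manual_seed(0)
gen = nn.Linear(4, 2)  # toy generator: noise -> sample
dis = nn.Linear(2, 1)  # toy discriminator: sample -> score
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(dis.parameters(), lr=1e-3)


def training_step(real):
    noise = torch.randn(real.shape[0], 4)

    # 1. Generator update: gradients flow through dis, but only gen is stepped.
    opt_g.zero_grad()
    g_loss = -dis(gen(noise)).mean()
    g_loss.backward()
    opt_g.step()

    # 2. Discriminator update: zero_grad clears what step 1 left in dis,
    #    and detach() keeps this loss from reaching the generator.
    opt_d.zero_grad()
    fake = gen(noise).detach()
    d_loss = torch.relu(1 - dis(real)).mean() + torch.relu(1 + dis(fake)).mean()
    d_loss.backward()
    opt_d.step()
    return g_loss.item(), d_loss.item()
```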

There are multiple ways to improve test-time performance of trained GANs (i.e., obtain better samples). Some are more complicated, like [usage of Langevin dynamics](https://arxiv.org/abs/2003.06060) for sampling, some are much simpler, like [rejection sampling](https://arxiv.org/abs/1810.06758). We will consider the simplest, yet one of the most effective and universally used approaches: [truncation trick](https://paperswithcode.com/method/truncation-trick).

The idea is based on an observation that if, instead of $\mathcal{N}(0, \mathbb{I})$, we sample from a truncated normal distribution, the results that we get will have a better visual quality. You will have to implement sampling from a truncated normal distribution and use it during evaluation.
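One simple way to sample from a truncated normal is to resample every coordinate that falls outside the allowed interval. The sketch below assumes a symmetric interval `[-threshold, threshold]`; the function name and the default threshold are illustrative, not values prescribed by the assignment.

```python
import torch


def sample_truncated_normal(shape, threshold=1.0, generator=None):
    """Sample N(0, I) with every coordinate restricted to [-threshold, threshold]."""
    z = torch.randn(shape, generator=generator)
    out_of_range = z.abs() > threshold
    # Resample the offending coordinates until all of them fall inside the interval.
    while out_of_range.any():
        z[out_of_range] = torch.randn(int(out_of_range.sum()), generator=generator)
        out_of_range = z.abs() > threshold
    return z
```

A smaller threshold keeps the noise closer to the mode of the distribution, which trades sample diversity for typicality.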

## Part 2. Train and evaluate

You will have to train and evaluate the following variants for the generative model:

1. Non-class conditional setting: non-saturating GAN and hinge loss GAN
2. Class conditional hinge loss GAN
3. Evaluate class conditional hinge loss GAN with truncation trick

For training, use the code example below, with the provided number of epochs. For evaluation use `GANValLoss` class that you have implemented. You need to obtain **FID** and **IS** values for all the 4 required experiments.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import pytorch_lightning as pl
from part2_gans.train import GAN



def train(model, experiment_name, use_gpu):
    assignment_dir = 'part2_gans'

    logger = pl.loggers.TensorBoardLogger(save_dir=f'{assignment_dir}/logs', name=experiment_name)

    trainer = pl.Trainer(
        max_epochs=150, 
        gpus=1 if use_gpu else None, 
        benchmark=True, 
        logger=logger) 
    
    trainer.fit(model)

In [None]:
use_gpu = True

In [None]:
import torch

model = GAN(
    loss_type='non_saturating',
    class_conditional=False,
    truncation_trick=False, 
    data_path='datasets/flowers-17')

train(model, 'non_saturating', use_gpu=use_gpu)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
  f"The `LightningModule.{hook}` hook was deprecated in v1.6 and"
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name | Type          | Params
---------------------------------------
0 | gen  | Generator     | 3.6 M 
1 | dis  | Discriminator | 4.9 M 
2 | loss | GANLoss       | 0     
---------------------------------------
8.5 M     Trainable params
0         Non-trainable params
8.5 M     Total params
33.885    Total estimated model params size (MB)
  cpuset_checked))


Training: 0it [00:00, ?it/s]

  rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown...")


In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=False,
    truncation_trick=False, 
    data_path='datasets/flowers-17')

train(model, 'hinge', use_gpu=use_gpu)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
  f"The `LightningModule.{hook}` hook was deprecated in v1.6 and"
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name | Type          | Params
---------------------------------------
0 | gen  | Generator     | 3.6 M 
1 | dis  | Discriminator | 4.9 M 
2 | loss | GANLoss       | 0     
---------------------------------------
8.5 M     Trainable params
0         Non-trainable params
8.5 M     Total params
33.885    Total estimated model params size (MB)
  cpuset_checked))


Training: 0it [00:00, ?it/s]

In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=True,
    truncation_trick=False, 
    data_path='datasets/flowers-17')

train(model, 'hinge_class-cond', use_gpu=use_gpu)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
  f"The `LightningModule.{hook}` hook was deprecated in v1.6 and"
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name | Type          | Params
---------------------------------------
0 | gen  | Generator     | 5.4 M 
1 | dis  | Discriminator | 4.9 M 
2 | loss | GANLoss       | 0     
---------------------------------------
10.3 M    Trainable params
0         Non-trainable params
10.3 M    Total params
41.040    Total estimated model params size (MB)
  cpuset_checked))


Training: 0it [00:00, ?it/s]

In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=True,
    truncation_trick=True, 
    data_path='datasets/flowers-17')

train(model, 'hinge_class-cond-trunc', use_gpu=use_gpu)

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
  f"The `LightningModule.{hook}` hook was deprecated in v1.6 and"
Missing logger folder: part2_gans/logs/hinge_class-cond-trunc
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name | Type          | Params
---------------------------------------
0 | gen  | Generator     | 5.4 M 
1 | dis  | Discriminator | 4.9 M 
2 | loss | GANLoss       | 0     
---------------------------------------
10.3 M    Trainable params
0         Non-trainable params
10.3 M    Total params
41.040    Total estimated model params size (MB)
  cpuset_checked))


Training: 0it [00:00, ?it/s]

In [40]:
# Copy log folder to Google Drive
!cp logs.zip '{AUX_DATA_ROOT}/logs.zip'
!ls '{AUX_DATA_ROOT}'

cp: cannot stat 'logs.zip': No such file or directory
checkpoints	hw3-GANs-Mkrtchyan-Georgy-attempt-1.ipynb  part2_gans
flowers-17.tar	logs.zip


Again, images can be viewed via TensorBoard:

In [None]:
%load_ext tensorboard
%tensorboard --logdir part2_gans/logs

Your trained weights are available in the `part2_gans/logs/{experiment_name}/version_{n}/checkpoints` folder. Upload them to your personal Google Drive folder. Provide file ids and checksums below. Use `!md5sum <PATH>` to compute the checksums.

To make sure that provided ids are correct, try running `!gdown --id <ID>` command from this notebook.
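If `md5sum` is not available (for example, when running locally on Windows), the same checksum can be computed in Python with the standard library. This is a small helper sketch; the function name `md5_checksum` is hypothetical.

```python
import hashlib


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```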

In [None]:
checkpoint_ids = {
    'non_saturating': ('1-ylh_jcz8ZcWjz_s_8yATc8FFnGgF9Qb', '0b3610881c9a5f2ad47d1240ec960d3b'), # TODO
    'hinge': ('1ZU1a3Txxwi_TmEpp4CNZ7yifIpNuThEF','591483058c1c6bbcf2df021a8e3b98cc'), # TODO
    'hinge_class-cond': ('1fXlB_yyb7k1oagOkOSJviUWvXB1NuoFF','9e663e52651b5429bbf62261e17a14cf'),
    'hinge_class-cond-trunc':('1fXlB_yyb7k1oagOkOSJviUWvXB1NuoFF','d14f50a16d65e1573d0c977947c98fa6') # TODO
}

**FID** and **IS** can be calculated like this:

In [None]:
import torch
from torchvision import utils, transforms
import glob
import os

from part2_gans.loss import ValLoss



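def load_checkpoint(model, experiment_name):
    # Pick the most recent `version_{n}` run directory of the experiment
    log_dir = f'part2_gans/logs/{experiment_name}'
    version = max(int(name.split('_')[-1]) for name in os.listdir(log_dir) if name.startswith('version_'))
    path_to_checkpoint = glob.glob(f'{log_dir}/version_{version}/checkpoints/*.ckpt')[0]
    # map_location='cpu' lets the checkpoint load on machines without a GPU
    state_dict = torch.load(path_to_checkpoint, map_location='cpu')['state_dict']
    model.load_state_dict(state_dict, strict=False)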

def calc_eval_metrics(model, device):
    dataloader = model.val_dataloader()
    
    if device == 'cuda':
        model = model.cuda()
    
    val_noise = model.val_noise

    noise_offset = 0
    
    with torch.no_grad():
        real_imgs = []
        fake_imgs = []

        for imgs, labels in dataloader:
            noise = val_noise[noise_offset : noise_offset + imgs.shape[0]]
            noise_offset += imgs.shape[0]

            if device == 'cuda':
                imgs = imgs.cuda()
                labels = labels.cuda()
                noise = noise.cuda()

            gen_imgs = model.forward(noise, labels)

            real_imgs.append(imgs)
            fake_imgs.append(gen_imgs)

        val_loss = ValLoss()

        if device == 'cuda':
            val_loss = val_loss.cuda()

        fid, inception_score = val_loss(real_imgs, fake_imgs)
    
    return fid, inception_score

def visualize_image_grid(model):
    noise = model.val_noise[:16 * model.num_classes]
    labels = torch.arange(model.num_classes).repeat_interleave(16, dim=0).to(noise.device)

    fake_imgs = model.forward(noise, labels)
    fake_imgs = fake_imgs.detach().cpu()

    grid = utils.make_grid(fake_imgs, nrow=16)
    
    return transforms.ToPILImage()(grid)

In [None]:
model = GAN(
    loss_type='non_saturating',
    class_conditional=False,
    truncation_trick=False, 
    data_path='datasets/flowers-17')
load_checkpoint(model, 'non_saturating')

fid, inception_score = calc_eval_metrics(model.eval(), 'cuda' if use_gpu else 'cpu')

print(f'FID: {fid:.2f}, IS: {inception_score:.2f}')
visualize_image_grid(model)


In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=False,
    truncation_trick=False, 
    data_path='datasets/flowers-17')
load_checkpoint(model, 'hinge')

fid, inception_score = calc_eval_metrics(model.eval(), 'cuda' if use_gpu else 'cpu')

print(f'FID: {fid:.2f}, IS: {inception_score:.2f}')
visualize_image_grid(model)


In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=True,
    truncation_trick=False, 
    data_path='datasets/flowers-17')
load_checkpoint(model, 'hinge_class-cond')

fid, inception_score = calc_eval_metrics(model.eval(), 'cuda' if use_gpu else 'cpu')

print(f'FID: {fid:.2f}, IS: {inception_score:.2f}')
visualize_image_grid(model)


In [None]:
model = GAN(
    loss_type='hinge',
    class_conditional=True,
    truncation_trick=True, 
    data_path='datasets/flowers-17')
load_checkpoint(model, 'hinge_class-cond-trunc')

fid, inception_score = calc_eval_metrics(model.eval(), 'cuda' if use_gpu else 'cpu')

print(f'FID: {fid:.2f}, IS: {inception_score:.2f}')

visualize_image_grid(model)


## Part 3. Report

In this part, you will need to analyze and compare the quality and performance of the trained models, as in the semantic segmentation homework.

### Task 1.

Compare the performance of two evaluated GAN losses both qualitatively (comparing generated images side-by-side) and quantitatively (via metrics). What objective leads to the best results?




Unlike the earlier inception score (IS), which evaluates only the distribution of generated images, the FID compares the distribution of generated images with the distribution of real images that were used to train the generator.

Model | FID | IS
--- | --- | ---
non-saturating | 170.76 | 2.43
hinge | 166.62 | 3.03
hinge class-conditioned | 122.91 | 3.19
hinge class-conditioned, truncation trick | 133.29 | 3.95

Overall, every hinge loss variant outperforms the non-saturating loss. For a fair comparison, however, we now consider only the non-saturating loss and the plain hinge loss, without class conditioning or the truncation trick.



Let us take a look at the images:


non_saturating             |  hinge
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=11O1z3ZlirIVeYbhCLKPlgO0F3rh5P7DW)|![](https://drive.google.com/uc?export=view&id=1wEXCuAhaTjNo8gm2-JoFn0ZfoscXDH0w)



non_saturating             |  hinge
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=1cqmg0cXsKq5Agd4CQk8ECGdxO0QXNkXt)|![](https://drive.google.com/uc?export=view&id=143vfmU1pYBPqccsimdtfRDiLxjdhHAQA)

The left-hand images come from the non-saturating GAN. They look more like Van Gogh paintings; more precisely, they are poorly structured compared to the hinge loss samples. The hinge loss also produces better shapes. In addition, the hinge loss samples are more diverse: the non-saturating GAN draws from a narrow color range (mostly yellow, green, and white), and its flowers appear to belong to only 2-3 species, whereas the hinge loss GAN captures at least 5-6 species in the samples shown and additionally uses purple and red.
Thus, given our architecture, the hinge loss outperforms the non-saturating loss, both visually and in the metrics above (FID 166.62 vs. 170.76, IS 3.03 vs. 2.43).

### Task 2.
Compare (qualitatively and quantitatively) class conditional and non-class conditional models. Which one has better quality and metrics? Reflect and propose an explanation, why is that so?

Firstly, the table above shows that the class-conditional hinge loss GAN (CC HL) has better metrics than the unconditional hinge loss GAN (HL): FID 122.91 vs. 166.62 and IS 3.19 vs. 3.03. That is, the CC HL GAN generates better images both when judged on their own (IS) and when compared with the distribution of real images (FID).

hinge             |  hinge class-conditioned
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=17hjennF2_M8NXIqrDiKGPm_wsdg-o2nh)|![](https://drive.google.com/uc?export=view&id=18BAlgZ9zqZuOExK39fglnSYJ1Gb35q8b)


hinge             |  hinge class-conditioned
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=143vfmU1pYBPqccsimdtfRDiLxjdhHAQA)|![](https://drive.google.com/uc?export=view&id=1f07AzaA2ujId90m11v-6tqKtweeQLaLG)


As for image quality, the class-conditional hinge loss GAN gives slightly better results than the plain hinge loss GAN, since additional information is provided in the form of labels, but overall the two approaches produce images of similar quality. The most important difference is that the CC HL GAN suffered from mode collapse, i.e., it produces nearly the same picture for every sample with the same label. A likely explanation is that once the generator finds an image that works well for a given label, it reuses it for all other inputs with that label. This problem might be mitigated with regularization or a different loss; one of the main suggestions in the community is the Wasserstein loss. (No such adjustments were made here, since grading requires an exact implementation of the provided losses.)

### Task 3.
Do the same comparison with and without truncation trick. Explain, what changes when this trick is applied, how it affects the results and their quality? Try to explain, why exactly truncation trick works this way?

*Quantitative comparison:*

FID: CC HL = 122.91 vs. CC HL with truncation = 133.29. IS: CC HL = 3.19 vs. CC HL with truncation = 3.95.

As we can see, the truncation trick degrades the GAN's performance relative to the distribution of real images (higher FID) but improves the quality of the samples judged on their own (higher IS).
A possible explanation is that we draw noise not from the whole range of the distribution but only from part of it, which decreases the variability of the model's own output distribution. However, this may also have reduced the generator's ability to cover important cases of images, hence the lower performance relative to the real image distribution.

*Qualitative comparison:*
Generally, the images from both models are quite similar, with no major differences between them. However, some images from the model with the truncation trick are messy and slightly less structured, whereas others are better structured.
To sum up, there is no major visible difference (at least to my eyes) between the results of the two models.

hinge class-conditioned      |  hinge class-conditioned with truncation
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=18BAlgZ9zqZuOExK39fglnSYJ1Gb35q8b)|![](https://drive.google.com/uc?export=view&id=1mf4dS3VgD4OFHk_rlhZDIYhIWv1DOn-X)


hinge class-conditioned      |  hinge class-conditioned with truncation
:-------------------------:|:-------------------------:
![](https://drive.google.com/uc?export=view&id=1f07AzaA2ujId90m11v-6tqKtweeQLaLG)|![](https://drive.google.com/uc?export=view&id=1jmolL_6mr0RVcZsEK4Nrur03NbHdiL3I)
