# GANs

Large image datasets and abundant computational resources have driven the rise of deep learning, which has in turn transformed image generation. In 2014, Ian Goodfellow and his colleagues introduced Generative Adversarial Networks (GANs), which have since become a cornerstone of computer vision.

At their core, GANs consist of two neural networks, a generator and a discriminator, trained against each other. The generator tries to produce synthetic data that is indistinguishable from real data, while the discriminator tries to classify each input as real or generated (see the figure below).

<img src="figures/gans.png" style="height: 500px;"/>

[source](https://www.analyticsvidhya.com/blog/2021/04/lets-talk-about-gans/)

The main idea is that the generator tries to fool the discriminator by presenting it with generated samples, while the discriminator tries to tell them apart from real training samples. To succeed, the generator has to learn the probability distribution of the training data. Training improves both networks, which have competing objectives, hence the term *adversarial*.

The discriminator is a binary classifier, so its loss is a binary cross-entropy.
We write:
- $z$ the input noise
- $x$ a training sample
- $G(z)$ the image output by the generator
- $D(x)$ the probability, output by the discriminator, that the image $x$ is real

This gives the objective that the discriminator maximizes during training:
$$
\log D(x) + \log(1 - D(G(z)))
$$
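
To make the link with binary cross-entropy concrete, here is a minimal sketch in plain Python. The helper names `bce` and `discriminator_objective` and the probability values are hypothetical, chosen only for illustration. It assumes real samples are labeled 1, generated samples are labeled 0, and both batches have the same size, in which case the objective above equals minus twice the mean cross-entropy.

```python
import math

def bce(labels, probs):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(labels, probs)) / len(labels)

def discriminator_objective(d_real, d_fake):
    """Batch estimate of log D(x) + log(1 - D(G(z)))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator outputs, for illustration only.
d_real = [0.9, 0.8]   # D(x) on two real samples
d_fake = [0.3, 0.1]   # D(G(z)) on two generated samples

objective = discriminator_objective(d_real, d_fake)
loss = bce([1, 1, 0, 0], d_real + d_fake)

# For equal-sized batches the two values agree up to floating-point rounding.
print(objective, -2 * loss)
```

Maximizing the objective is therefore the same as minimizing the cross-entropy of a classifier that separates real from generated samples.
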
The generator, in contrast, aims to fool the discriminator: it tries to minimize $\log(1-D(G(z)))$.
Combining the two gives the adversarial objective, which covers the training of both the generator and the discriminator:
$$
\operatorname*{min}_{G}\operatorname*{max}_{D}V(D,G)=\mathbb{E}_{x\sim p_{\mathrm{data}}}[\log D(x)]+\mathbb{E}_{z\sim p_{z}}[\log(1-D(G(z)))].
$$
In practice, training alternates between the two networks: the discriminator is updated by gradient ascent for $k$ steps, then the generator is updated once by gradient descent. Moreover, the generator is usually trained by maximizing $\log D(G(z))$ instead, which provides stronger gradients early in training, when $\log(1 - D(G(z)))$ saturates because $G(z)$ is still of poor quality.
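
The alternating procedure can be sketched on a one-dimensional toy problem in plain Python. Everything here is an assumption made for illustration and is not the SinGAN code: the generator is a shift $G(z) = z + b$, the discriminator is a logistic model $D(x) = \sigma(wx + c)$, real data follow $\mathcal{N}(3, 1)$, and the names `discriminator_step`, `generator_step` and `train`, as well as the batch size and learning rate, are hypothetical. Gradients are written by hand, and the generator uses the non-saturating objective $\log D(G(z))$.

```python
import math
import random

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def discriminator_step(w, c, b, rng, batch=64, lr=0.05, real_mean=3.0):
    """One gradient-ascent step on log D(x) + log(1 - D(G(z)))."""
    grad_w = grad_c = 0.0
    for _ in range(batch):
        x = rng.gauss(real_mean, 1.0)   # real sample
        g = rng.gauss(0.0, 1.0) + b     # generated sample G(z) = z + b
        d_real = sigmoid(w * x + c)
        d_fake = sigmoid(w * g + c)
        # With logit a: d/da log(sigmoid(a)) = 1 - sigmoid(a)
        #               d/da log(1 - sigmoid(a)) = -sigmoid(a)
        grad_w += (1.0 - d_real) * x - d_fake * g
        grad_c += (1.0 - d_real) - d_fake
    return w + lr * grad_w / batch, c + lr * grad_c / batch

def generator_step(w, c, b, rng, batch=64, lr=0.05):
    """One gradient-ascent step on the non-saturating objective log D(G(z))."""
    grad_b = 0.0
    for _ in range(batch):
        g = rng.gauss(0.0, 1.0) + b
        # Chain rule through the logit a = w * (z + b) + c
        grad_b += (1.0 - sigmoid(w * g + c)) * w
    return b + lr * grad_b / batch

def train(iterations=200, k=3, seed=0):
    """Alternate k discriminator steps with one generator step."""
    rng = random.Random(seed)
    w = c = b = 0.0
    for _ in range(iterations):
        for _ in range(k):
            w, c = discriminator_step(w, c, b, rng)
        b = generator_step(w, c, b, rng)
    return w, c, b

w, c, b = train()
print(f"w={w:.3f}, c={c:.3f}, generator shift b={b:.3f}")
```

With respect to the discriminator's logit, the gradient of $\log D(G(z))$ is $1 - D(G(z))$, which stays close to 1 when the discriminator confidently rejects generated samples, whereas the gradient of $\log(1 - D(G(z)))$ is $-D(G(z))$, which is close to 0 in the same situation. Note that such a simple setup is not guaranteed to converge: the shift $b$ may oscillate around the real mean rather than settle on it.
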

In [1]:
# Clone the SinGAN repository, removing any previous copy first.
# `!` shell commands do not raise Python exceptions, so no try/except is needed.
!rm -r ./SinGAN
!git clone https://github.com/eustlb/SinGAN.git

rm: ./SinGAN: No such file or directory
Cloning into 'SinGAN'...
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 55 (delta 9), reused 52 (delta 6), pack-reused 0
Receiving objects: 100% (55/55), 4.53 MiB | 24.57 MiB/s, done.
Resolving deltas: 100% (9/9), done.
