# Generative Deep Learning

## Summary

- Neural Style Transfer
- Generative Adversarial Networks (GAN)

## Neural Style Transfer

### Neural Style Transfer in a nutshell

- Reproduce an image with a new artistic style provided by another image.
- Blend a *content* image and a *style reference* image in a stylized output image.
- First described in [A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576) by Gatys et al. (2015). Many refinements and variations have followed.

### Example (Prisma app)

[![Prisma style transfer example](images/style_transfer_prisma.png)](https://harishnarayanan.org/writing/artistic-style-transfer/)

### Underlying idea: loss minimization

![Content loss](images/content-loss.png)
![Style loss](images/style-loss.png)
![Total loss](images/total-loss.png)

### The content loss

- Content = high-level structure of an image.
- Can be captured by the upper layers of a convolutional neural network.
- Content loss for a layer = distance between the feature maps of the content and generated images.
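
As a rough illustration of the last point, the sketch below computes a content loss with NumPy. It assumes the feature maps have already been extracted from some layer and are given as arrays of shape `(height, width, channels)`. The squared Euclidean distance is one common choice of "distance", assumed here. The function and variable names are hypothetical.

```python
import numpy as np

def content_loss(content_features, generated_features):
    """Squared Euclidean distance between two feature maps of equal shape."""
    content = np.asarray(content_features, dtype=float)
    generated = np.asarray(generated_features, dtype=float)
    return float(np.sum(np.square(generated - content)))

# Stand-in feature maps (random values, not real convnet activations).
rng = np.random.default_rng(0)
content_features = rng.normal(size=(4, 4, 8))
generated_features = content_features + 0.1  # small uniform deviation

print(content_loss(content_features, content_features))    # identical maps
print(content_loss(content_features, generated_features))  # grows with the deviation
```

The loss is zero when both images produce identical activations and grows as the generated image's high-level structure drifts from the content image.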

### The style loss

- Style = low-level features of an image (textures, colors, visual patterns).
- Can be captured by using correlations across the different feature maps (filter responses) of a convnet.
- Feature correlations are computed via a Gram matrix (inner products between the flattened feature maps of a given layer).
- Style loss for a layer = distance between the Gram matrices of the feature maps for the style and generated images.
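
The following NumPy sketch shows one way to compute a Gram matrix and a per-layer style loss. It assumes feature maps of shape `(height, width, channels)` and uses the normalization factor from Gatys et al. The names `gram_matrix` and `style_loss` are illustrative, not taken from a specific library.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel inner products of a (height, width, channels) feature map."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)  # one row per spatial position
    return flat.T @ flat               # shape: (channels, channels)

def style_loss(style_features, generated_features):
    """Squared distance between Gram matrices, normalized as in Gatys et al."""
    h, w, c = style_features.shape
    diff = gram_matrix(style_features) - gram_matrix(generated_features)
    return float(np.sum(np.square(diff)) / (4.0 * (c ** 2) * ((h * w) ** 2)))

# Stand-in feature map (random values, not real convnet activations).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 3))

# Shuffling spatial positions changes the layout but not the channel correlations.
shuffled = features.reshape(16, 3)[rng.permutation(16)].reshape(4, 4, 3)
print(np.allclose(gram_matrix(features), gram_matrix(shuffled)))
```

Because the Gram matrix sums over all spatial positions, it discards where a pattern occurs and keeps only which filters respond together, which is why it suits texture-like style information.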

### The total variation loss

- Sum of the absolute differences between neighboring pixel values in an image. Measures how much noise is in the image.
- Encourages spatial continuity in the generated image (denoising).
- Acts as a regularization loss.
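
A minimal NumPy sketch of this definition, assuming the image is an array whose first two axes are height and width. The function name is hypothetical, and real implementations may use a smoothed variant of the absolute difference.

```python
import numpy as np

def total_variation_loss(image):
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    image = np.asarray(image, dtype=float)
    vertical = np.abs(image[1:, :] - image[:-1, :])
    horizontal = np.abs(image[:, 1:] - image[:, :-1])
    return float(vertical.sum() + horizontal.sum())

flat = np.full((4, 4), 0.5)                        # perfectly smooth image
checkerboard = np.indices((4, 4)).sum(axis=0) % 2  # maximally "noisy" 0/1 pattern

print(total_variation_loss(flat))          # 0.0
print(total_variation_loss(checkerboard))  # 24.0
```

The smooth image scores zero, while the checkerboard is penalized for every pair of neighboring pixels, which is the behavior that pushes the optimizer toward spatially coherent outputs.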

### Gradient descent

- Objective: minimize the total loss.
- Optimizer: [L-BFGS](http://aria42.com/blog/2014/12/understanding-lbfgs) (original choice made by Gatys et al.) or Adam.
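
Written out, the objective can be sketched as a weighted sum of the three losses described in the previous sections. The weights $\alpha$, $\beta$ and $\gamma$ and the learning rate $\eta$ are notation assumed here for illustration. Note that the variable being optimized is the generated image $x$ itself, not the weights of the convnet.

$$
\mathcal{L}_{\text{total}}(x) = \alpha \, \mathcal{L}_{\text{content}}(x) + \beta \, \mathcal{L}_{\text{style}}(x) + \gamma \, \mathcal{L}_{\text{tv}}(x)
$$

$$
x \leftarrow x - \eta \, \nabla_x \mathcal{L}_{\text{total}}(x)
$$

The second line is the plain gradient descent update. L-BFGS and Adam replace it with more elaborate update rules but minimize the same objective.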

![Animation of style transfer](images/style_transfer_animated.gif)

## Generative Adversarial Networks (GAN)

### GAN in a nutshell

- Simultaneously train two models:
  - One tries to generate realistic data.
  - The other tries to discriminate between real and generated data.
- Each model is trained to best the other.
- First described in [Generative Adversarial Nets](https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf) by Goodfellow et al. (2014).
- [NIPS 2016 Tutorial](https://arxiv.org/abs/1701.00160)
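
The competition between the two models is formalized in the Goodfellow et al. paper as a two-player minimax game. Here $G$ is the generator, $D$ the discriminator (outputting the probability that its input is real), $x$ a real sample and $z$ a random noise vector:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
$$

The discriminator tries to maximize $V$ by assigning high probability to real data and low probability to generated data, while the generator tries to minimize it by making $D(G(z))$ as large as possible.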


[![GAN overview](images/gan1.png)](https://www.tensorflow.org/tutorials/generative/dcgan)

[![GAN process](images/gan2.png)](https://www.tensorflow.org/tutorials/generative/dcgan)

### Training process

- The generator creates images from random noise.
- Generated images are mixed with real ones.
- The discriminator is trained on these mixed images.
- The generator's parameters are updated in a direction that makes the discriminator more likely to classify generated data as "real".
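
The losses driving these steps can be sketched with binary cross-entropy, in the spirit of the TensorFlow DCGAN tutorial. The NumPy version below is a simplified illustration: it assumes the discriminator outputs probabilities in (0, 1), and the function names are hypothetical.

```python
import numpy as np

def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(labels * np.log(probs)
                          + (1.0 - labels) * np.log(1.0 - probs)))

def discriminator_loss(real_probs, fake_probs):
    """The discriminator should output 1 for real data and 0 for generated data."""
    real_loss = binary_cross_entropy(np.ones_like(real_probs), real_probs)
    fake_loss = binary_cross_entropy(np.zeros_like(fake_probs), fake_probs)
    return real_loss + fake_loss

def generator_loss(fake_probs):
    """The generator is rewarded when generated data is classified as real (label 1)."""
    return binary_cross_entropy(np.ones_like(fake_probs), fake_probs)

# Discriminator outputs on a batch of generated images, in two scenarios.
fooled = np.full(8, 0.9)      # discriminator believes the fakes are real
not_fooled = np.full(8, 0.1)  # discriminator spots the fakes

print(generator_loss(fooled) < generator_loss(not_fooled))  # True
```

In an actual training loop, each loss would be backpropagated only through its own network: the discriminator loss updates the discriminator, and the generator loss updates the generator.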

### Example: generate handwritten digits

This example trains a Deep Convolutional GAN on the MNIST dataset. Check it out on the [TensorFlow website](https://www.tensorflow.org/tutorials/generative/dcgan).

### Specificities and gotchas

- A GAN is a dynamic system that evolves at each training step.
- Interestingly, the generator never sees images from the training set directly: all its information comes from the discriminator.
- Training can be tricky: noisy generated data, vanishing gradients, domination of one side...
- GAN convergence theory is an active area of research.
- [GAN Open Questions](https://distill.pub/2019/gan-open-problems/)
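
One commonly cited source of vanishing gradients is the original minimax generator loss, $\log(1 - D(G(z)))$, which saturates when the discriminator confidently rejects generated samples. Goodfellow et al. suggest maximizing $\log D(G(z))$ instead. The sketch below compares the gradients of both losses with respect to the discriminator's pre-sigmoid output (its logit). The function names and the example logit value are assumptions made for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def saturating_grad(logit):
    """Derivative of log(1 - sigmoid(logit)) with respect to the logit."""
    return -sigmoid(logit)

def non_saturating_grad(logit):
    """Derivative of -log(sigmoid(logit)) with respect to the logit."""
    return -(1.0 - sigmoid(logit))

# Early in training, the discriminator easily rejects generated samples:
# its logit on fake data is strongly negative.
confident_reject = -6.0

print(abs(saturating_grad(confident_reject)))      # close to 0: almost no learning signal
print(abs(non_saturating_grad(confident_reject)))  # close to 1: strong learning signal
```

With the saturating loss, the generator receives almost no gradient precisely when it performs worst, which is one reason the non-saturating variant is often preferred in practice.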

### GAN progress on face generation

[![GAN progress from 2014 to 2018](images/gan_2014_2018.jpg)](https://twitter.com/goodfellow_ian/status/1084973596236144640)

### The GAN landscape

[![GAN flavours](images/gan_flavours.png)](https://blog.floydhub.com/gans-story-so-far/)

### Some GAN flavours

- [DCGAN](https://arxiv.org/abs/1511.06434) (2016): use deep convolutional networks for generator and discriminator.
- [CycleGAN](https://arxiv.org/abs/1703.10593v6) (2017): image-to-image translation in the absence of any paired training examples.
- [StyleGAN](https://arxiv.org/abs/1812.04948) (2019): fine control of output images.
- [GAN - The Story So Far](https://blog.floydhub.com/gans-story-so-far/)

### GAN use cases: not just images!

- Writing a novel "in the style of an author".
- [Generating music](https://arxiv.org/abs/1805.07848) ([samples](https://www.youtube.com/watch?v=vdxCqNWTpUs)).
- Generating realistic passwords for hackers.
- Generating videos ([example](https://www.youtube.com/watch?time_continue=3&v=ab64TWzWn40&feature=emb_logo)).
- [Generating video game levels](https://arxiv.org/abs/1910.01603).
- ...