# Chapter 5: Paint

## CycleGAN

_Style transfer_ is a machine learning task in which a neural network learns to transform instances from one dataset so that they appear to have come from a second dataset, and vice versa. For example, a style transfer model might transform landscape photos to look as if they were painted in a particular artist's style.

[_Cycle-consistent adversarial networks_](https://arxiv.org/pdf/1703.10593.pdf) (CycleGANs) are a type of neural network architecture that learns style transfer between two datasets without the need for paired samples. This is in contrast to earlier style transfer networks such as [_pix2pix_](https://arxiv.org/pdf/1611.07004.pdf), which require each training sample in one dataset to be paired with a corresponding sample in the other.

### Model Overview

CycleGANs are composed of two generators and two discriminators. One generator, $G_{AB}$, takes a sample from dataset $A$ and transforms it to look like a sample from dataset $B$. The other generator, $G_{BA}$, performs the inverse transformation ($B$ to $A$). The first discriminator, $D_A$, determines whether an instance was sampled from dataset $A$ or was transformed by $G_{BA}$ to look like a sample from dataset $A$. The second discriminator, $D_B$, determines whether an instance was sampled from dataset $B$ or was transformed by $G_{AB}$.
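The routing described above can be sketched in a few lines. This is only an illustration: the `CycleGANParts` class, the `discriminator_batches` method, and the toy generators are hypothetical names invented here, and images are represented as flat lists of floats rather than real tensors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[float]  # toy stand-in for an image tensor (assumption)


@dataclass
class CycleGANParts:
    g_ab: Callable[[Image], Image]  # translates A -> B
    g_ba: Callable[[Image], Image]  # translates B -> A

    def discriminator_batches(
        self, real_a: Image, real_b: Image
    ) -> Dict[str, Dict[str, Image]]:
        """Pair each discriminator with the real and generated samples it must tell apart."""
        return {
            # D_A compares real A images against B images translated into A.
            "d_a": {"real": real_a, "fake": self.g_ba(real_b)},
            # D_B compares real B images against A images translated into B.
            "d_b": {"real": real_b, "fake": self.g_ab(real_a)},
        }


# Toy generators: "translation" is just a brightness shift.
parts = CycleGANParts(
    g_ab=lambda img: [x + 1.0 for x in img],
    g_ba=lambda img: [x - 1.0 for x in img],
)
batches = parts.discriminator_batches(real_a=[0.0, 0.5], real_b=[1.0, 1.5])
```

Note that each discriminator only ever sees fakes produced by the generator that targets its own dataset.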

### The Generators

The generators typically use either the [_U-Net_](https://arxiv.org/pdf/1505.04597.pdf) or the [_ResNet_](https://arxiv.org/pdf/1512.03385.pdf) (residual network) architecture.

The U-Net is similar to a VAE in that it has a downsampling half followed by an upsampling half, but it also contains _skip connections_: connections between downsampling and upsampling layers of the same size. It also uses a new type of layer, [instance normalization](https://arxiv.org/pdf/1607.08022.pdf), which works like batch normalization except that it normalizes each observation individually instead of across a batch. Because the mean and variance are computed per observation, instance normalization does not need to track running statistics for use at test time, and it is commonly used without learned scaling and shift parameters.
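To make the difference between the two normalization layers concrete, here is a minimal sketch in plain Python. It assumes a batch stored as nested lists shaped `[image][channel][pixel]` (spatial dimensions flattened) and omits any learned scale or shift; the function names `instance_norm` and `batch_norm` are illustrative, not a library API.

```python
from statistics import fmean, pvariance


def instance_norm(batch, eps=1e-5):
    """Normalize each channel of each image using that image's own statistics."""
    normalized = []
    for image in batch:
        out_image = []
        for channel in image:
            mean = fmean(channel)
            var = pvariance(channel, mu=mean)
            out_image.append([(x - mean) / (var + eps) ** 0.5 for x in channel])
        normalized.append(out_image)
    return normalized


def batch_norm(batch, eps=1e-5):
    """Normalize each channel using statistics pooled over the whole batch."""
    stats = []
    for c in range(len(batch[0])):
        pooled = [x for image in batch for x in image[c]]
        mean = fmean(pooled)
        stats.append((mean, pvariance(pooled, mu=mean)))
    return [
        [
            [(x - mean) / (var + eps) ** 0.5 for x in channel]
            for channel, (mean, var) in zip(image, stats)
        ]
        for image in batch
    ]


# Two single-channel "images" with the same pattern but very different brightness.
batch = [[[0.0, 1.0, 2.0, 3.0]], [[100.0, 101.0, 102.0, 103.0]]]
inst = instance_norm(batch)
bn = batch_norm(batch)
```

With instance normalization, both images end up with the same normalized values because each is centered on its own mean. With batch normalization, the dark image stays entirely below zero and the bright image entirely above it, so the output of one image depends on what else is in the batch.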

TODO ResNet

In a CycleGAN, we use three separate loss functions for the generators:

- _Validity_: do the generated images look like they were sampled from the target dataset?

- _Reconstruction_: do images passed through both generators in turn look like the original image?

- _Identity_: if we apply a generator to images from its own target dataset, are the images left unchanged?
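The three terms above can be sketched for one generator as follows. This is a toy illustration under several assumptions: images are flat lists of floats, validity is scored with mean squared error against a target of 1 and the other two terms with mean absolute error, and the weights are illustrative defaults rather than values from this chapter. The names `generator_loss`, `toy_g_ab`, `toy_g_ba`, and `toy_d_b` are hypothetical.

```python
def mae(a, b):
    """Mean absolute error between two flat lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mse(a, b):
    """Mean squared error between two flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def generator_loss(real_a, real_b, g_ab, g_ba, d_b,
                   w_validity=1.0, w_reconstruction=10.0, w_identity=5.0):
    """Loss for g_ab; the loss for g_ba mirrors it with A and B swapped."""
    fake_b = g_ab(real_a)

    # Validity: d_b should score the translated image as real (target 1.0).
    scores = d_b(fake_b)
    validity = mse(scores, [1.0] * len(scores))

    # Reconstruction: A -> B -> A should return to the original image.
    reconstruction = mae(g_ba(fake_b), real_a)

    # Identity: an image already in B should pass through g_ab unchanged.
    identity = mae(g_ab(real_b), real_b)

    total = (w_validity * validity
             + w_reconstruction * reconstruction
             + w_identity * identity)
    return {"validity": validity, "reconstruction": reconstruction,
            "identity": identity, "total": total}


# Toy stand-ins: generators shift brightness, the discriminator is undecided.
def toy_g_ab(img):
    return [x + 1.0 for x in img]


def toy_g_ba(img):
    return [x - 1.0 for x in img]


def toy_d_b(img):
    return [0.5, 0.5, 0.5, 0.5]


losses = generator_loss([0.0, 0.25], [1.0, 1.25], toy_g_ab, toy_g_ba, toy_d_b)
```

In this toy setup the two generators are exact inverses, so the reconstruction term is zero, while the identity term is penalized because `toy_g_ab` changes images that already belong to dataset $B$.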

### The Discriminators

The discriminators in a CycleGAN do not output a single number; instead, they output an 8 by 8 single-channel tensor. This idea is borrowed from _PatchGAN_, a discriminator architecture that divides the image into patches and predicts whether each patch is real or fake. Judging patches rather than the whole image encourages the discriminator to decide whether an image is real or fake based on its style rather than its content.
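One consequence of the patch-based output is that the training targets are grids rather than single labels. The sketch below assumes a mean squared error loss averaged over all patches, with a target of 1 for real patches and 0 for fake ones; the loss choice and the helper names `patch_targets`, `patch_mse`, and `discriminator_loss` are assumptions made for illustration.

```python
def patch_targets(is_real, grid=8):
    """Build a grid x grid target: all ones for a real image, all zeros for a fake."""
    value = 1.0 if is_real else 0.0
    return [[value] * grid for _ in range(grid)]


def patch_mse(pred, target):
    """Mean squared error averaged over every patch in the grid."""
    diffs = [
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(diffs) / len(diffs)


def discriminator_loss(pred_on_real, pred_on_fake):
    """Average the per-patch losses on one real and one generated image."""
    grid = len(pred_on_real)
    real_loss = patch_mse(pred_on_real, patch_targets(True, grid))
    fake_loss = patch_mse(pred_on_fake, patch_targets(False, grid))
    return 0.5 * (real_loss + fake_loss)


# A discriminator that is certain about the real image but undecided on the fake.
pred_on_real = [[1.0] * 8 for _ in range(8)]
pred_on_fake = [[0.5] * 8 for _ in range(8)]
loss = discriminator_loss(pred_on_real, pred_on_fake)
```

Each of the 64 entries contributes equally to the loss, so a generated image that looks convincing in most patches can still be penalized for a few patches that look wrong.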

## Setup

```python
# TODO install tensorflow-gpu, other dependencies
```

## Changing Pictures of Apples to Oranges (and Vice-Versa) with a CycleGAN