<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#GANs-with-tf.estimator-and-tf.data" data-toc-modified-id="GANs-with-tf.estimator-and-tf.data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>GANs with tf.estimator and tf.data</a></span><ul class="toc-item"><li><span><a href="#tf.estimator" data-toc-modified-id="tf.estimator-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>tf.estimator</a></span></li><li><span><a href="#tf.data" data-toc-modified-id="tf.data-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>tf.data</a></span><ul class="toc-item"><li><span><a href="#Feeding-the-estimator:-create-the-input-pipeline" data-toc-modified-id="Feeding-the-estimator:-create-the-input-pipeline-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Feeding the estimator: create the input pipeline</a></span></li></ul></li><li><span><a href="#DCGAN:-discriminator-using-tf.estimator" data-toc-modified-id="DCGAN:-discriminator-using-tf.estimator-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>DCGAN: discriminator using tf.estimator</a></span></li></ul></li></ul></div>

# GANs with tf.estimator and tf.data

In this notebook, we're going to try to define a GAN and its input data pipeline using the new APIs `tf.estimator` and `tf.data`. Our aim is to build a **cat/dog generator** (unconditional, but it can easily be extended to become a conditional GAN that generates cats or dogs depending on a condition: this task is left to the reader).

## tf.estimator

Estimators have **a lot** of advantages, and the official guide<sup>[1](#1)</sup> describes them perfectly:

- You can run Estimator-based models on a local host or on a distributed multi-server environment without changing your model. Furthermore, you can run Estimator-based models on CPUs, GPUs, or TPUs without recoding your model.
- Estimators simplify sharing implementations between model developers.
- You can develop a state of the art model with high-level intuitive code. In short, it is generally much easier to create models with Estimators than with the low-level TensorFlow APIs.
- Estimators are themselves built on tf.layers, which simplifies customization.
- Estimators build the graph for you.
- Estimators provide a safe distributed training loop that controls how and when to:
    - build the graph
    - initialize variables
    - start queues
    - handle exceptions
    - create checkpoint files and recover from failures
    - save summaries for TensorBoard

When writing an application with Estimators, **you must separate the data input pipeline from the model**. This separation simplifies experiments with different data sets.

In order to correctly separate the data input pipeline from the model, let's introduce `tf.data`.

## tf.data

The `tf.data` API has been designed to write complex input pipelines in a very simple manner. It uses the **named parameter idiom** (also called **method chaining**), and its methods are inspired by functional programming languages, which apply transformations to lists.

The most important class is `tf.data.Dataset`, which represents a sequence of elements: we can apply transformations to this sequence of elements in order to create our dataset.

Once the dataset has been correctly defined, it can be used through a `tf.data.Iterator`, which provides a pythonic way (a Python iterator) to extract elements from a dataset. For a more comprehensive description, refer to the official guide<sup>[2](#2)</sup>.
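
To make the method chaining idea concrete, here is a minimal sketch in plain Python. `MiniDataset` is a hypothetical toy class written only for illustration: it is not part of `tf.data` and does not reproduce its behaviour, it only shows how each transformation returns a new object that can be chained and then iterated.

```python
class MiniDataset:
    """Hypothetical stand-in for a dataset: NOT the tf.data API, only the chaining idea."""

    def __init__(self, elements):
        self._elements = list(elements)

    def map(self, fn):
        # Apply fn to every element and return a new dataset
        return MiniDataset(fn(e) for e in self._elements)

    def batch(self, batch_size):
        # Group consecutive elements into lists of at most batch_size items
        e = self._elements
        return MiniDataset(e[i:i + batch_size] for i in range(0, len(e), batch_size))

    def __iter__(self):
        # Expose the elements through a standard Python iterator
        return iter(self._elements)


# Each method returns a new dataset, so the transformations can be chained
pipeline = MiniDataset(range(6)).map(lambda x: x * 2).batch(4)
batches = list(pipeline)
```

Here the last batch is smaller than the others, which is why a real pipeline may want to drop the remainder when the model expects a fixed batch size.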


### Feeding the estimator: create the input pipeline

Every method of `tf.estimator` that requires input data (`train`, `evaluate`, `train_and_evaluate`, `predict`) expects an `input_fn` as its first parameter. This function should construct and return one of the following:

- A `tf.data.Dataset` object that once executed returns a tuple
- A tuple

The tuple can be the pair `(features, labels)`, where features and labels can be batches. The cardinality of the tuple, however, is not constrained to be 2: it depends on how we're going to use the return value of the `input_fn`.
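
Note that the estimator receives the function itself and calls it later, so an `input_fn` has to be a callable. A common way to parametrize it is a factory that returns a closure. The sketch below uses plain Python lists instead of tensors; `make_input_fn` and its `batches` argument are hypothetical names used only for illustration.

```python
def make_input_fn(batches):
    """Hypothetical factory: returns a zero-argument input_fn (plain Python, no TensorFlow)."""

    def _input_fn():
        # A real input_fn would build the tf.data pipeline here, when it is called
        features = batches[0]
        labels = None  # an unconditional GAN does not need labels
        return features, labels

    # Return the function itself, not the result of calling it
    return _input_fn


input_fn = make_input_fn([[0.1, 0.2], [0.3, 0.4]])
features, labels = input_fn()  # the caller decides when to invoke it
```

Returning `_input_fn()` instead of `_input_fn` would hand the caller the data rather than a function, which is an easy mistake to make with this pattern.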

In the next few lines we're going to download and analyze the dataset (the well known cat vs dog dataset), and after that we will implement a helper function that will create various `input_fn`s for us (depending on whether we're training or evaluating).

In [1]:
! cd .. && python prepare_celeba_dataset.py

CelebA Dataset is already present, bravo.


Now that we have the dataset we can inspect its structure, in order to correctly build the input pipeline.

The folder contains 202599 celebrity faces, cropped, aligned and already resized to the same size (178x218).

Let's suppose that our model will generate `64x64x3` images; hence our `input_fn` needs to resize the images it reads to that size.

In [2]:
import tensorflow as tf

In [3]:
def _get_train_images_input_fn(file_pattern, image_size=(64, 64, 3), shuffle=False,
                 batch_size=32, num_epochs=None, buffer_size=4096):
    """Exploits the `file_pattern` to create an input_fn that reads all the files
    matching the specified pattern, creating a dataset object.
    
    Args:
        file_pattern: python string, the pattern of the files to read to generate the dataset
        image_size: the new size of the read images
        shuffle: True if the order of the elements in the generated dataset should be randomized
        batch_size: the size of the batches
        num_epochs: the number of epochs to repeat the dataset before throwing an exception; None is unlimited
        buffer_size: the size of the shuffle buffer and the number of batches to prefetch
    Returns:
        input_fn: the generated input_fn that returns the next batch of images
    """
    
    def _img_string_to_tensor(image_string):
        """Decode a JPEG image as read by `tf.read_file`, scale it to [-1, 1] and resize
        the image as specified in the parent method.
        Args:
            image_string: the raw image tensor
        Returns:
            image_resized: image in [-1, 1] correctly resized
        """

        image_decoded = tf.image.decode_jpeg(image_string, channels=image_size[-1])
        # The conversion to float automatically scales the values in [0., 1.]
        image_decoded_as_float = tf.image.convert_image_dtype(image_decoded, dtype=tf.float32)
        # Rescale from [0., 1.] to [-1., 1.]
        image_decoded = (image_decoded_as_float - 0.5) * 2
        image_resized = tf.image.resize_images(image_decoded, size=image_size[:2])

        return image_resized

    def _path_to_img(path):
        """Given the path of an image, returns the correctly resized image.
        Args:
            path: the path of the image to read
        Returns:
            image_resized: the image associated with the path
        """

        image_string = tf.read_file(path) # read image and process it
        image_resized = _img_string_to_tensor(image_string)

        return image_resized
    
    def _input_fn():
        """The input function that builds the `tf.data.Dataset` object and instantiates
        the iterator, ready to be used.
        Returns:
            the next batch of images from the iterator associated with the built Dataset object.
        """
        
        # Use the static method `list_files` that builds a dataset of all
        # files matching this pattern.
        dataset_path = tf.data.Dataset.list_files(file_pattern)

        if shuffle:
            dataset_path = dataset_path.apply(tf.contrib.data.shuffle_and_repeat(buffer_size, num_epochs))
        else:
            dataset_path = dataset_path.repeat(num_epochs)

        # The map function maps each path to the decoded and resized image
        dataset = dataset_path.map(_path_to_img)
        dataset = dataset.apply(tf.contrib.data.batch_and_drop_remainder(batch_size))
        dataset = dataset.prefetch(buffer_size)
        
        iterator = dataset.make_one_shot_iterator()
        return iterator.get_next()

    # Return the function itself: the estimator calls it when it builds the graph
    return _input_fn

So far so good. We defined a function that generates the correct `input_fn`s. For instance, a possible call for generating the training set is:
```python
import os

train_files = os.path.join(os.getcwd(), 'dogscats', 'train', '**/*.jpg')
train_input_fn = _get_train_images_input_fn(train_files, shuffle=True, num_epochs=10)
```
Since we're not interested in creating a conditional GAN, the generated `input_fn` yields only batches of images, without labels.
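
The pixel scaling applied in `_img_string_to_tensor` can be checked by hand. The following sketch reproduces the arithmetic on single pixel values in plain Python; `scale_pixel` is a hypothetical helper that assumes 8-bit input images.

```python
def scale_pixel(value):
    """Map a uint8 pixel value in [0, 255] to a float in [-1, 1]."""
    as_float = value / 255.0      # same range as the float conversion: [0, 1]
    return (as_float - 0.5) * 2   # shift and stretch to [-1, 1]


# The darkest and the brightest pixel land on the two ends of the range
scaled = [scale_pixel(v) for v in (0, 255)]
```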

Now we can start defining our `model_fn`, required to correctly work with the `tf.estimator` API.

Since we're working on images, we'll use an architecture created for this purpose: DCGAN<sup>[3](#3)</sup>.

## DCGAN: discriminator using tf.estimator

The discriminator of DCGAN is a common CNN architecture: a stack of convolutional layers that downsample the input image (without using pooling layers), followed by a fully connected output layer.
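
To see how much the convolutional stack shrinks the input, the spatial sizes can be computed by hand. The sketch below assumes three convolutions with stride 2 and "same" padding applied to a 64x64 input, with 512 filters in the last layer; the stride and the padding are assumptions made for illustration, and `same_padding_output_size` is a hypothetical helper.

```python
import math


def same_padding_output_size(input_size, stride):
    """Spatial output size of a convolution with "same" padding."""
    return math.ceil(input_size / stride)


size = 64
sizes = []
for _ in range(3):  # three convolutional layers
    size = same_padding_output_size(size, stride=2)
    sizes.append(size)

# Number of features seen by the fully connected layer after flattening
flattened = sizes[-1] * sizes[-1] * 512  # 512 filters in the last layer
```

Under these assumptions each layer halves the width and the height, so no pooling layer is needed to downsample.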

The output layer in the discriminator definition has the **linear** activation function (for the same reason explained in the previous notebook).
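
One practical consequence of a linear output is that the discriminator returns logits, which is what a loss such as `tf.nn.sigmoid_cross_entropy_with_logits` expects: the sigmoid is applied inside the loss, in a numerically stable form. The sketch below is a plain Python version of that formula for a single value, compared with the textbook form; both function names are chosen only for illustration.

```python
import math


def sigmoid_cross_entropy_with_logits(logit, label):
    """Numerically stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def naive_cross_entropy(logit, label):
    """Textbook form, applying the sigmoid explicitly."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


# For moderate logits the two forms agree
stable = sigmoid_cross_entropy_with_logits(2.0, 1.0)
naive = naive_cross_entropy(2.0, 1.0)

# The stable form also handles a very large logit
large = sigmoid_cross_entropy_with_logits(1000.0, 0.0)
```

With a logit this large the textbook form would have to evaluate `log(0)`, which is one reason to keep the output layer linear and let the loss apply the sigmoid.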

When working with `tf.estimator`, we have to follow the API specification. The `tf.estimator.Estimator.__init__` method requires as first parameter a `model_fn` function.

The model function `model_fn` implements the ML algorithm and its behaviour in different conditions (train/eval/predict) and **must** have the following signature:

```python
def model_fn(
   features, # This is batch_features from input_fn
   labels,   # This is batch_labels from input_fn
   mode,     # An instance of tf.estimator.ModeKeys
   params):  # Additional configuration
```

and **must** return an instance of `tf.estimator.EstimatorSpec` that defines how the caller (the estimator) interacts with the model.

In [4]:
def discriminator_fn(features, labels, mode, params):
    """Build the Discriminator network.
    Args:
        features: a batch of images to classify, expected input shape (None, 64, 64, 3)
        labels: a batch of labels
        mode: the tf.estimator.ModeKeys
        params: a dict of optional parameters
    Returns:
        The tf.estimator.EstimatorSpec that describes the desired behaviour
    """
    
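    # Let's suppose that features is a batch of both generated and real images

    # Batch normalization must use the batch statistics only while training
    training = mode == tf.estimator.ModeKeys.TRAIN

    # In every mode, define the model.
    # kernel_size=5, strides=2 and padding="same" are assumed values: the strided
    # convolutions downsample the input without using pooling layers.
    net = tf.layers.conv2d(features, filters=128, kernel_size=5, strides=2,
                           padding="same", activation=tf.nn.leaky_relu)
    net = tf.layers.batch_normalization(net, training=training)
    net = tf.layers.conv2d(net, filters=256, kernel_size=5, strides=2,
                           padding="same", activation=tf.nn.leaky_relu)
    net = tf.layers.batch_normalization(net, training=training)
    net = tf.layers.conv2d(net, filters=512, kernel_size=5, strides=2,
                           padding="same", activation=tf.nn.leaky_relu)
    net = tf.layers.batch_normalization(net, training=training)
    # Flatten the feature maps before the linear output layer
    net = tf.layers.flatten(net)
    net = tf.layers.dense(net, 1)
    D = tf.identity(net)

    # In PREDICT mode there are no labels: return only the predictions
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=D)

    # Let's suppose that labels is a batch of labels where 1 is the real image
    # and 0 is the label associated to the generated image
    labels = tf.reshape(tf.cast(labels, tf.float32), (-1, 1))
    loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=D, labels=labels))

    if mode == tf.estimator.ModeKeys.TRAIN:
        # Run the batch normalization updates together with the optimization step
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        with tf.control_dependencies(update_ops):
            train_op = tf.train.AdamOptimizer(1e-5).minimize(
                loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(
            mode, predictions=D, loss=loss, train_op=train_op)
    # In EVAL mode, just return the estimator spec with the requested mode
    # and with the loss function (but NOT the optimization step)
    return tf.estimator.EstimatorSpec(mode, loss=loss)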

The discriminator has been correctly defined; **unfortunately**, we had to suppose that the `features, labels` parameters contain what we expect: both generated and real images.

*Is this really possible when using the `tf.estimator` + `tf.data` API?*

Technically yes, but with a lot of struggle (that someone else at Google has already gone through!):

- `tf.estimator.Estimator.train` has been defined to train only one model at a time: how can we train both the generator and the discriminator using this method?
- The dataset we defined, which could be used in any classification problem, should be changed in order to add the noise vector required by the generator network. Is `tf.data.Dataset` no longer an advantage, and do we have to change our code?
- How can we use two different `model_fn`s (one for $G$ and one for $D$), and how can we connect the two models using a single estimator?

Maybe the simple estimator is not enough...
Luckily, **an estimator designed to work with GANs has been introduced in the TFGAN library: GANEstimator**.

---
<a id="1">[1]</a>: https://www.tensorflow.org/guide/estimators

<a id="2">[2]</a>: For a more complete description of the `tf.data` API: https://www.tensorflow.org/guide/datasets

<a id="3">[3]</a>: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks https://arxiv.org/pdf/1511.06434