<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Structure" data-toc-modified-id="Structure-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Structure</a></span></li><li><span><a href="#Creating-the-Estimator" data-toc-modified-id="Creating-the-Estimator-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Creating the Estimator</a></span></li><li><span><a href="#Dataset" data-toc-modified-id="Dataset-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Dataset</a></span><ul class="toc-item"><li><span><a href="#Explanation" data-toc-modified-id="Explanation-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Explanation</a></span><ul class="toc-item"><li><span><a href="#Extract" data-toc-modified-id="Extract-3.1.1"><span class="toc-item-num">3.1.1&nbsp;&nbsp;</span>Extract</a></span></li><li><span><a href="#Transform" data-toc-modified-id="Transform-3.1.2"><span class="toc-item-num">3.1.2&nbsp;&nbsp;</span>Transform</a></span></li><li><span><a href="#Load" data-toc-modified-id="Load-3.1.3"><span class="toc-item-num">3.1.3&nbsp;&nbsp;</span>Load</a></span></li></ul></li></ul></li><li><span><a href="#Input_fn" data-toc-modified-id="Input_fn-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Input_fn</a></span></li><li><span><a href="#Model_fn" data-toc-modified-id="Model_fn-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Model_fn</a></span><ul class="toc-item"><li><span><a href="#Predict-(“tf.estimator.ModeKeys.PREDICT”)" data-toc-modified-id="Predict-(“tf.estimator.ModeKeys.PREDICT”)-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Predict (“tf.estimator.ModeKeys.PREDICT”)</a></span></li><li><span><a href="#Train-(&quot;tf.estimator.ModeKeys.TRAIN&quot;)" data-toc-modified-id="Train-(&quot;tf.estimator.ModeKeys.TRAIN&quot;)-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Train ("tf.estimator.ModeKeys.TRAIN")</a></span></li><li><span><a href="#Evaluate-(&quot;tf.estimator.ModeKeys.EVAL&quot;)" 
data-toc-modified-id="Evaluate-(&quot;tf.estimator.ModeKeys.EVAL&quot;)-5.3"><span class="toc-item-num">5.3&nbsp;&nbsp;</span>Evaluate ("tf.estimator.ModeKeys.EVAL")</a></span></li></ul></li><li><span><a href="#Scaffolds-and-SessionRunHooks" data-toc-modified-id="Scaffolds-and-SessionRunHooks-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Scaffolds and SessionRunHooks</a></span></li></ul></div>

<h1>Structure</h1>

We will follow this basic structure:

<ol>
    <li>Create Estimator
        <ol>
            <li>Create model_fn</li>
        </ol>
    </li>
    <li>Data Loading</li>
    <li>Defining Train, Evaluate and Prediction phases</li>
    <li>Session and hooks</li>
    <li>Prediction</li>
</ol>

# Creating the Estimator

Creating an estimator is simple:
<pre><code>classifier = tf.estimator.Estimator(model_dir=model_dir,
                                    model_fn=model_fn,
                                    params=params)</code></pre>
In this call, “model_dir” is the path to the folder where the Estimator stores and loads checkpoints and event files. The “model_fn” parameter is a function that consumes the features, labels, mode and params, and returns an “EstimatorSpec” for the requested mode.

# Dataset

The TensorFlow Dataset class is designed as an E.T.L. process, which stands for Extract, Transform and Load.
<img src = 'artifacts/etl.png'/>

<pre><code>
# excerpt from a method body: glob_pattern, threads and training are defined elsewhere
with tf.name_scope("tf_record_reader"):
    # generate the file list (shuffled only during training)
    files = tf.data.Dataset.list_files(glob_pattern, shuffle=training)

    # read up to `threads` TFRecord files in parallel and interleave their examples
    dataset = files.apply(tf.contrib.data.parallel_interleave(
        lambda filename: tf.data.TFRecordDataset(filename), cycle_length=threads))

    # shuffle and repeat examples for better randomness and to allow training beyond one epoch
    dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(32 * self.batch_size))

    # apply the parse function to each example individually, using `threads` parallel calls
    dataset = dataset.map(
        map_func=lambda example: _parse_function(
            example, self.image_size, self.num_classes, training=training),
        num_parallel_calls=threads)

    # batch the examples
    dataset = dataset.batch(batch_size=self.batch_size)

    # prefetch; after batch() the buffer_size counts batches, not single examples
    dataset = dataset.prefetch(buffer_size=self.batch_size)

    return dataset.make_one_shot_iterator()
</code></pre>

## Explanation

### Extract

The first step in a Dataset input pipeline is to load the data from the TFRecord files into memory. This starts with building a list of the available TFRecord files from a <b>glob pattern</b>, e.g. “./Datasets/train-*.tfrecords”, using the <b>list_files</b> function of the Dataset class.<br>

The <b>parallel_interleave</b> function is then applied to the list of files, so that several files are read in parallel and their examples are interleaved.<br>

Finally, a fused shuffle-and-repeat function fills a buffer with a fixed number of examples from the TFRecord files (here 32 batches' worth) and draws from it at random. The repeat ensures that examples are always available, by starting over once the last example of every TFRecord file has been read.

<pre><code>
with tf.name_scope("tf_record_reader"):
    # generate the file list (shuffled only during training)
    files = tf.data.Dataset.list_files(glob_pattern, shuffle=training)

    # read up to `threads` TFRecord files in parallel and interleave their examples
    dataset = files.apply(tf.contrib.data.parallel_interleave(
        lambda filename: tf.data.TFRecordDataset(filename), cycle_length=threads))

    # shuffle and repeat examples for better randomness and to allow training beyond one epoch
    dataset = dataset.apply(tf.contrib.data.shuffle_and_repeat(32 * self.batch_size))
</code></pre>
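
To make the order of operations in the extract step more concrete, here is a minimal plain-Python sketch of what interleaving and buffered shuffling do conceptually. It does not use TensorFlow and is not how tf.data is implemented: the helper names `interleave` and `shuffle_and_repeat`, the toy "files" and the buffer size are made up for illustration, and the interleave is sequential rather than parallel.

```python
import itertools
import random


def interleave(sources, cycle_length):
    """Round-robin over at most `cycle_length` open sources at a time."""
    pending = iter(sources)
    # open the first `cycle_length` sources
    active = [iter(s) for s in itertools.islice(pending, cycle_length)]
    while active:
        for it in list(active):
            try:
                yield next(it)
            except StopIteration:
                # swap an exhausted source for the next unopened one, if any
                active.remove(it)
                nxt = next(pending, None)
                if nxt is not None:
                    active.append(iter(nxt))


def shuffle_and_repeat(make_examples, buffer_size, rng):
    """Endlessly yield examples, shuffled through a bounded buffer."""
    while True:  # repeat: start a new epoch once the data is exhausted
        buffer = []
        for example in make_examples():
            buffer.append(example)
            if len(buffer) > buffer_size:
                # buffer is full: emit a randomly chosen element to make room
                yield buffer.pop(rng.randrange(len(buffer)))
        # end of the epoch: emit whatever is left, in random order
        rng.shuffle(buffer)
        yield from buffer


# stand-ins for three TFRecord files
files = [[0, 1, 2], [10, 11, 12], [20, 21]]

interleaved = list(interleave(files, cycle_length=2))
# [0, 10, 1, 11, 2, 12, 20, 21]

stream = shuffle_and_repeat(lambda: interleave(files, 2),
                            buffer_size=4, rng=random.Random(0))
two_epochs = list(itertools.islice(stream, 16))
```

A small buffer only mixes examples that are close together in the stream, which is one reason to interleave several files before shuffling.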

### Transform

Now that the data is available in memory, the next step is to transform it, preferably into something that needs no further processing before it is fed to the neural network input.<br>
This requires a call to the dataset’s map function, as shown below, where <b>“map_func”</b> is the function applied to every individual example on the CPU and <b>“num_parallel_calls”</b> is the number of parallel invocations of “map_func” to use.
    
<pre><code>
import multiprocessing
threads = multiprocessing.cpu_count()

# apply the parse function to each example individually, using `threads` parallel calls
dataset = dataset.map(
    map_func=lambda example: _parse_function(
        example, self.image_size, self.num_classes, training=training),
    num_parallel_calls=threads)
</code></pre>
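
The effect of a parallel map can be imitated without TensorFlow. The sketch below is only an analogy using a thread pool from the standard library; `parse_example` is a hypothetical stand-in for `_parse_function`, and the record format is invented for the example.

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor


def parse_example(raw):
    """Hypothetical stand-in for _parse_function: decode one raw record."""
    label, value = raw.split(":")
    return {"label": int(label), "value": float(value)}


raw_records = ["0:1.5", "1:2.5", "0:3.0", "1:4.25"]
threads = multiprocessing.cpu_count()

# apply the parse function to every record, up to `threads` records at a time;
# Executor.map hands the results back in input order
with ThreadPoolExecutor(max_workers=threads) as pool:
    parsed = list(pool.map(parse_example, raw_records))
```

Each record is parsed independently of the others, which is what makes this step easy to parallelise.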

### Load

The final step of the E.T.L. process is handing the batched examples to the accelerator (GPU), ready for processing. In the Dataset class this is achieved by prefetching, which is done by calling the prefetch function of the dataset.

<pre><code>
dataset = dataset.prefetch(buffer_size=self.batch_size)
</code></pre>

Prefetching decouples the producer (the Dataset object on the CPU) from the consumer (the GPU), which allows them to run in parallel for increased throughput.
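
The producer/consumer idea behind prefetching can be sketched in plain Python with a bounded queue and a background thread. This is an illustration of the principle under simplified assumptions, not TensorFlow's implementation; the `prefetch` helper and the toy batches are made up for the example.

```python
import queue
import threading


def prefetch(generator, buffer_size):
    """Run `generator` in a background thread, keeping up to `buffer_size` items ready."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for item in generator:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()  # blocks while the buffer is empty
        if item is done:
            return
        yield item


# stand-in for batched examples: [0, 1], [2, 3], ..., [8, 9]
batches = ([i, i + 1] for i in range(0, 10, 2))

# the "consumer" works on one batch while the producer prepares the next ones
consumed = [sum(batch) for batch in prefetch(batches, buffer_size=2)]
```

The buffer size bounds how far the producer may run ahead, so a slow consumer does not cause unbounded memory use.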

# Input_fn

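Once the whole E.T.L. process is defined and implemented, the “input_fn” can be completed by creating the iterator and returning its next batch of examples.
<pre><code>
def input_fn():
    # ... build `dataset` with the E.T.L. steps above ...
    return dataset.make_one_shot_iterator().get_next()
</code></pre>
The Estimator calls this input function itself and uses the tensors it returns as the input for the model function. For that reason the pipeline has to be built inside “input_fn”, so that its operations end up in the graph the Estimator creates.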

# Model_fn

<img src='artifacts/estimator.jpeg'/>

The model function, which the Estimator invokes during training, evaluation and prediction, should accept the following arguments:

<pre><code>
def model_fn(features, labels, mode, params):
</code></pre>

The code path of every mode has to return an <b>“EstimatorSpec”</b> with the fields required for that mode.
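
The branching this implies can be sketched without TensorFlow. In the sketch below the mode constants and the `Spec` tuple are stand-ins for `tf.estimator.ModeKeys` and `tf.estimator.EstimatorSpec`, the "model" is a toy linear function, and the train op and update op are placeholder strings; only the control flow is meant to carry over.

```python
from collections import namedtuple

# Stand-ins so the sketch runs without TensorFlow; real code would use
# tf.estimator.ModeKeys and tf.estimator.EstimatorSpec instead.
PREDICT, TRAIN, EVAL = "predict", "train", "eval"
Spec = namedtuple(
    "Spec",
    ["mode", "predictions", "loss", "train_op", "eval_metric_ops"],
    defaults=(None,) * 4,
)


def model_fn(features, labels, mode, params):
    # shared forward pass (a toy linear model in place of a network)
    predictions = {"score": [params["weight"] * x for x in features]}
    if mode == PREDICT:
        # prediction needs nothing but the predictions
        return Spec(mode, predictions=predictions)

    # labels are only used outside prediction mode
    errors = [(p - y) ** 2 for p, y in zip(predictions["score"], labels)]
    loss = sum(errors) / len(errors)
    if mode == TRAIN:
        # training needs the loss and a train op
        train_op = "placeholder for optimizer.minimize(loss)"
        return Spec(mode, loss=loss, train_op=train_op)

    # evaluation returns the loss plus a dictionary of (value, update) pairs
    metrics = {"mse": (loss, "placeholder for the update op")}
    return Spec(mode, loss=loss, eval_metric_ops=metrics)


predict_spec = model_fn([1.0, 2.0], None, PREDICT, {"weight": 2.0})
train_spec = model_fn([1.0, 2.0], [2.0, 5.0], TRAIN, {"weight": 2.0})
eval_spec = model_fn([1.0, 2.0], [2.0, 5.0], EVAL, {"weight": 2.0})
```

Returning early for prediction keeps every use of `labels` out of that code path, where no labels are passed in.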

## Predict (“tf.estimator.ModeKeys.PREDICT”)

In this mode the model function has to return an “EstimatorSpec” that includes the predictions field:

<pre><code>
return tf.estimator.EstimatorSpec(mode, predictions=predictions)
</code></pre>

<b>In this mode the “EstimatorSpec” expects a dictionary of tensors; these are evaluated and their results are made available to Python as NumPy values.</b><br>

It is smart to define the prediction code path first: it is the simplest, and since most of its code is also used for training and evaluation, it can reveal problems early on.

## Train ("tf.estimator.ModeKeys.TRAIN")

Training requires a so-called <b>“train_op”</b>: an operation that, when executed, performs the backpropagation and updates the model. Simply put, it is what the minimize function of an optimizer such as the AdamOptimizer returns. The <b>“train_op”</b> and the <b>scalar loss</b> tensor are the minimum required arguments to create an “EstimatorSpec” for training.
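
As a rough analogy in plain Python, a train op can be thought of as a callable that applies one update step each time it is run. The sketch below uses plain gradient descent in place of Adam and a hand-derived gradient in place of backpropagation; the `minimize` helper, the toy data and the learning rate are assumptions made for the example.

```python
def minimize(loss_grad, variables, learning_rate):
    """Return a callable that applies one gradient-descent update per call."""
    def train_op():
        for name, grad in loss_grad(variables).items():
            variables[name] -= learning_rate * grad
    return train_op


# toy model: prediction = w * x, loss = mean squared error
xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]


def loss(variables):
    w = variables["w"]
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loss_grad(variables):
    # d(loss)/dw, derived by hand for this toy model
    w = variables["w"]
    return {"w": sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)}


variables = {"w": 0.0}
train_op = minimize(loss_grad, variables, learning_rate=0.05)

loss_before = loss(variables)
for _ in range(100):
    train_op()  # each execution performs one update step
loss_after = loss(variables)
```

Creating the train op changes nothing by itself; the model is only updated when the op is executed, once per training step.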

## Evaluate ("tf.estimator.ModeKeys.EVAL")

The most important ingredient of an evaluation is the metrics dictionary. It should be structured as a dictionary of tuples, where the first element of each tuple is a tensor containing the actual metric value and the second element is the operation that updates the metric value. The update operation is necessary for a reliable metric calculation over the whole validation set: since it is often impossible to evaluate the whole validation set in one batch, multiple batches have to be used. To prevent noise in the metric value due to per-batch differences, the update operation keeps a running average (or gathers all results) over all batches. This setup ensures the metric value is calculated over the whole validation set and not over a single batch.
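
The value/update pairing can be illustrated with a small plain-Python class. This is a sketch of the idea only, not TensorFlow's metrics code; the `StreamingAccuracy` name and the toy batches are made up for the example.

```python
class StreamingAccuracy:
    """Running accuracy kept as two counters, updated once per batch."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, labels):
        # plays the role of the metric's update operation
        self.correct += sum(p == y for p, y in zip(predictions, labels))
        self.total += len(labels)

    def value(self):
        # plays the role of the metric's value tensor
        return self.correct / self.total if self.total else 0.0


# validation set split into batches of unequal size: (predictions, labels)
batches = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # 3 of 4 correct
    ([0, 1], [1, 0]),              # 0 of 2 correct
]

metric = StreamingAccuracy()
per_batch = []
for predictions, labels in batches:
    metric.update(predictions, labels)
    hits = sum(p == y for p, y in zip(predictions, labels))
    per_batch.append(hits / len(labels))

overall = metric.value()                       # 3 correct out of 6
naive_mean = sum(per_batch) / len(per_batch)   # average of per-batch accuracies
```

Here the running metric gives 0.5 while the naive mean of the per-batch accuracies gives 0.375, because the smaller batch is weighted as heavily as the larger one.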

# Scaffolds and SessionRunHooks

<img src='artifacts/scaffolds_and_session_run_hooks.png' />