# Tuning your TensorFlow model using Automatic Model Tuning
In this notebook, we tune our TensorFlow model with SageMaker. The model is written in an [*entry_point* file](https://docs.aws.amazon.com/sagemaker/latest/dg/tf-training-inference-code-template.html) and consists of two important definitions:
1. Model definition using the **model_fn**
2. Data feeding definition using the **train_input_fn** and **eval_input_fn**.

To optimise training time, we use SageMaker's [Pipe input mode](https://aws.amazon.com/blogs/machine-learning/using-pipe-input-mode-for-amazon-sagemaker-algorithms/). *Pipe input mode* streams the dataset directly to the training instances instead of downloading it first. This improves performance in several ways:
1. Training jobs can start sooner since they don't need to wait for the full dataset
2. Training instances require less storage space
3. Read throughput is typically higher than when reading from a local file, since Pipe mode streams the data from S3 through a highly optimised, multi-threaded reader.

In [1]:
import sagemaker

bucket = sagemaker.Session().default_bucket() 
prefix = 'radix/mnist_fashion_tutorial' 

role = sagemaker.get_execution_role() 

In [2]:
import boto3
from time import gmtime, strftime
from sagemaker.tensorflow import TensorFlow
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner

## Model definition in an entrypoint file

In [3]:
!cat 'cnn_fashion_mnist.py'

import os
import tensorflow as tf
from tensorflow.python.estimator.model_fn import ModeKeys as Modes
from sagemaker_tensorflow import PipeModeDataset
from tensorflow.contrib.data import map_and_batch

INPUT_TENSOR_NAME = 'inputs'
SIGNATURE_NAME = 'predictions'
PREFETCH_SIZE = 10
BATCH_SIZE = 256
NUM_PARALLEL_BATCHES = 10
MAX_EPOCHS = 20


def _conv_pool(inputs, kernel_shape, kernel_count, padding_type):
    # Convolutional Layer 
    conv = tf.layers.conv2d(
      inputs=inputs,
      filters=kernel_count,
      kernel_size=kernel_shape,
      padding=padding_type,
      activation=tf.nn.relu)

    # Pooling Layer 
    pool = tf.layers.max_pooling2d(inputs=conv, pool_size=[2, 2], strides=2)
    return pool

    

def model_fn(features, labels, mode, params):
    learning_rate = params.get("learning_rate", 0.0001)
    dropout_rate = params.get("dropout_rate", 0.8)
    nw_depth = params.get("nw_depth", 2)
    optimizer_type = params.get("optimizer_type", 

## Use the TensorFlow model through SageMaker's TensorFlow estimator
We use SageMaker's Estimator wrapper called TensorFlow to make our own TensorFlow model compatible with SageMaker's services (more information can be found [here](https://sagemaker.readthedocs.io/en/latest/sagemaker.tensorflow.html)). Note that we explicitly set *input_mode* to 'Pipe' to force the use of Pipe input mode.

In [4]:
estimator = TensorFlow(entry_point='cnn_fashion_mnist.py',
                       role=role,
                       input_mode='Pipe',
                       training_steps=20_000, 
                       evaluation_steps=100,
                       train_instance_count=1,
                       train_instance_type='ml.c5.2xlarge',
                       base_job_name='radix_mnist_fashion')

## Automatic Model Tuning
Automatic Model Tuning is a service that automatically optimises the provided model by tuning its hyperparameters. The service searches for the best hyperparameter configuration using Bayesian optimisation. The tuning process parallelises its training jobs across multiple instances, which speeds up the search significantly.

Using automatic model tuning involves three steps:
1. Define the objective to optimise. If you want to use an alternative objective such as accuracy, you need to log this metric during training, since the objective is fetched from the logs using a regular expression.
2. Define hyperparameter ranges. Avoid using IntegerParameter for continuous variables, since this limits hyperparameter exploration.
3. Create a HyperparameterTuner instance, which defines the number of (parallel) jobs used during the optimisation process.

More information on how to use automatic model tuning can be found in the [SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html).

The results of the hyperparameter search can be viewed in the Amazon SageMaker console. AWS also provides a Jupyter notebook for analysing the results of the hyperparameter search, which can be found [here](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/hyperparameter_tuning/analyze_results/HPO_Analyze_TuningJob_Results.ipynb).

In [5]:
# 1. Define which objective has to be optimised
objective_metric_name = 'loss'
objective_type = 'Minimize'
metric_definitions = [{'Name': 'loss',
                       'Regex': 'loss = ([0-9\\.]+)'}]
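
Because the objective is read from the training logs with a regular expression, it can help to check the pattern locally before starting a tuning job. The sketch below applies the same pattern as in the cell above to a few log lines. The log lines and the `extract_metric` helper are hypothetical and only serve as an illustration; the real log format depends on what the entry point file prints.

```python
import re

# Same pattern as the 'Regex' entry in metric_definitions.
LOSS_REGEX = 'loss = ([0-9\\.]+)'

# Hypothetical log lines, made up for illustration only.
sample_logs = [
    'INFO:tensorflow:loss = 0.4312, step = 1200',
    'INFO:tensorflow:global_step/sec: 15.2',
    'INFO:tensorflow:Saving dict for global step 1300: accuracy = 0.87, loss = 0.3987',
]


def extract_metric(pattern, lines):
    """Return the first capture group as a float for every matching line."""
    values = []
    for line in lines:
        match = re.search(pattern, line)
        if match:
            values.append(float(match.group(1)))
    return values


losses = extract_metric(LOSS_REGEX, sample_logs)
print(losses)
```

The same helper can be used to try out a pattern for an alternative objective, for example `'accuracy = ([0-9\\.]+)'`, provided the training code actually logs that metric.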

In [6]:
# 2. Define hyperparameter ranges
hyperparameter_ranges = {
                            'learning_rate': ContinuousParameter(0.0001, 0.001), 
                            'dropout_rate': ContinuousParameter(0.3, 1.0),
                            'nw_depth': IntegerParameter(1, 4),
                            'optimizer_type': CategoricalParameter(['sgd', 'adam']),
                        }
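
To get a feel for what a single candidate configuration drawn from these ranges looks like, the sketch below mirrors the ranges in plain Python and draws a few configurations at random. Note that this is uniform random sampling, not the Bayesian optimisation that SageMaker performs, and that `SEARCH_SPACE` and `sample_configuration` are hypothetical names that are not part of the SageMaker SDK.

```python
import random

# Plain-Python mirror of hyperparameter_ranges (hypothetical structure,
# not a SageMaker object).
SEARCH_SPACE = {
    'learning_rate': ('continuous', 0.0001, 0.001),
    'dropout_rate': ('continuous', 0.3, 1.0),
    'nw_depth': ('integer', 1, 4),
    'optimizer_type': ('categorical', ['sgd', 'adam']),
}


def sample_configuration(space, rng):
    """Draw one configuration uniformly at random from the search space."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == 'continuous':
            config[name] = rng.uniform(spec[1], spec[2])
        elif kind == 'integer':
            # This sketch treats both bounds as inclusive.
            config[name] = rng.randint(spec[1], spec[2])
        else:
            config[name] = rng.choice(spec[1])
    return config


rng = random.Random(0)  # fixed seed so the draws are reproducible
candidates = [sample_configuration(SEARCH_SPACE, rng) for _ in range(4)]
for candidate in candidates:
    print(candidate)
```

Each printed dictionary has the same keys that `model_fn` reads from `params`, which is how a tuning job passes a chosen configuration to the entry point file.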

In [7]:
# 3. Instantiate a HyperparameterTuner instance
tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=16,
                            max_parallel_jobs=4,
                            objective_type=objective_type)
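
The choice of `max_jobs` and `max_parallel_jobs` is a trade-off: more parallel jobs finish the search sooner, but a Bayesian search can only learn from jobs that have already completed. As a rough illustration, the sketch below computes how many sequential rounds a budget allows. It makes the simplifying assumption that jobs run in lockstep waves, which is not how the service schedules them, and `sequential_rounds` is a hypothetical helper.

```python
import math


def sequential_rounds(max_jobs, max_parallel_jobs):
    """Number of lockstep waves needed to run max_jobs training jobs."""
    return math.ceil(max_jobs / max_parallel_jobs)


# Settings used in this notebook: 16 jobs, 4 at a time.
print(sequential_rounds(16, 4))
# Fully parallel: a single wave, so no job can learn from another.
print(sequential_rounds(16, 16))
# Fully sequential: every job can use all earlier results.
print(sequential_rounds(16, 1))
```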

In [8]:
# Fit the HyperparameterTuner to start the hyperparameter optimisation process
train_data = 's3://sagemaker-eu-central-1-959924085179/radix/mnist_fashion_tutorial/data/mnist/train.tfrecords'
eval_data = 's3://sagemaker-eu-central-1-959924085179/radix/mnist_fashion_tutorial/data/mnist/validation.tfrecords'

tuner.fit({'train':train_data, 'eval':eval_data}, logs=False)

INFO:sagemaker:Creating hyperparameter tuning job with name: sagemaker-tensorflow-180824-0803


In [9]:
# Sanity check: confirm that the optimisation process has started
boto3.client('sagemaker').describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner.latest_tuning_job.job_name)['HyperParameterTuningJobStatus']

'InProgress'
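
The status check above can be repeated until the tuning job reaches a terminal state. The sketch below is one minimal way to do that. The `wait_for_tuning_job` helper and its parameters are hypothetical, and it takes the status lookup as a callable so that it can be tried out here with a fake status source instead of a real AWS call.

```python
import time

# Statuses after which the tuning job no longer changes.
TERMINAL_STATUSES = {'Completed', 'Failed', 'Stopped'}


def wait_for_tuning_job(get_status, poll_seconds=60, max_polls=None, sleep=time.sleep):
    """Poll get_status() until a terminal status or max_polls is reached.

    Returns the last status that was seen.
    """
    polls = 0
    while True:
        status = get_status()
        polls += 1
        if status in TERMINAL_STATUSES:
            return status
        if max_polls is not None and polls >= max_polls:
            return status
        sleep(poll_seconds)


# Demonstration with a fake status source instead of a real AWS call.
fake_statuses = iter(['InProgress', 'InProgress', 'Completed'])
final_status = wait_for_tuning_job(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

In this notebook, `get_status` could be a function that wraps the `describe_hyper_parameter_tuning_job` call from the cell above and returns its `'HyperParameterTuningJobStatus'` field.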