# Train and Host a Keras Sequential Model

This notebook shows how to train and host a Keras Sequential model on SageMaker. The model is a simple deep CNN taken from [the Keras examples](https://github.com/keras-team/keras/blob/master/examples/cifar10_cnn.py).

## The dataset
The [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) is one of the most popular machine learning datasets. It consists of 60,000 32x32 images belonging to 10 different classes (6,000 images per class). Here are the classes in the dataset, as well as 10 random images from each:

![cifar10](https://maet3608.github.io/nuts-ml/_images/cifar10.png)

In this tutorial, we will train a deep CNN to recognize these images.
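
As a rough sketch of the data layout described above, the snippet below uses plain Python to work out the per-class image count and to one-hot encode a label, the label format that a `categorical_crossentropy` loss typically expects. The helper `one_hot` and the `CLASS_NAMES` list are illustrative and are not taken from the notebook's `utils` module.

```python
# Illustrative constants for CIFAR-10 as described above.
NUM_IMAGES = 60000
NUM_CLASSES = 10
HEIGHT, WIDTH, DEPTH = 32, 32, 3  # 32x32 RGB images

# Standard CIFAR-10 class names, in label-index order.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

images_per_class = NUM_IMAGES // NUM_CLASSES
values_per_image = HEIGHT * WIDTH * DEPTH


def one_hot(label, num_classes=NUM_CLASSES):
    """Return a one-hot list for an integer class label (hypothetical helper)."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range: %d" % label)
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


print(images_per_class, values_per_image)
print(one_hot(CLASS_NAMES.index("frog")))
```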

## Set up the environment

In [25]:
import os
import sagemaker
import boto3
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

role = get_execution_role()

## Download the CIFAR-10 dataset
Downloading the test and training data will take around 5 minutes.

In [2]:
import utils

utils.cifar10_download()

## Upload the dataset to an S3 bucket

In [3]:
inputs = sagemaker_session.upload_data(path='/tmp/cifar10_data', key_prefix='data/DEMO-cifar10')

`sagemaker_session.upload_data` uploads the CIFAR-10 dataset from this machine to a bucket named **sagemaker-{region}-{*your aws account number*}**. If you don't have this bucket yet, `sagemaker_session` will create it for you.
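
To make the naming pattern concrete, here is a minimal sketch that builds the default bucket name and the S3 URI you would expect `inputs` to hold. It assumes that uploading a directory returns the prefix URI `s3://{bucket}/{key_prefix}`. The two helper functions and the account number are placeholders for illustration, not part of the SageMaker SDK.

```python
def default_bucket_name(region, account_id):
    """Default SageMaker bucket name, following the pattern described above."""
    return "sagemaker-{}-{}".format(region, account_id)


def uploaded_prefix_uri(bucket, key_prefix):
    """S3 URI of an uploaded directory (assumed return format of upload_data)."""
    return "s3://{}/{}".format(bucket, key_prefix)


# Placeholder region and account number, for illustration only.
bucket = default_bucket_name("us-west-2", "123456789012")
expected_inputs = uploaded_prefix_uri(bucket, "data/DEMO-cifar10")
print(expected_inputs)
```

Comparing this string with the actual value of `inputs` is a quick way to confirm where the data landed.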

## Complete source code
Here is the full source code for the model:

In [None]:
!cat cifar10_cnn.py

Let's take a closer look:

### The model function
This function is the main difference between TensorFlow and Keras models on SageMaker. A Keras training script defines a `keras_model_fn`:

In [4]:
def keras_model_fn(hyperparameters):
    """keras_model_fn receives hyperparameters from the training job and returns a compiled keras model.
    The model will be transformed into a TensorFlow Estimator before training and it will be saved in a 
    TensorFlow Serving SavedModel at the end of training.

    Args:
        hyperparameters: The hyperparameters passed to the SageMaker TrainingJob that runs your TensorFlow 
                         training script.
    Returns: A compiled Keras model
    """
    model = Sequential()

    model.add(Conv2D(32, (3, 3), padding='same', name='inputs', input_shape=(HEIGHT, WIDTH, DEPTH)))
    model.add(Activation('relu'))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(NUM_CLASSES))
    model.add(Activation('softmax'))
    
    opt = RMSPropOptimizer(learning_rate=hyperparameters['learning_rate'], decay=hyperparameters['decay'])

    model.compile(loss='categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])

    return model

This function builds and returns a compiled Keras model.
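
To see how the layers above shrink the image, the following sketch traces the spatial size through the network with plain arithmetic. It assumes `HEIGHT == WIDTH == 32` (the constants are defined in the script, not in the excerpt) and uses the Keras defaults of stride 1 for convolutions and non-overlapping 2x2 pooling. `conv_out` and `pool_out` are hypothetical helpers written for this illustration.

```python
def conv_out(size, kernel=3, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


size = 32                               # assumed HEIGHT == WIDTH == 32
size = conv_out(size, padding="same")   # Conv2D(32), same  -> 32
size = conv_out(size)                   # Conv2D(32), valid -> 30
size = pool_out(size)                   # MaxPooling2D      -> 15
size = conv_out(size, padding="same")   # Conv2D(64), same  -> 15
size = conv_out(size)                   # Conv2D(64), valid -> 13
size = pool_out(size)                   # MaxPooling2D      -> 6

flattened = size * size * 64            # Flatten: 6 * 6 * 64 values
dense_params = flattened * 512 + 512    # Dense(512): weights plus biases
print(size, flattened, dense_params)
```

Most of the model's parameters sit in the first `Dense` layer, which is why the two pooling stages matter for model size.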

**Note:** The first layer is given an explicit name (`name='inputs'` in the code above). This serves as a workaround for a known issue where TensorFlow does not recognize the default (or any custom) name for the first layer of Keras models. The workaround for a second known issue, where a Sequential model cannot be directly converted into an Estimator, is to wrap the model in a `tf.keras.Model` before returning it; that wrapping step does not appear in the excerpt above. See [this issue](https://github.com/tensorflow/tensorflow/issues/20552) for more information.
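
The layer name matters because it determines the key used for serving requests. As a small sketch, assuming the Keras convention that a Sequential model's implicit input is named after its first layer with an `_input` suffix, the key can be derived as follows. The helper `keras_input_tensor_name` is hypothetical, and the actual value of `INPUT_TENSOR_NAME` in `cifar10_cnn.py` is not shown in this notebook.

```python
FIRST_LAYER_NAME = "inputs"  # name given to the first Conv2D layer above


def keras_input_tensor_name(first_layer_name):
    """Name of a Sequential model's implicit input (assumed Keras convention)."""
    return first_layer_name + "_input"


INPUT_TENSOR_NAME = keras_input_tensor_name(FIRST_LAYER_NAME)
print(INPUT_TENSOR_NAME)
```

This agrees with the `inputs_input` key that the prediction request in this notebook passes to the endpoint.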

### Input functions
These functions are similar to those required by any other model using the TensorFlow Estimator API.

In [5]:
def serving_input_fn(params):
    # Notice that the input placeholder has the same input shape as the Keras model input
    tensor = tf.placeholder(tf.float32, shape=[None, HEIGHT, WIDTH, DEPTH])
    
    # The inputs key INPUT_TENSOR_NAME matches the Keras InputLayer name
    inputs = {INPUT_TENSOR_NAME: tensor}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)


def train_input_fn(training_dir, params):
    return _input(tf.estimator.ModeKeys.TRAIN,
                  batch_size=BATCH_SIZE, data_dir=training_dir)


def eval_input_fn(training_dir, params):
    return _input(tf.estimator.ModeKeys.EVAL,
                  batch_size=BATCH_SIZE, data_dir=training_dir)

The `train_input_fn` and `eval_input_fn` functions call the `_input` function, which returns a processed set of images and labels that is shuffled when training.
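
The body of `_input` is not shown here, so the following is only a plain-Python sketch of the behaviour described above: shuffle in training mode, keep the original order for evaluation, and emit fixed-size batches. The function `batches` and the fake records are hypothetical and stand in for the script's TensorFlow-based pipeline.

```python
import random


def batches(examples, batch_size, training, seed=0):
    """Yield lists of examples; shuffle first only when training."""
    examples = list(examples)
    if training:
        random.Random(seed).shuffle(examples)
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


# Ten fake (image_id, label) pairs stand in for real CIFAR-10 records.
fake_records = [(i, i % 10) for i in range(10)]
eval_batches = list(batches(fake_records, batch_size=4, training=False))
train_batches = list(batches(fake_records, batch_size=4, training=True))
print([len(b) for b in eval_batches])
```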

## Create a training job using the SageMaker TensorFlow Estimator

In [21]:
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point='cifar10_cnn.py',
                       role=role,
                       framework_version='1.12.0',
                       hyperparameters={'learning_rate': 1e-4, 'decay':1e-6},
                       training_steps=300, evaluation_steps=100,
                       train_instance_count=1, train_instance_type='ml.c4.xlarge')

estimator.fit(inputs)

W0907 23:28:30.799640 139920829364032 estimator.py:290] tensorflow py2 container will be deprecated soon.


2019-09-07 23:28:31 Starting - Starting the training job...
2019-09-07 23:28:33 Starting - Launching requested ML instances......
2019-09-07 23:29:34 Starting - Preparing the instances for training...
2019-09-07 23:30:23 Downloading - Downloading input data...
2019-09-07 23:30:51 Training - Training image download completed. Training in progress.
2019-09-07 23:30:51,758 INFO - root - running container entrypoint
2019-09-07 23:30:51,759 INFO - root - starting train task
2019-09-07 23:30:51,773 INFO - container_support.training - Training starting
Downloading s3://sagemaker-us-west-2-415365883687/sagemaker-tensorflow-2019-09-07-23-28-30-800/source/sourcedir.tar.gz to /tmp/script.tar.gz
2019-09-07 23:30:54,858 INFO - tf_container - ----------------------TF_CONFIG--------------------------
2019-09-07 23:30:54,858 INFO - tf_container - {"environment": "cloud", "cluster": {"master": ["algo-1:2222"]}, "task": {"index": 0, "type": "master"}}

**Note**: Keras models have a known issue and cannot be used for distributed (multi-instance) training. Keep `train_instance_count=1` until TensorFlow/Keras supports this feature. See [this issue](https://github.com/tensorflow/tensorflow/issues/14504) for more information.


## Deploy the trained model

The `deploy()` method creates an endpoint that serves prediction requests in real time.

In [7]:
predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge', py_version='py3')

W0907 22:49:07.801043 139920829364032 model.py:103] The Python 2 tensorflow images will be soon deprecated and may not be supported for newer upcoming versions of the tensorflow images.
Please set the argument "py_version='py3'" to use the Python 3 tensorflow image.


--------------------------------------------------------------------------------------!

## Make some predictions
Prediction is not the focus of this notebook, so to verify the endpoint's functionality, we'll simply generate random data in the correct shape and make a prediction.

In [10]:
# Creating fake prediction data
import numpy as np
data = np.random.randn(1, 32, 32, 3)

# The inputs key 'inputs_input' matches the Keras InputLayer name
predictor.predict({'inputs_input': data}) 

{'outputs': {'activation_5': {'dtype': 1,
   'tensor_shape': {'dim': [{'size': 1}, {'size': 10}]},
   'float_val': [0.003199951956048608,
    0.21724414825439453,
    0.0012326259166002274,
    0.003975691739469767,
    0.004360983148217201,
    0.0010368996299803257,
    0.40071144700050354,
    0.037878673523664474,
    0.0004917706828564405,
    0.32986781001091003]}},
 'model_spec': {'name': 'generic_model',
  'version': {'value': 1567896439},
  'signature_name': 'serving_default'}}
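
The `float_val` list holds one softmax probability per class. As a minimal sketch, assuming the model was trained with the standard CIFAR-10 label order, the response above can be turned into a class name as shown below. The output key `activation_5` is copied from this particular response and may differ for other models, and `CLASS_NAMES` is defined here for illustration.

```python
# Response copied from the output above, trimmed to the relevant field.
response = {"outputs": {"activation_5": {"float_val": [
    0.003199951956048608, 0.21724414825439453, 0.0012326259166002274,
    0.003975691739469767, 0.004360983148217201, 0.0010368996299803257,
    0.40071144700050354, 0.037878673523664474, 0.0004917706828564405,
    0.32986781001091003]}}}

# Standard CIFAR-10 class names, in label-index order (assumed to match training).
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

probs = response["outputs"]["activation_5"]["float_val"]
top_index = max(range(len(probs)), key=probs.__getitem__)  # argmax
top_class = CLASS_NAMES[top_index]
print(top_index, top_class, round(sum(probs), 3))
```

Because the input was random noise, the predicted class carries no meaning; the point is only that the endpoint returns a well-formed probability vector.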

## Hyperparameter tuning

In [22]:
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner

hyperparameter_ranges = {'learning_rate': ContinuousParameter(0.01, 0.2)}
objective_metric_name = 'loss'
objective_type = 'Minimize'
metric_definitions = [{'Name': 'loss',
                       'Regex': 'loss = ([0-9\\.]+)'}]

tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=9,
                            max_parallel_jobs=3,
                            objective_type=objective_type)
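
The `Regex` entry tells SageMaker how to pull the objective metric out of the training logs. The sketch below applies the same pattern with Python's `re` module to a made-up log line written in the style of TensorFlow Estimator logging. The sample line is hypothetical and was not captured from a real training job.

```python
import re

LOSS_REGEX = 'loss = ([0-9\\.]+)'  # same pattern as in metric_definitions

# Hypothetical log line, for illustration only.
sample_line = "INFO:tensorflow:loss = 1.8342, step = 101 (12.345 sec)"

match = re.search(LOSS_REGEX, sample_line)
loss = float(match.group(1)) if match else None
print(loss)
```

Note that the pattern is unanchored, so it would also match a line that contains, for example, `val_loss = 0.5`. Tighten it if your logs report more than one loss value.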

In [23]:
tuner.fit(inputs)

In [26]:
boto3.client('sagemaker').describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner.latest_tuning_job.job_name)['HyperParameterTuningJobStatus']

'InProgress'
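
Since the status call above returns immediately, you may want to poll until the job finishes. Here is a minimal sketch of such a loop. It assumes that `Completed`, `Failed`, and `Stopped` are the terminal statuses, and it takes the status lookup as a callable so that the example runs without AWS access. `wait_for_tuning_job` is a hypothetical helper, not part of boto3 or the SageMaker SDK.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}  # assumed terminal states


def wait_for_tuning_job(get_status, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or max_polls is reached."""
    status = get_status()
    for _ in range(max_polls - 1):
        if status in TERMINAL_STATUSES:
            break
        sleep(poll_seconds)
        status = get_status()
    return status


# Stand-in for the boto3 call shown above, so the sketch runs offline.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_tuning_job(lambda: next(fake_statuses), sleep=lambda s: None)
print(final_status)
```

With a real job, `get_status` would wrap the `describe_hyper_parameter_tuning_job` call shown above and return its `HyperParameterTuningJobStatus` field.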

## Cleaning up
To avoid incurring charges to your AWS account for the resources used in this tutorial, delete the SageMaker endpoint:

In [None]:
sagemaker.Session().delete_endpoint(predictor.endpoint)