## Training and deploying a custom estimator

In [2]:
import os
import sagemaker

from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

role = get_execution_role()

## First we upload our data to S3

When we processed the data for the canned estimator, we saved three datasets with the prefix 'post' to the data directory. They will be uploaded to S3.

In [3]:
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/post')
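
Before uploading, it can be worth confirming that the three files are actually present in `data`. The following is a minimal sketch, assuming the file names used later in this notebook (`post_train.csv`, `post_test.csv`, `post_holdout.csv`); `missing_datasets` is a hypothetical helper and not part of the SageMaker SDK:

```python
import os

# File names taken from the input functions and the prediction cell in this notebook.
EXPECTED_FILES = ("post_train.csv", "post_test.csv", "post_holdout.csv")


def missing_datasets(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(data_dir, name))]


# Usage: an empty list means all three datasets are ready to upload.
# missing = missing_datasets('data')
# if missing:
#     raise FileNotFoundError('Missing datasets: {}'.format(missing))
```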

### TensorFlow addresses the use of custom estimators [here](https://www.tensorflow.org/get_started/custom_estimators).

A ```model_fn``` function implements model training, evaluation, and prediction. SageMaker's [repo](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_abalone_age_predictor_using_keras/tensorflow_abalone_age_predictor_using_keras.ipynb) has an in-depth explanation of how these functions are constructed.



```python
def model_fn(features, labels, mode, params):
    # Logic to do the following:
    # 1. Configure the model via TensorFlow or Keras operations
    # 2. Define the loss function for training/evaluation
    # 3. Define the training operation/optimizer
    # 4. Generate predictions
    # 5. Return predictions/loss/train_op/eval_metric_ops in EstimatorSpec object
    return EstimatorSpec(mode, predictions, loss, train_op, eval_metric_ops)
```
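
Which of those return values matter depends on the mode: prediction needs only `predictions`, evaluation needs `loss`, and training needs both `loss` and `train_op`. As a rough, framework-free sketch of that contract (the `REQUIRED_FIELDS` table, the mode labels, and the `check_spec` helper are illustrative names, not TensorFlow APIs):

```python
# Illustrative only: mirrors which EstimatorSpec arguments each mode requires.
REQUIRED_FIELDS = {
    "predict": ("predictions",),
    "eval": ("loss",),
    "train": ("loss", "train_op"),
}


def check_spec(mode, **fields):
    """Return the required field names that were not supplied (or are None)."""
    return [name for name in REQUIRED_FIELDS[mode] if fields.get(name) is None]


print(check_spec("train", loss=0.7))           # reports that train_op is missing
print(check_spec("predict", predictions=[6]))  # nothing missing
```

This is also why the full estimator further down returns early in prediction mode, before the loss is built: no labels are available when serving.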

If you are somewhat familiar with machine learning and deep learning, configuring the model and defining the loss and optimizer may already be familiar to you. A few issues to pay attention to:

* Make sure you are passing the correct outputs to your loss function. For classification problems, pass the raw class scores (logits), as the full estimator below does. For regression problems, pass your output through a linear activation and reshape it:

```python

  # Connect the output layer to second hidden layer (no activation fn)
  output_layer = Dense(1, activation='linear')(second_hidden_layer)

  # Reshape output layer to 1-dim Tensor to return predictions
  predictions = tf.reshape(output_layer, [-1])
  
```

* Make sure to create a predictions dictionary with the output you want when serving predictions from your endpoint.


* Make sure to set the feature size in ```serving_input_fn(params)```. When we processed our data, we tokenized the text and created a bag-of-words matrix with the maximum number of words set to 500, so the inputs must match this dimension.

```python

    inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [None, 500])}
```

* Make sure to set the correct datatypes in the input function ```_input_fn(training_dir, training_filename)```. The label and the features have distinct datatypes:

```load_csv_without_header(filename=os.path.join(training_dir, training_filename), target_dtype=np.int, features_dtype=np.float32)```

* Make sure the target label you train on uses categorical (integer class) encoding rather than one-hot encoding. This is true for the canned TensorFlow estimators and for custom estimators that perform classification. I was able to build and train custom Keras estimators locally with one-hot encoding, but this always failed when submitting training to SageMaker's infrastructure.
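
If your labels are already one-hot encoded, they can be collapsed back to integer class indices before the CSV is written. A minimal pure-Python sketch follows; `one_hot_to_index` is a hypothetical helper name, and it assumes each row contains exactly one 1:

```python
def one_hot_to_index(row):
    """Convert a one-hot row such as [0, 0, 1, 0] to its class index (2)."""
    if sum(row) != 1 or any(value not in (0, 1) for value in row):
        raise ValueError("expected exactly one 1 in a one-hot row: {!r}".format(row))
    return list(row).index(1)


one_hot_labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
class_indices = [one_hot_to_index(row) for row in one_hot_labels]
print(class_indices)  # [0, 2, 1]
```

With NumPy arrays, `np.argmax(labels, axis=1)` gives the same result in one call.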


***

### Full estimator:

***

```python
import numpy as np
import os
import tensorflow as tf
from tensorflow.python.estimator.export.export import build_raw_serving_input_receiver_fn
from tensorflow.python.estimator.export.export_output import PredictOutput


INPUT_TENSOR_NAME = "inputs"
SIGNATURE_NAME = "serving_default"
LEARNING_RATE = 0.001


def model_fn(features, labels, mode, params):
    
    # 1. Configure the neural net, in this case a very simple two layer network.
    first_hidden_layer = tf.keras.layers.Dense(128, activation='relu', name='first-layer')(features[INPUT_TENSOR_NAME])
    second_hidden_layer = tf.keras.layers.Dense(256, activation='relu')(first_hidden_layer)
    logits = tf.keras.layers.Dense(20)(second_hidden_layer)

    # 1a. This is a classification example, so we need to find our class predictions.
    predicted_classes = tf.argmax(logits, axis=1)

    # Provide an estimator spec for `ModeKeys.PREDICT`.
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions={
                'class_ids': predicted_classes[:, tf.newaxis],
                'probabilities': tf.nn.softmax(logits),
                'logits': logits,
            },
            export_outputs={SIGNATURE_NAME: PredictOutput({"jobs": predicted_classes})})

    # 2. Define the loss function for training/evaluation using TensorFlow.
    loss = tf.losses.sparse_softmax_cross_entropy(tf.cast(labels, dtype=tf.int32), logits)

    # 3. Define the training operation/optimizer using a TensorFlow operation/optimizer.
    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=tf.contrib.framework.get_global_step(),
        learning_rate=params["learning_rate"],
        optimizer="Adam")

    # 4. Generate predictions as TensorFlow tensors (note: this dict is not passed to the EstimatorSpec below).
    predictions_dict = {"jobs": predicted_classes,
                        "classes": logits}

    # 5. Generate necessary evaluation metrics.
    # Calculate accuracy
    eval_metric_ops = {
        "accuracy": tf.metrics.accuracy(labels, predicted_classes)
    }

    # Provide an estimator spec for `ModeKeys.EVAL` and `ModeKeys.TRAIN` modes.
    return tf.estimator.EstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=train_op,
        eval_metric_ops=eval_metric_ops)
    

def serving_input_fn(params):
    inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [None, 500])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)


params = {"learning_rate": LEARNING_RATE}


def train_input_fn(training_dir, params):
    return _input_fn(training_dir, 'post_train.csv')


def eval_input_fn(training_dir, params):
    return _input_fn(training_dir, 'post_test.csv')

def _input_fn(training_dir, training_filename):
    training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
        filename=os.path.join(training_dir, training_filename), target_dtype=np.int, features_dtype=np.float32)

    return tf.estimator.inputs.numpy_input_fn(
        x={INPUT_TENSOR_NAME: np.array(training_set.data)},
        y=np.array(training_set.target),
        num_epochs=None,
        shuffle=True)()
```
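
Before submitting the job, it may help to check that the CSV files match what the estimator expects. The sketch below assumes that `load_csv_without_header` reads the label from the last column (its default, as far as I know), that each row therefore holds 500 feature values followed by one integer label, and that valid labels run from 0 to 19 to match the `Dense(20)` logits layer. `find_bad_rows` is a hypothetical helper:

```python
import csv

N_FEATURES = 500   # matches the [None, 500] placeholder in serving_input_fn
N_CLASSES = 20     # matches the Dense(20) logits layer in model_fn


def find_bad_rows(csv_path, n_features=N_FEATURES, n_classes=N_CLASSES):
    """Return (row_number, reason) pairs for rows the estimator would not expect.

    Assumes no header row and the label in the last column.
    """
    problems = []
    with open(csv_path, newline="") as handle:
        for row_number, row in enumerate(csv.reader(handle), start=1):
            if len(row) != n_features + 1:
                problems.append((row_number, "expected {} columns, found {}".format(
                    n_features + 1, len(row))))
                continue
            try:
                label = int(row[-1])
            except ValueError:
                problems.append((row_number, "label is not an integer"))
                continue
            if not 0 <= label < n_classes:
                problems.append((row_number, "label {} outside [0, {})".format(
                    label, n_classes)))
    return problems


# Usage: an empty list means every row has the expected shape and label range.
# print(find_bad_rows('data/post_train.csv'))
```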

## Submit model for training on SageMaker's infrastructure

In [4]:
from sagemaker.tensorflow import TensorFlow

custom_estimator = TensorFlow(entry_point='custom_estimator.py',
                               role=role,
                               training_steps= 1000,                                  
                               evaluation_steps= 100,
                               hyperparameters={'learning_rate': 0.001},
                               train_instance_count=1,
                               train_instance_type='ml.c4.xlarge')

custom_estimator.fit(inputs)

INFO:sagemaker:Creating training-job with name: sagemaker-tensorflow-py2-cpu-2018-02-13-15-09-28-445


................................................................
executing startup script (first run)
2018-02-13 15:14:41,613 INFO - root - running container entrypoint
2018-02-13 15:14:41,613 INFO - root - starting train task
2018-02-13 15:14:43,307 INFO - botocore.vendored.requests.packages.urllib3.connectionpool - Starting new HTTP connection (1): 169.254.170.2
2018-02-13 15:14:44,246 INFO - botocore.vendored.requests.packages.urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com
2018-02-13 15:14:44,321 INFO - botocore.vendored.requests.packages.urllib3.connectionpool - Starting new HTTPS connection (1): s3.us-east-2.amazonaws.com
2018-02-13 15:14:44,418 INFO - botocore.vendored.requests.packages.urllib3.connectionpool - Starting new HTTPS connection (1): s3.amazonaws.com
INFO:tensorflow:----------------------TF_CONFIG--------------------------
INFO:tensorflow:{"environment": "cloud",

===== Job Complete =====


## Deploy model

In [5]:
custom_predictor = custom_estimator.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

INFO:sagemaker:Creating model with name: sagemaker-tensorflow-py2-cpu-2018-02-13-15-09-28-445
INFO:sagemaker:Creating endpoint with name sagemaker-tensorflow-py2-cpu-2018-02-13-15-09-28-445


--------------------------------------------------------------------------------------------------!

## Test Predictions

In [6]:
import tensorflow as tf
import numpy as np

prediction_set = tf.contrib.learn.datasets.base.load_csv_without_header(
    filename=os.path.join('data', 'post_holdout.csv'), target_dtype=np.int, features_dtype=np.float32)

data = prediction_set.data[0]
tensor_proto = tf.make_tensor_proto(values=np.asarray(data), shape=[1, len(data)], dtype=tf.float32)

In [7]:
custom_predictor.predict(tensor_proto)

{'outputs': {'jobs': {'dtype': 'DT_INT64',
   'int64Val': ['6'],
   'tensorShape': {'dim': [{'size': '1'}]}}}}
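
The response is a nested dictionary, and the predicted class arrives as a string inside `int64Val`. Here is a small sketch for pulling it out as an integer, assuming the response shape shown above and the `jobs` key set in `export_outputs`; `extract_class_ids` is a hypothetical helper:

```python
def extract_class_ids(response, output_name="jobs"):
    """Return the predicted class ids from an endpoint response as a list of ints."""
    values = response["outputs"][output_name]["int64Val"]
    return [int(value) for value in values]


# The response printed above, copied verbatim.
example_response = {'outputs': {'jobs': {'dtype': 'DT_INT64',
                                         'int64Val': ['6'],
                                         'tensorShape': {'dim': [{'size': '1'}]}}}}
print(extract_class_ids(example_response))  # [6]
```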

## Delete endpoint (make sure to do this if just testing)

In [8]:
sagemaker.Session().delete_endpoint(custom_predictor.endpoint)

INFO:sagemaker:Deleting endpoint with name: sagemaker-tensorflow-py2-cpu-2018-02-13-15-09-28-445
