## Testing your algorithm on your local machine

When you're packaging your first algorithm to use with Amazon SageMaker, you probably want to test it yourself to make sure it's working correctly. We use the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk) to test both locally and on SageMaker. For more examples with the SageMaker Python SDK, see [Amazon SageMaker Examples](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk). To test our algorithm, we need our dataset.

## SageMaker Python SDK Local Training
To represent our training, we use the Estimator class, which needs five pieces of configuration:
1. role - our IAM execution role.
2. instance_count - number of instances to use for training.
3. instance_type - type of instance to use for training. For training locally, we specify `local`.
4. image_uri - the custom Docker image we created.
5. hyperparameters - hyperparameters we want to pass.

Let's start by setting up our IAM role. We make use of a helper function within the Python SDK. This function throws an exception if run outside of a SageMaker notebook instance, as it gets metadata from the notebook instance. If running outside, you must provide an IAM role with the proper access stated above in [Permissions](#Permissions).

In [None]:
from sagemaker import get_execution_role

role = get_execution_role()

## Fit, Deploy, Predict

Now that the rest of our estimator is configured, we can call `fit()` with the paths to our local training and test datasets, each prefixed with `file://`. This invokes our container with `train` and passes in our hyperparameters and other metadata as JSON files in `/opt/ml/input/config` within the container.
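
As a rough illustration of the training side of this contract, the sketch below reads the hyperparameters file the way a training script inside the container might. SageMaker writes the values to `hyperparameters.json` in that directory as strings, so numeric values have to be converted back. The function name `load_hyperparameters` and the type table are hypothetical and are not taken from this container's actual code.

```python
import json
from pathlib import Path

# Hypothetical table: which hyperparameters the script expects as integers.
HYPERPARAMETER_TYPES = {
    "prediction-length": int,
    "context-length": int,
    "epochs": int,
    "batch-size": int,
    "num-batches-per-epoch": int,
}


def load_hyperparameters(config_dir="/opt/ml/input/config"):
    """Read hyperparameters.json and convert known numeric values from strings."""
    path = Path(config_dir) / "hyperparameters.json"
    with path.open() as f:
        raw = json.load(f)
    # Names without an entry in the table (e.g. 'algo-name', 'freq') stay strings.
    return {
        name: HYPERPARAMETER_TYPES.get(name, str)(value)
        for name, value in raw.items()
    }
```

Inside the container this would be called with no arguments; passing a different `config_dir` makes it easy to try the function against a hand-written file on your own machine.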

After training succeeds, our training algorithm writes the trained model to the `/opt/ml/model` directory, which is then used to serve predictions.

We can then call `deploy()` with an instance count and an instance type, here 1 and `local`. This invokes our container with `serve`, which sets up our container to handle prediction requests. What is returned is a predictor, which is used to make inferences against our trained model.
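
For orientation, SageMaker expects a serving container to answer `GET /ping` health checks and `POST /invocations` inference requests on port 8080. The sketch below shows only that routing logic as a plain function. The names `route` and `predict_stub` are hypothetical, and the real container would run a trained model behind a web server rather than the stub used here.

```python
import json


def predict_stub(request):
    """Placeholder for the model: report how many series were received."""
    return {"received_instances": len(request.get("instances", []))}


def route(method, path, body=b""):
    """Return (status_code, response_body) for one HTTP request."""
    if method == "GET" and path == "/ping":
        # Health check: an empty 200 response signals that the container is up.
        return 200, b""
    if method == "POST" and path == "/invocations":
        try:
            request = json.loads(body)
        except ValueError:
            return 400, b'{"error": "request body is not valid JSON"}'
        if not isinstance(request, dict):
            return 400, b'{"error": "request body must be a JSON object"}'
        return 200, json.dumps(predict_stub(request)).encode("utf-8")
    # Anything else is not part of the serving contract.
    return 404, b""
```

Keeping the routing separate from the web server makes it possible to exercise the request handling without starting a container.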

After our prediction, we can delete our endpoint.

We recommend training and testing your algorithm locally first, as it gives quicker iterations and is easier to debug.

In [None]:
# Let's set up our SageMaker notebook instance for local mode.
!/bin/bash ./utils/setup.sh

In [None]:
from sagemaker.estimator import Estimator
import subprocess

hyperparameters = {'algo-name': 'DeepAR', 'freq': '1D', 'prediction-length': 5, 'context-length': 5*30, 'epochs': 2, 'batch-size': 32, 'num-batches-per-epoch': 50}
# hyperparameters = {'algo-name': 'NPTS', 'freq': '1D', 'prediction-length': 5, 'context-length': 5*30, 'epochs': 2, 'batch-size': 32, 'num-batches-per-epoch': 50}
# hyperparameters = {'algo-name': 'SeasonalNaive', 'freq': '1D', 'prediction-length': 5, 'context-length': 5*30, 'epochs': 2, 'batch-size': 32, 'num-batches-per-epoch': 50}

instance_type = 'local'

# if subprocess.call('nvidia-smi') == 0:
#     ## Set type to GPU if one is present
#     instance_type = 'local_gpu'
    
print("Instance type = " + instance_type)

estimator = Estimator(role=role,
                      instance_count=1,
                      instance_type=instance_type,
                      image_uri='gluonts_sagemaker:latest',
                      hyperparameters=hyperparameters)

estimator.fit({'train': 'file:///home/ec2-user/SageMaker/gluonts_sagemaker/data/train_1D.json', 'test': 'file:///home/ec2-user/SageMaker/gluonts_sagemaker/data/test_1D.json'})

In [None]:
predictor = estimator.deploy(1, instance_type)

## Making predictions using Python SDK

To make predictions, we send a time series in JSON format as an inference request to the local endpoint. The request holds the series under `instances`, with its last five points removed, along with the frequency and the target quantile.

The JSON response contains the forecast returned by the model. Since we trained the model for only two epochs, we aren't expecting very accurate results.
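
The request body built in the cells below can also be expressed as a small helper. This is a sketch that mirrors the fields used in this notebook (`instances`, `freq`, `target_quantile`). The function name `build_request` and the shortened example series are illustrative assumptions, and the field names are simply the ones sent to this container, not a general SageMaker format.

```python
import json


def build_request(series, freq, prediction_length, target_quantile=0.5):
    """Build the JSON body for one series, holding out its last points.

    The final `prediction_length` values are dropped from `target` so the
    forecast can later be compared against them.
    """
    if prediction_length <= 0 or prediction_length >= len(series["target"]):
        raise ValueError("prediction_length must be between 1 and len(target) - 1")
    # Copy the series so the caller's dictionary is left unchanged.
    history = dict(series, target=series["target"][:-prediction_length])
    body = {"instances": [history], "freq": freq, "target_quantile": target_quantile}
    return json.dumps(body)


# Shortened, made-up series used only to show the shape of the payload.
series = {
    "start": "2020-01-22 00:00:00",
    "target": [0.0, 1.0, 4.0, 7.0, 11.0, 16.0, 21.0],
    "id": "example",
}
payload = build_request(series, freq="1D", prediction_length=5)
print(payload)
```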

In [None]:
!head -n 1 /home/ec2-user/SageMaker/gluonts_sagemaker/data/test_1D.json

In [None]:
import requests
import json

url = 'http://localhost:8080/invocations'

test_data = {"start": "2020-01-22 00:00:00", "target": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 4.0, 4.0, 5.0, 7.0, 7.0, 7.0, 11.0, 16.0, 21.0, 22.0, 22.0, 22.0, 24.0, 24.0, 40.0, 40.0, 74.0, 84.0, 94.0, 110.0, 110.0, 120.0, 170.0, 174.0, 237.0], "id": "Afghanistan_"}
test_data['target'] = test_data['target'][:-5]

data = {'instances': [test_data]}
data['freq'] = '1D'
data['target_quantile'] = 0.5

payload = json.dumps(data)

r = requests.post(url, data=payload)

# Show the result
print(r.text)

In [None]:
predictor.delete_endpoint()

# Part 2: Training and Hosting your Algorithm in Amazon SageMaker
Once you have your container packaged, you can use it to train and serve models. Let's do that with the algorithm we made above.

## Set up the environment
Here we specify the S3 prefix under which our data is stored. We reuse the role defined earlier for working with SageMaker.

In [None]:
# S3 prefix
prefix = 'gluonts_sagemaker'

## Create the session

The session remembers our connection parameters to SageMaker. We use it to perform all of our SageMaker operations.

In [None]:
import sagemaker as sage

sess = sage.Session()

## Upload the data for training

We will use the tools provided by the SageMaker Python SDK to upload the data to a default bucket.

In [None]:
WORK_DIRECTORY = '/home/ec2-user/SageMaker/gluonts_sagemaker/data'
data_location = sess.upload_data(WORK_DIRECTORY, key_prefix=prefix)
print(data_location)

## Training on SageMaker
Training a model on SageMaker with the Python SDK is similar to training it locally. We change `instance_type` from `local` to one of the [supported EC2 instance types](https://aws.amazon.com/sagemaker/pricing/instance-types/).

In addition, we must now specify the ECR image URL of the image we pushed above.

Finally, our local training dataset has to be in Amazon S3 and the S3 URL to our dataset is passed into the `fit()` call.

Let's first fetch the ECR image URL that corresponds to the image we just built and pushed.

In [None]:
import boto3

client = boto3.client('sts')
account = client.get_caller_identity()['Account']

my_session = boto3.session.Session()
region = my_session.region_name

algorithm_name = 'gluonts_sagemaker'

if region.startswith('cn'):
    ecr_image = '{}.dkr.ecr.{}.amazonaws.com.cn/{}:latest'.format(account, region, algorithm_name)
else:
    ecr_image = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account, region, algorithm_name)

print(ecr_image)

In [None]:
from sagemaker.estimator import Estimator

instance_type = 'ml.c5.2xlarge'

estimator = Estimator(role=role,
                      instance_count=1,
                      instance_type=instance_type,
                      image_uri=ecr_image,
                      hyperparameters=hyperparameters,
                      base_job_name=algorithm_name.replace('_', '-'))

estimator.fit({'train': data_location+'/train_1D.json', 'test': data_location+'/test_1D.json'})

In [None]:
instance_type = 'ml.m5.xlarge'
predictor = estimator.deploy(1, instance_type)

In [None]:
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

result = predictor.predict(data)
print(result)

## Optional cleanup
When you're done with the endpoint, you should clean it up.

All of the training jobs, models and endpoints we created can be viewed through the SageMaker console of your AWS account.

In [None]:
predictor.delete_endpoint()

## Batch Transform

In [None]:
transformer = estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    strategy="SingleRecord",
    assemble_with="Line",
    output_path=data_location+'/output',
)

test_s3 = data_location + '/test_1D.json'

transformer.transform(test_s3, split_type="Line", wait=True)
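
To make the transform settings concrete: with `split_type="Line"` and `strategy="SingleRecord"`, each line of the input file is sent to the container as its own request, and `assemble_with="Line"` joins the responses with newlines in the output object. The sketch below imitates that flow locally. `simulate_line_transform` and `fake_invoke` are hypothetical stand-ins, the two input records are made up, and the real container's response format may differ.

```python
import json


def simulate_line_transform(input_text, invoke):
    """Split input by line, call `invoke` once per record, join results by line."""
    records = [line for line in input_text.splitlines() if line.strip()]
    responses = [invoke(record) for record in records]
    return "\n".join(responses)


def fake_invoke(record):
    """Stand-in for the endpoint: report the id and length of each series."""
    series = json.loads(record)
    return json.dumps({"id": series["id"], "n_points": len(series["target"])})


# Two made-up JSON Lines records in the same shape as the training data.
input_text = (
    '{"start": "2020-01-22 00:00:00", "target": [0.0, 1.0, 4.0], "id": "a"}\n'
    '{"start": "2020-01-22 00:00:00", "target": [2.0, 3.0], "id": "b"}\n'
)
output_text = simulate_line_transform(input_text, fake_invoke)
print(output_text)
```

On SageMaker, the assembled results are written under `output_path` as an object named after the input file with `.out` appended.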

# Reference
- [How Amazon SageMaker interacts with your Docker container for training](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html)
- [How Amazon SageMaker interacts with your Docker container for inference](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html)
- [CIFAR-10 Dataset](https://www.cs.toronto.edu/~kriz/cifar.html)
- [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk)
- [Dockerfile](https://docs.docker.com/engine/reference/builder/)
- [scikit-bring-your-own](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb)