# Intro

This notebook shows how to use SageMaker with a custom in-house model through the SageMaker Python SDK.

Content:
- Training
- Hyperparameter Optimization
- Inference:
    - Batch Transform
    - Endpoint

#### Let's first set the proxy variables

In [1]:
import os

os.environ['HTTP_PROXY'] = "***"
os.environ['HTTPS_PROXY'] = "***"
os.environ['no_proxy'] = "***"

#### Configuration

* We have gathered all our program configuration in a `yml` configuration file.

In [2]:
import subprocess
import sys
sys.path.insert(0,'..')

import src.config as cf

In [3]:
config = cf.ProgramConfiguration("../conf/dev.yml",
                                 "../conf/functional.yml")



## 1. Build Docker training image

#### Now let's build the image

> Let's check out the content of `container/Dockerfile_train`

> What happens in `container/Dockerfile_train`?

Well, we do the following:

- Copy the ML source code (from `src/` and `conf/`) to `/opt/program` in the Docker image that will run on the ML instance we launch. Why `/opt/program`? Because SageMaker expects all source code to be there (it is the **WORKDIR** in the base Docker images we use; we could change it, but it is better to stick to the convention). We also copy the `container/requirements_train.txt` file to the image.

*PS: Any source code you don't COPY will not be in the Docker image, and therefore will not be available on the ML instance you launch.*


- We set a few SageMaker environment variables (**SM_CHANNEL_TRAIN**, **SM_MODEL_DIR**, **SM_DATA_DIR**) to tell SageMaker where to look for training data and where to put model artefacts.

*PS: SageMaker copies your training data from the S3 path you provide (when you call SageMaker for training; see the Estimator class later in this notebook) to the path **SM_CHANNEL_TRAIN** in the container. So your ML code should not read data from S3, but from this local path (again, **local** to the container).*

- Lastly, we configure the container to run as a Python executable. When SageMaker is called to do training (the SDK's *Estimator*, boto3's *create_training_job()*, etc.), it passes an argument called `train` when it runs the Docker image in a container. Since we configured the container as a Python executable, what runs in the container is:

```sh
python train
```

which means that the *file* called **train** in `/opt/program` (copied from our local `src/train`, remember?) is executed. This train file is where your ML code should be! More on this later...

*PS: We can change the name train if we'd like, through the SageMaker environment variable **SM_PROGRAM**, but again, the convention is to keep it as is.*
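
To make the role of the **train** file more concrete, here is a minimal sketch of what such a script could look like. It is not the actual `src/train`: the function names, the `model.json` artefact and the fallback paths are assumptions made for illustration. It reads the data and model locations from the environment variables mentioned above, and the hyperparameters from `/opt/ml/input/config/hyperparameters.json`, which is where SageMaker writes them (as strings) for a custom container.

```python
import json
import os
from pathlib import Path


def resolve_paths(env=None):
    """Return (train_dir, model_dir) from the environment, with assumed fallbacks."""
    env = os.environ if env is None else env
    train_dir = Path(env.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/training"))
    model_dir = Path(env.get("SM_MODEL_DIR", "/opt/ml/model"))
    return train_dir, model_dir


def load_hyperparameters(path="/opt/ml/input/config/hyperparameters.json"):
    """Read the hyperparameters file; every value arrives as a string."""
    path = Path(path)
    if not path.exists():
        return {}
    with path.open() as f:
        return json.load(f)


def run_training(env=None,
                 hyperparameters_path="/opt/ml/input/config/hyperparameters.json"):
    """Placeholder training step: lists the input files and writes a dummy artefact."""
    train_dir, model_dir = resolve_paths(env)
    hyperparameters = load_hyperparameters(hyperparameters_path)
    input_files = sorted(p.name for p in train_dir.iterdir()) if train_dir.is_dir() else []

    # A real script would fit the model here; we only record what we were given.
    model_dir.mkdir(parents=True, exist_ok=True)
    artefact = model_dir / "model.json"
    artefact.write_text(json.dumps({"hyperparameters": hyperparameters,
                                    "input_files": input_files}))
    return artefact
```

In the container, the script would end with a call to `run_training()`. Whatever is written to the model directory is what SageMaker packages into `model.tar.gz` at the end of the job.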

NOW let's build the image !

> Let's check out the script that builds the image: `../container/build_image_train.sh`

> The script takes the name to give the image as an argument, authenticates to ECR, then builds the image and pushes it there

In [7]:
IMAGE_NAME = 'sagemaker-tutorial-mlp'

In [None]:
%%sh -s "$IMAGE_NAME"

cd ..
sh container/build_image_train.sh $1

## 2. Bring your own : Training

#### Let's get the location of our training image

In [8]:
import boto3
from sagemaker import get_execution_role
from sagemaker.estimator import Estimator

In [9]:
region = boto3.session.Session().region_name
account = boto3.client('sts').get_caller_identity()['Account']
tag = ':latest'

The full path to the training image ( as it's coded in `container/build_image_train.sh` ) is:

In [10]:
image_name = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account, region, IMAGE_NAME + tag)
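
The same URI can be assembled by a small helper that also fails early when a piece is missing (for instance, `region_name` is `None` when no default region is configured). This is only a sketch: `ecr_image_uri` is a hypothetical helper and the account ID below is made up.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Build <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>."""
    if not all([account, region, repository, tag]):
        raise ValueError("account, region, repository and tag must all be non-empty")
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, repository, tag)


# Hypothetical account ID, for illustration only.
uri = ecr_image_uri("123456789012", "eu-west-1", "sagemaker-tutorial-mlp")
print(uri)
```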

The SageMaker role associated with this notebook:

In [None]:
get_execution_role()

Let's get the rest of the training job configurations.

*You can just set your variable values directly in here instead of doing it through a configuration file if you'd like*

In [13]:
role = config.get_global_role_arn() # same as = get_execution_role() in this case
bucket = config.get_train_bucket_input()
project_id = config.get_train_path_refined_data_input()
hyperparameters = config.get_train_hyperparameters()
train_instance_count = config.get_train_instance_count()
train_instance_type = config.get_train_instance_type()
security_group_ids = config.get_global_security_group_ids()
subnets = config.get_global_subnets()

model_output = "***/sagemaker-tutorials/"
model_uri = "s3://{}/{}".format(bucket, model_output)

In [None]:
print("- role:", role,
      "\n- image name:", image_name,
      "\n- bucket:", bucket,
      "\n- project_id:", project_id,
      "\n- hyperparameters:\n", hyperparameters,
      "\n- train_instance_count:", train_instance_count,
      "\n- train_instance_type:", train_instance_type,
      "\n- security_group_ids:", security_group_ids,
      "\n- subnets:", subnets
     )

In [17]:
# To run the Docker container locally instead of launching an ML instance
# (faster for checking your dev), set this to 'local'
train_instance_type = 'ml.m5.2xlarge' #'local'

#### Let's start the training

* The **Estimator** class from the SageMaker SDK launches the machine you specify (in the **train_instance_type** variable) and runs a container from the Docker image we built (the one with the training code in it). This means that it executes the **train** file.

So, for your ML development, you should change the `src/train` file (it depends on other files, of course, but the *program* that runs when you call SageMaker for training is **/opt/program/train**), then update `container/Dockerfile_train` if need be (to change your base image, add dependencies, etc.). Don't forget to rebuild the image, i.e. run `container/build_image_train.sh`, EVERY TIME you change the code, because the new code needs to be copied into the Docker image.

⚠️ Keep in mind :
- If you do not specify **security_group_ids** and **subnets**, you will get permission errors
- Pay attention to where your training code reads its training data from in relation to how you feed the data to SageMaker in your **.fit()** method.
If you do:

1. Pass a plain S3 URI:
```python
estimator.fit('s3://'+bucket+'/'+project_id)
```
No channel name is given, so the SDK falls back to its default channel name, `training`. The data therefore lands in `/opt/ml/input/data/training`, a sub-directory of **SM_DATA_DIR=/opt/ml/input/data**.

2. Pass a dictionary:
```python
estimator.fit({'training': 's3://'+bucket+'/'+project_id})
```
You specified a **channel** explicitly, so your code should expect to read the data from: **SM_CHANNEL_TRAIN=/opt/ml/input/data/training**
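
The channel-to-directory convention can be sketched as a tiny helper. It only illustrates the `/opt/ml/input/data/<channel>` layout for named channels; `container_paths` and the bucket below are hypothetical.

```python
from pathlib import PurePosixPath

INPUT_DATA_ROOT = PurePosixPath("/opt/ml/input/data")


def container_paths(channels):
    """Map {channel_name: s3_uri} to {channel_name: directory inside the container}."""
    return {name: str(INPUT_DATA_ROOT / name) for name in channels}


# Hypothetical bucket and prefix.
paths = container_paths({"training": "s3://my-bucket/my-project"})
print(paths)  # {'training': '/opt/ml/input/data/training'}
```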

In [19]:
estimator = Estimator(role=role,
                      train_instance_count=train_instance_count,
                      train_instance_type=train_instance_type,
                      image_name=image_name,
                      hyperparameters=hyperparameters,
                      #model_uri=model_uri,
                      security_group_ids=security_group_ids,
                      subnets=subnets
                      )

estimator.fit({'training': 's3://'+bucket+'/'+project_id})

2020-03-09 21:14:35 Starting - Starting the training job...
2020-03-09 21:14:37 Starting - Launching requested ML instances...
2020-03-09 21:15:34 Starting - Preparing the instances for training......
2020-03-09 21:16:17 Downloading - Downloading input data...
2020-03-09 21:16:43 Training - Downloading the training image......
2020-03-09 21:17:52 Training - Training image download completed. Training in progress.
INFO:root:Using CPU
{'context_length': 140, 'num_hidden_dimensions': 5, 'len_hidden_dimensions': 38, 'epochs': 20, 'batch_size': 512, 'num_batches_per_epoch': 217, 'learning_rate': 6e-05}
Generating GluonTS dataset for cutoff 202001...
INFO:root:Using CPU
INFO:root:Start model training
INFO:root:Number of parameters in SimpleFeedForwardTrainingNetwork: 535
INFO:root:Epoch[0] Learning rate is 6e-05
  0%|          | 0/217 [00:00<?, ?it/s]
100%|██████████| 217/217 [00:05<00:00, 37.04it/s, avg_epoch_loss=8.1

## 3. Bring your own : Hyperparameter Optimization

In [56]:
from time import gmtime, strftime
from sagemaker.tuner import IntegerParameter, CategoricalParameter, ContinuousParameter, HyperparameterTuner

In [57]:
tuning_job_name = 'sagemaker-tutorials-mlp-tuning'  # + strftime("%d-%H-%M-%S", gmtime())

hyperparameter_ranges = {
    'epochs': IntegerParameter(25, 30),
    'learning_rate': ContinuousParameter(1e-05, 1e-03, scaling_type="Logarithmic"),
    # Other hyperparameters that could be tuned (ranges left to choose):
    # 'batch_size': IntegerParameter(?, ?),
    # 'context_length': IntegerParameter(?, ?),
    # 'num_hidden_dimensions': IntegerParameter(?, ?),
    # 'len_hidden_dimensions': IntegerParameter(?, ?),
}
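
To get a feel for what `scaling_type="Logarithmic"` implies for the learning-rate range, the sketch below draws values uniformly in log space. It only illustrates the scale; it is not how SageMaker's tuner picks its candidates, and `sample_log_uniform` is a made-up helper.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose log10 is uniform between log10(low) and log10(high)."""
    exponent = rng.uniform(math.log10(low), math.log10(high))
    # Clamp to guard against floating-point rounding at the boundaries.
    return min(max(10 ** exponent, low), high)


rng = random.Random(0)  # fixed seed so the draws are reproducible
samples = [sample_log_uniform(1e-05, 1e-03, rng) for _ in range(1000)]

# On a log scale, roughly half of the draws fall below the geometric midpoint 1e-04.
share_below_midpoint = sum(s < 1e-04 for s in samples) / len(samples)
print(round(share_below_midpoint, 2))
```

With a linear scale, by contrast, about 90% of the draws would fall between 1e-04 and 1e-03, which is why a logarithmic scale is the usual choice for learning rates spanning several orders of magnitude.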

In [58]:
objective_metric_name = 'Final loss'
objective_type = 'Minimize'
metric_definitions = [{'Name': 'Final loss',
                       'Regex': 'Final loss: ([0-9\\.]+)'}]
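
SageMaker applies this regex to the training logs to extract the objective metric, so it is worth checking the pattern locally before launching a tuning job. The sketch below does that with Python's `re` module; the log lines are hypothetical examples of what the training code might print.

```python
import re

metric_regex = re.compile('Final loss: ([0-9\\.]+)')


def extract_metric(log_line):
    """Return the captured value as a float, or None when the line does not match."""
    match = metric_regex.search(log_line)
    return float(match.group(1)) if match else None


# Hypothetical log lines.
print(extract_metric("INFO:root:Final loss: 7.8312"))                # 7.8312
print(extract_metric("INFO:root:Epoch[0] Learning rate is 6e-05"))   # None
```

Note that this pattern only captures digits and dots: a loss printed in scientific notation such as `1e-05` would be captured as `1`, and a minus sign would be dropped.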

In [59]:
tuner = HyperparameterTuner(estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=3,
                            max_parallel_jobs=1,
                            objective_type=objective_type)

In [60]:
#tuner.fit({'training': 's3://'+bucket+'/'+project_id,
#          'test': 's3://'+bucket+'/'+project_id})
# tuner.fit('s3://'+bucket+'/'+project_id)

tuner.fit({'training': 's3://'+bucket+'/'+project_id})

Let's check that the job has started...

In [61]:
boto3.client('sagemaker').describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner.latest_tuning_job.job_name)['HyperParameterTuningJobStatus']

'InProgress'

Once completed...

In [70]:
boto3.client('sagemaker').describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner.latest_tuning_job.job_name)['HyperParameterTuningJobStatus']

'Completed'
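
Rather than re-running the status cell by hand, the check can be wrapped in a small polling loop. The sketch below takes the status lookup as a function so that it runs without AWS access; the helper name, the polling defaults and the set of terminal statuses are assumptions.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_terminal_status(get_status, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or max_polls is reached."""
    status = get_status()
    polls = 1
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(poll_seconds)
        status = get_status()
        polls += 1
    return status


# Stand-in for the describe call, so the sketch runs offline.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final = wait_for_terminal_status(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final)  # Completed
```

With the real client, `get_status` would wrap the `describe_hyper_parameter_tuning_job(...)['HyperParameterTuningJobStatus']` call shown above.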

Let's get the best hyperparameters

In [69]:
boto3.client('sagemaker').describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner.latest_tuning_job.job_name)['BestTrainingJob']['TunedHyperParameters']

{'epochs': '29', 'learning_rate': '4.364909274007211e-05'}
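
Note that the tuned values come back as strings. Before feeding them to a new training run, they may need to be converted back to numbers. A minimal sketch, with a hypothetical helper name:

```python
def parse_tuned_hyperparameters(raw):
    """Convert string values to int or float where possible, else keep the string."""
    parsed = {}
    for name, value in raw.items():
        try:
            parsed[name] = int(value)
        except ValueError:
            try:
                parsed[name] = float(value)
            except ValueError:
                parsed[name] = value
    return parsed


best = parse_tuned_hyperparameters(
    {'epochs': '29', 'learning_rate': '4.364909274007211e-05'})
print(best)
```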

## 4. Bring your own : Batch Transform

Remember: Amazon SageMaker runs your container with the argument *train* or *serve*.
How your container processes this argument depends on the container.

Here, we choose to have separate images for training and serving. We don't define an ENTRYPOINT in the Dockerfile, so Docker runs the command *serve* at serving time, which executes the file **serve**, an executable Python script.
(*However, since we are building separate containers for training and hosting, we could have defined a program as the ENTRYPOINT in the Dockerfile and ignored (or verified) the first argument passed in.*)

**Running your container during hosting**

Hosting has a very different model than training because hosting responds to inference requests that come in via HTTP. In this example, we use the recommended Python serving stack to provide robust and scalable serving of inference requests:

**> Request serving stack**

This stack is implemented in the sample code here and you can mostly just leave it alone.


![alt text](../assets/stack_serving.png "Title")


Amazon SageMaker uses two URLs in the container:

* `/ping` receives GET requests from the infrastructure. Your program returns 200 if the container is up and accepting requests.
* `/invocations` is the endpoint that receives client inference POST requests. The format of the request and the response is up to the algorithm. If the client supplied ContentType and Accept headers, these are passed in as well.

The container will have the model files in the same place they were written during training:

    /opt/ml
    |-- model
        |-- <model files>
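
The `/ping` and `/invocations` contract can be illustrated with a bare WSGI application. This is only a dependency-free sketch, not the serving code of this project: `predict` is a hypothetical stand-in for the model, and `call` is a test helper that invokes the app in-process instead of going through nginx and gunicorn.

```python
import io
from wsgiref.util import setup_testing_defaults


def predict(csv_body):
    """Hypothetical stand-in for the model: returns the number of non-empty input lines."""
    return "{}\n".format(sum(1 for line in csv_body.splitlines() if line.strip()))


def app(environ, start_response):
    """WSGI application exposing GET /ping and POST /invocations."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")
    if method == "GET" and path == "/ping":
        # Health check: 200 means the container is up and accepting requests.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b""]
    if method == "POST" and path == "/invocations":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = environ["wsgi.input"].read(length).decode("utf-8")
        start_response("200 OK", [("Content-Type", "text/csv")])
        return [predict(payload).encode("utf-8")]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b""]


def call(method, path, body=b""):
    """Invoke the app in-process, without a network socket, and return (status, body)."""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "wsgi.input": io.BytesIO(body)}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


print(call("GET", "/ping"))
print(call("POST", "/invocations", b"a,b\nc,d\n"))
```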

**The parts of the sample container**

The `container` and `src` directories hold all the components you need to package the sample algorithm for Amazon SageMaker:

    .
    |-- container
        |-- Dockerfile_serve
        |-- build_image_serve.sh
    |-- src
        |-- preprocess.py
        |-- model.py
        |-- config.py
        |-- utils.py
        |-- nginx.conf
        |-- predictor.py
        |-- serve
        |-- train
        `-- wsgi.py

Let's discuss each of these in turn:

* **Dockerfile_serve** describes how to build your Docker container image. More details below.
* **build_image_serve.sh** is a script that uses the Dockerfile to build your container image for serving and then pushes it to ECR.
* **src** is the directory which contains the files that will be copied to the container.

The files that we'll put in the container are:

* **nginx.conf** is the configuration file for the nginx front-end. Generally, you should be able to take this file as-is.
* **predictor.py** is the program that actually implements the Flask web server and the gluonts mlp predictions for this app. You'll want to customize the actual prediction parts to your application. Since this algorithm is simple, we do all the processing here in this file, but you may choose to have separate files for implementing your custom logic.
* **serve** is the program started when the container is started for hosting. It simply launches the gunicorn server which runs multiple instances of the Flask app defined in `predictor.py`. You should be able to take this file as-is.
* **wsgi.py** is a small wrapper used to invoke the Flask app. You should be able to take this file as-is.

In summary, the file you will probably want to change for this part is `predictor.py`.

In [None]:
!cat ../container/Dockerfile_serve

In [None]:
!cat ../container/build_image_serve.sh

#### Now let's build the serving image

In [119]:
import sagemaker

In [120]:
session = sagemaker.Session()
sagemaker_client = boto3.client('sagemaker')

In [49]:
SERVING_IMAGE_NAME = 'sagemaker-tutorial-mlp-serving'

In [None]:
%%sh -s "$SERVING_IMAGE_NAME"

cd ..
sh container/build_image_serve.sh $1

Now that a training job meets your criteria, you are ready to create a model. A SageMaker model pairs the serving image with the artefacts produced by the training job (`model.tar.gz`), which SageMaker can then **host** for you.

In [155]:
primary_container = {
    'Image': '***.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-tutorial-mlp-serving:latest',
    'ModelDataUrl': 's3://sagemaker-eu-west-1-***/sagemaker-tutorial-mlp-2020-03-09-21-14-33-854/output/model.tar.gz'
}
model_name = 'sagemaker-tutorials-mlp'

# Delete the old model version if it exists
from botocore.exceptions import ClientError

try:
    session.delete_model(model_name)
except ClientError:
    pass  # no previous model with this name

In [156]:
create_model_response = sagemaker_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    PrimaryContainer = primary_container)
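
The `ModelDataUrl` used above follows the pattern `s3://<bucket>/<training job name>/output/model.tar.gz`. A small helper can assemble it instead of hard-coding the string. This is a sketch: the helper name and the example values are hypothetical, and the pattern is simply read off the URL above, so it may differ if a custom output path was set on the estimator (in which case the estimator's `model_data` attribute is the safer source).

```python
def model_artifact_uri(bucket, training_job_name, prefix=""):
    """Build s3://<bucket>/[<prefix>/]<training_job_name>/output/model.tar.gz."""
    parts = [bucket] + [p for p in (prefix.strip("/"), training_job_name) if p]
    return "s3://{}/output/model.tar.gz".format("/".join(parts))


# Hypothetical bucket and job name.
print(model_artifact_uri("my-bucket", "my-training-job"))
```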

#### Now let's create the batch transform job

In [157]:
transform_output_folder = "***/sagemaker-tutorials"
output_path = "s3://{}/{}".format(bucket, transform_output_folder)

instance_type = 'ml.m5.xlarge' # 'local'
transform_job_name = "{}-batch-transform".format(model_name)

transformer = sagemaker.transformer.Transformer(
    model_name=model_name,
    instance_count=1,
    instance_type=instance_type,
    #strategy='SingleRecord',
    assemble_with='Line',
    base_transform_job_name=transform_job_name,
    output_path=output_path,
    sagemaker_session=session
)

Let's create a CSV file containing the path to our prediction input data.
Why not send the input data directly?
- Max payload: hard limit of 10 MB
- Unsupported content types

In [40]:
import pandas as pd

data = {'bucket':  [bucket],
        'file_path':  ['***/gluonts_ds_cutoff_202002.pkl']}
df = pd.DataFrame(data, columns = ['bucket', 'file_path'])

WORK_DIRECTORY = '../data'
prefix = 'DEMO-sagemaker-tutorials-mlp' # S3 prefix

df.to_csv(WORK_DIRECTORY+'/pred_ds_path.csv', index=False)

data_location = session.upload_data(WORK_DIRECTORY, key_prefix=prefix)
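
On the serving side, the container then has to turn this CSV back into S3 locations before loading the data. The sketch below shows one way the payload could be parsed; it assumes the header row arrives together with the data row in the same request, and both the helper name and the example payload are hypothetical.

```python
import csv
import io


def parse_pointer_csv(body):
    """Return a list of (bucket, file_path) tuples from a CSV payload with a header row."""
    reader = csv.DictReader(io.StringIO(body))
    return [(row["bucket"], row["file_path"]) for row in reader]


# Hypothetical payload with the same columns as pred_ds_path.csv.
payload = "bucket,file_path\nmy-bucket,some/prefix/dataset.pkl\n"
print(parse_pointer_csv(payload))  # [('my-bucket', 'some/prefix/dataset.pkl')]
```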

We use *transform()* on the transformer to get inference results for the data that we uploaded. You can use these options when invoking the transformer:

* The **data_location**, which is the location of the input data
* The **content_type**, which is the content type set when making the HTTP request to the container to get predictions
* The **split_type**, which is the delimiter used for splitting the input data
* The **input_filter**, which indicates that the first column (ID) of the input will be dropped before making the HTTP request to the container

In [None]:
transformer.transform(data_location,
                      content_type='text/csv',
                      split_type='Line')  # , input_filter='$[1:]')
transformer.wait()

#### Let's check the output of this job

In [160]:
s3_client = session.boto_session.client('s3')
s3_client.download_file(bucket,
                        "{}/pred_ds_path.csv.out".format(transform_output_folder),
                        '../data/pred_ds_path.csv.out')
with open('../data/pred_ds_path.csv.out') as f:
    results = f.readlines()   
print("Transform results: \n{}".format(''.join(results)))

Transform results: 
1055
937
1016
1002
1065
1028
1053
1073
1084
984
376
387
398
407
417
428
440
473
439
451
0
0
0
1
1
0
1
1
2
1
0
0
0
0
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1540
1601
1644
1689
1694
1735
1759
1788
1800
1926
353
350
375
390
405
405
410
438
446
442
0
0
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
0
0
0
0
0
0
1295
1264
1305
1324
1363
1360
1361
1394
1413
1397
21187
21868
24510
25373
25400
27178
28516
31126
31367
28631
45855
46978
49154
52685
53676
55727
55773
62019
60093
59379
639
635
641
667
663
682
679
686
705
702
481
493
494
509
507
522
534
534
516
537
403
409
429
433
458
453
466
484
492
509
512
521
544
568
576
582
605
613
630
620
3
3
3
3
3
3
3
3
3
3
520
508
537
549
571
575
572
571
608
610
656
665
700
720
737
738
754
803
773
774
3
5
2
3
5
3
5
6
4
6
2
2
2
2
2
2
2
2
2
2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
1
1
1
0
1
1
1
1566
1560
1636
1707
1748
1728
1728
1788
1859
1844
609
603
617
639
691
669
672
678
706
699
1030
1001
1045
1087
1104
1132
1118
1172
12

## 5. Bring your own : Endpoint

We've already coded our API. Behind the curtains of a batch transform job, SageMaker creates a hidden endpoint and generates predictions. But if you want an actual endpoint to invoke, here's how you do it... 

In [162]:
import time

In [161]:
endpoint_config_name = "{}-endpoint-config".format(model_name)
print(endpoint_config_name)

create_endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType':'ml.m4.xlarge',
        'InitialVariantWeight':1,
        'InitialInstanceCount':1,
        'ModelName':model_name,
        'VariantName':'AllTraffic'}])

sagemaker-tutorials-mlp-endpoint


In [None]:
%%time
import time

endpoint_name = "{}-endpoint".format(model_name)
print(endpoint_name)

create_endpoint_response = sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
print(create_endpoint_response['EndpointArn'])

resp = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
status = resp['EndpointStatus']
print("Status: " + status)

while status=='Creating':
    time.sleep(60)
    resp = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp['EndpointStatus']
    print("Status: " + status)

print("Arn: " + resp['EndpointArn'])
print("Status: " + status)

In [188]:
from sagemaker.predictor import RealTimePredictor
from sagemaker.predictor import csv_serializer, csv_deserializer

In [191]:
realPredictor = RealTimePredictor(endpoint_name,
                                  serializer=csv_serializer,
                                  deserializer=csv_deserializer)

In [192]:
realPredictor.predict(open('../data/pred_ds_path.csv', encoding='utf-8') )

[['1055'],
 ['937'],
 ['1016'],
 ['1002'],
 ['1065'],
 ['1028'],
 ['1053'],
 ['1073'],
 ['1084'],
 ['984'],
 ['376'],
 ['387'],
 ['398'],
 ['407'],
 ['417'],
 ['428'],
 ['440'],
 ['473'],
 ['439'],
 ['451'],
 ['0'],
 ['0'],
 ['0'],
 ['1'],
 ['1'],
 ['0'],
 ['1'],
 ['1'],
 ['2'],
 ['1'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1540'],
 ['1601'],
 ['1644'],
 ['1689'],
 ['1694'],
 ['1735'],
 ['1759'],
 ['1788'],
 ['1800'],
 ['1926'],
 ['353'],
 ['350'],
 ['375'],
 ['390'],
 ['405'],
 ['405'],
 ['410'],
 ['438'],
 ['446'],
 ['442'],
 ['0'],
 ['0'],
 ['1'],
 ['0'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['1'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['0'],
 ['1295'],
 ['1264'],


* Clean up! Delete your endpoint once you are done, otherwise it will keep running...

In [None]:
sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
sagemaker_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)

PS: Another way to invoke the endpoint (while it is still up), using the boto3 runtime client:

In [None]:
runtime_client = boto3.client('sagemaker-runtime')

endpoint_name = "sagemaker-tutorials-mlp-endpoint"               # Your endpoint name.
content_type = "text/csv"                                        # The MIME type of the input data in the request body.
accept = "text/csv"                                              # The desired MIME type of the inference in the response.
with open('../data/pred_ds_path.csv', 'rb') as f:                # Payload for inference.
    payload = f.read()

response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name, 
    ContentType=content_type,
    Accept=accept,
    Body=payload
    )

print(response['Body'].read().decode('utf-8'))                   # The predictions are in the response body.
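
With `invoke_endpoint`, the predictions come back as a byte stream under `response['Body']`, so they need to be decoded and parsed by hand. The sketch below shows one way to do that for a CSV response; `parse_csv_response` is a hypothetical helper, and `io.BytesIO` stands in for the streaming body so that the example runs without an endpoint.

```python
import csv
import io


def parse_csv_response(body_stream):
    """Read a bytes stream (such as response['Body']) and return rows as lists of strings."""
    text = body_stream.read().decode("utf-8")
    return [row for row in csv.reader(io.StringIO(text)) if row]


# io.BytesIO stands in for the streaming body of a real response.
rows = parse_csv_response(io.BytesIO(b"1055\n937\n1016\n"))
print(rows)  # [['1055'], ['937'], ['1016']]
```

Keep in mind that a stream can only be read once, so store the parsed rows if you need them again.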