### Building and registering the container

The `build-and-push.sh` script builds the container image using `docker build` and pushes it to ECR using `docker push`.

If the `gpu` argument is passed to `build-and-push.sh`, the GPU Dockerfile is used to build the GPU image. Otherwise, the CPU image is built.

The script looks for an ECR repository in the account you're using and in the current default region (if you're using a SageMaker notebook instance, this is the region where the notebook instance was created). If the repository doesn't exist, the script creates it. In addition, because we use the SageMaker PyTorch image as the base, the script needs to retrieve ECR credentials to pull this public image.
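
The "create the repository if it doesn't exist" step can be sketched in Python as follows. This is a minimal illustration, not the contents of the shell script: the function name `ensure_ecr_repository` is hypothetical, and the client argument is assumed to behave like a boto3 ECR client (for example `boto3.client("ecr")`), whose `describe_repositories` and `create_repository` calls are used here.

```python
def ensure_ecr_repository(ecr_client, repository_name):
    """Create the ECR repository if it is missing.

    `ecr_client` is assumed to be a boto3 ECR client, e.g. boto3.client("ecr").
    Returns True if the repository was created, False if it already existed.
    """
    try:
        # Raises RepositoryNotFoundException when the repository does not exist.
        ecr_client.describe_repositories(repositoryNames=[repository_name])
        return False
    except ecr_client.exceptions.RepositoryNotFoundException:
        ecr_client.create_repository(repositoryName=repository_name)
        return True
```

Passing the client in as an argument keeps the region and credentials under the caller's control; with a default boto3 session, the client uses the same default region the paragraph above describes.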

In [1]:
!./build_and_push_sagemaker.sh

Requesting CPU image
Login Succeeded
Login Succeeded
Sending build context to Docker daemon  178.8MB
Step 1/10 : ARG REGION=us-east-1
Step 2/10 : FROM 520713654638.dkr.ecr.$REGION.amazonaws.com/sagemaker-pytorch:1.1.0-cpu-py3
1.1.0-cpu-py3: Pulling from sagemaker-pytorch

7927d38a: Pulling fs layer
ac894db4: Pulling fs layer
2af6d627: Pulling fs layer
86211d23: Pulling fs layer
af39bebe: Pulling fs layer
03f425cd: Pulling fs layer
1ec18efe: Pulling fs layer
8ad8ba55: Pulling fs layer
6c282ffb: Pulling fs layer
77dfb459: Pulling fs layer
bbd8c730: Pulling fs layer
Digest: sha256:bd973d810e8cf494a37dc9cc477b619d13da901d5f2804a953064b5bafc1e484

62ef907: Pushed   114.7MB/114.5MB

## Testing your algorithm on your local machine

When you're packaging your first algorithm to use with Amazon SageMaker, you probably want to test it yourself to make sure it's working correctly. We use the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk) to test both locally and on SageMaker. For more examples with the SageMaker Python SDK, see [Amazon SageMaker Examples](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk). In order to test our algorithm, we need our dataset.

## SageMaker Python SDK Local Training
To represent our training, we use the Estimator class, which needs to be configured with five parameters:
1. role - our AWS IAM execution role.
2. instance_count - number of instances to use for training.
3. instance_type - type of instance to use for training. For training locally, we specify `local`.
4. image_uri - the custom PyTorch Docker image we created.
5. hyperparameters - hyperparameters we want to pass.

These are the parameter names used in the code below; version 1 of the SDK called them `train_instance_count`, `train_instance_type`, and `image_name`.

Let's start by setting up our IAM role. We use a helper function from the Python SDK. This function throws an exception if run outside of a SageMaker notebook instance, because it reads metadata from the notebook instance.
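
Because the helper fails outside a notebook instance, one option is to fall back to an explicitly supplied role ARN. The sketch below is only an illustration: `resolve_execution_role` is a hypothetical name, and it catches `Exception` broadly because the exact exception type raised by the SDK helper is treated here as an assumption.

```python
def resolve_execution_role(get_role, fallback_role_arn=None):
    """Return the role from `get_role()`, or `fallback_role_arn` if that call fails.

    `get_role` is any zero-argument callable, e.g. sagemaker.get_execution_role.
    """
    try:
        return get_role()
    except Exception:
        # Broad on purpose: the exception type depends on the SDK (assumption).
        if not fallback_role_arn:
            raise  # nothing to fall back to, so surface the original error
        return fallback_role_arn
```

Inside a notebook instance this would be called as `resolve_execution_role(get_execution_role)`; elsewhere, pass the ARN of an IAM role that has SageMaker permissions as the second argument.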

### Training the Reinforcement Learning Model Locally
Note that we are only training for 200 iterations, which is too few to see any increase in the average score. We are purely checking for mechanical errors.

In [3]:
from sagemaker.estimator import Estimator
from sagemaker import get_execution_role

role = get_execution_role()
estimator = Estimator(role=role,
                      instance_count=1,
                      instance_type='local',
                      image_uri='rl-portfolio-optimization:latest',
                      hyperparameters={'timesteps': 1000})

estimator.fit()

Creating tmpl4u04c4__algo-1-dwbrk_1 ... done
Attaching to tmpl4u04c4__algo-1-dwbrk_1
algo-1-dwbrk_1  | 2020-07-04 15:34:24,066 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
algo-1-dwbrk_1  | 2020-07-04 15:34:24,070 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
algo-1-dwbrk_1  | 2020-07-04 15:34:24,084 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
algo-1-dwbrk_1  | 2020-07-04 15:34:24,088 sagemaker_pytorch_container.training INFO     Invoking user training script.
algo-1-dwbrk_1  | 2020-07-04 15:34:24,089 sagemaker-containers INFO     Module train does not provide a setup.py.
algo-1-dwbrk_1  | Generating setup.py
algo-1-dwbrk_1  | 2020-07-04 15:34:24,090 sagemaker-containers INFO     Generating setup.cfg
algo-1-dwbrk_1  | 2020-07-04 15:34:24,090 sagemaker-containers INFO     Generating MANIFEST

Failed to delete: /tmp/tmpl4u04c4_/algo-1-dwbrk Please remove it manually.


===== Job Complete =====


## Training on SageMaker
Training a model on SageMaker with the Python SDK works much like training it locally. The only change is to set `instance_type` to one of the [supported EC2 instance types](https://aws.amazon.com/sagemaker/pricing/instance-types/) instead of `local`.
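
To make the "only the instance type changes" point concrete, the sketch below builds the Estimator keyword arguments for either mode from one place. The helper name `estimator_kwargs` and its defaults are hypothetical; the keys follow the version 2 parameter names of the SageMaker Python SDK, and `ml.m4.xlarge` is simply the instance type used later in this notebook.

```python
def estimator_kwargs(role, image_uri, timesteps, local=True,
                     remote_instance_type="ml.m4.xlarge"):
    """Build keyword arguments for sagemaker.estimator.Estimator.

    Local and hosted training differ only in `instance_type` (and, in this
    notebook, in which image reference is passed as `image_uri`).
    """
    return {
        "role": role,
        "image_uri": image_uri,
        "instance_count": 1,
        "instance_type": "local" if local else remote_instance_type,
        "hyperparameters": {"timesteps": timesteps},
    }
```

With the SDK installed, the result would be passed as `Estimator(**estimator_kwargs(role, ecr_image, 200, local=False))`.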

### Locate the ECR image just built and pushed

In [4]:
import boto3

client = boto3.client('sts')
account = client.get_caller_identity()['Account']
region = boto3.Session().region_name
ecr_image = '{}.dkr.ecr.{}.amazonaws.com/rl-portfolio-optimization:latest'.format(account, region)

print(ecr_image)

031118886020.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tennis-cpu:latest


### Submit the training job

In [5]:
from sagemaker.estimator import Estimator
estimator = Estimator(role=role,
                      instance_count=1,
                      instance_type='ml.m4.xlarge',
                      image_uri=ecr_image,
                      hyperparameters={'timesteps': 200})
estimator.fit()

2020-07-04 15:35:55 Starting - Starting the training job...
2020-07-04 15:35:57 Starting - Launching requested ML instances......
2020-07-04 15:37:11 Starting - Preparing the instances for training......
2020-07-04 15:38:19 Downloading - Downloading input data
2020-07-04 15:38:19 Training - Downloading the training image......
2020-07-04 15:39:26 Training - Training image download completed. Training in progress..
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-07-04 15:39:27,526 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
2020-07-04 15:39:27,529 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-07-04 15:39:27,542 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2020-07-04 15:39:27,543 sagemaker_pytorch_container.training INFO     Invoking user training scrip