In [None]:
!pip install -U sagemaker
# Restart the kernel after the install completes so the upgraded SDK is loaded

### Bring your own Container

In this notebook, we will cover how to bring our own container, with either a framework or an algorithm, to train a model on SageMaker.

We will use fastai in this case and build a container with our custom training code baked into the image. The alternative is script mode, which is easily done by changing the entry point.

The outline of this notebook is:

1. Build a Docker image for FastAI that includes the provided serving and training code.

2. Log in to ECR, then tag and push the Docker image to ECR.

3. Use the FastAI container image in SageMaker to train our model.

4. Deploy the model to an endpoint using the container image.

5. Test inference on an image in a couple of possible ways.

#### Container Image
Let's start by building a container image locally and then pushing it to ECR (Elastic Container Registry).

In [12]:
%cd ~/SageMaker/pssummitwkshp/byoc/docker

/home/ec2-user/SageMaker/pssummitwkshp/byoc/docker


In [13]:
!docker build -t fastai .

Sending build context to Docker daemon  55.81kB
Step 1/8 : FROM fastdotai/fastai:latest
 ---> 539369040b97
Step 2/8 : LABEL maintainer="Raj Kadiyala"
 ---> Using cache
 ---> e9f28a6f8590
Step 3/8 : WORKDIR /
 ---> Using cache
 ---> e1d448f4f378
Step 4/8 : RUN pip3 install --no-cache --upgrade requests
 ---> Using cache
 ---> e87b43ada07e
Step 5/8 : ENV PYTHONDONTWRITEBYTECODE=1     PYTHONUNBUFFERED=1     LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/lib"     PYTHONIOENCODING=UTF-8     LANG=C.UTF-8     LC_ALL=C.UTF-8
 ---> Using cache
 ---> 80a5af715cad
Step 6/8 : RUN pip3 install --no-cache --upgrade     sagemaker-training
 ---> Using cache
 ---> 24e09afa91e9
Step 7/8 : COPY code/* /opt/ml/code/
 ---> 43a6eb223e55
Step 8/8 : ENV SAGEMAKER_PROGRAM train.py
 ---> Running in d055b22e40d9
Removing intermediate container d055b22e40d9
 ---> 05529306daa0
Successfully built 05529306daa0
Successfully tagged fastai:latest


In [14]:
!docker images

REPOSITORY                                                                                                TAG                 IMAGE ID            CREATED                  SIZE
fastai                                                                                                    latest              05529306daa0        Less than a second ago   9.16GB
826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai   latest              1f508fde1f60        2 hours ago              9.16GB
826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai   <none>              724a3102419f        3 hours ago              9.16GB
826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai   <none>              d325e6fdf792        8 hours ago              9.16GB
826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai   <none>

## Set the ECR details and tags
Let's set a few parameters here, such as the ECR namespace, the repository name, and the tag name.

In [15]:
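from sagemaker import get_execution_role
import boto3

# The ECR repository name is the namespace followed by the prefix
ecr_namespace = "sagemaker-training-containers/"
prefix = "script-mode-container-fastai"
ecr_repository_name = ecr_namespace + prefix

# The account ID is the fifth colon-separated field of the role ARN
role = get_execution_role()
account_id = role.split(":")[4]
region = boto3.Session().region_name

# Full ECR tag: <account>.dkr.ecr.<region>.amazonaws.com/<repository>:latest
tag_name = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{ecr_repository_name}:latest"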

In [16]:
tag_name

'826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai:latest'
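
The `role.split(":")[4]` expression works because an IAM ARN has the form `arn:partition:service:region:account-id:resource`, so the fifth colon-separated field is the account ID. As a minimal sketch, the same construction can be wrapped in a function that also checks the ARN shape. The helper name `ecr_image_uri` and the sample role ARN are illustrative assumptions and not part of the workshop code.

```python
def ecr_image_uri(role_arn: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build an ECR image URI from an IAM role ARN (hypothetical helper)."""
    parts = role_arn.split(":")
    # ARN layout: arn:partition:service:region:account-id:resource
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {role_arn!r}")
    account_id = parts[4]
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Made-up role ARN, used only to illustrate the format.
example_role = "arn:aws:iam::123456789012:role/service-role/ExampleSageMakerRole"
example_uri = ecr_image_uri(
    example_role,
    "us-east-1",
    "sagemaker-training-containers/script-mode-container-fastai",
)
print(example_uri)
```

Validating the ARN up front gives a clear error instead of a malformed tag if the role string is not what was expected.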

Now we tag the local image with the tag name we generated above.

In [17]:
!docker tag fastai $tag_name

### ECR Repository and push steps

All of these steps can be scripted, but they are laid out one at a time here so that each step is transparent and easy to follow.

First we obtain an authentication token for ECR and log Docker in to the registry. This allows us to perform ECR operations such as pushing images.

In [18]:
!$(aws ecr get-login --no-include-email)

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
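
Note that `aws ecr get-login` exists only in version 1 of the AWS CLI; version 2 replaces it with `aws ecr get-login-password`. Both wrap the ECR `GetAuthorizationToken` API, which returns a base64-encoded `username:password` string. The sketch below shows how such a token decodes. The helper `decode_ecr_token` is a hypothetical name, and the sample token is constructed locally for illustration rather than issued by AWS.

```python
import base64
from typing import Tuple


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split a base64-encoded ECR authorization token into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    # Split on the first colon only, in case the password itself contains one.
    username, password = decoded.split(":", 1)
    return username, password


# Locally constructed stand-in for a real token, for illustration only.
sample_token = base64.b64encode(b"AWS:example-password").decode("ascii")
username, password = decode_ecr_token(sample_token)
print(username)

# With real credentials, the token would come from boto3 (not run here):
#   auth = boto3.client("ecr").get_authorization_token()["authorizationData"][0]
#   username, password = decode_ecr_token(auth["authorizationToken"])
#   registry = auth["proxyEndpoint"]
```

The decoded username and password are what `docker login` receives for the registry endpoint.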


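Here we create an ECR repository. If the repository already exists, the command reports a `RepositoryAlreadyExistsException`, which is safe to ignore.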

In [19]:
!aws ecr create-repository --repository-name $ecr_repository_name


An error occurred (RepositoryAlreadyExistsException) when calling the CreateRepository operation: The repository with name 'sagemaker-training-containers/script-mode-container-fastai' already exists in the registry with id '826659556017'
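
To make this step safe to re-run from Python, one option is to catch the exception and look up the existing repository instead. The following is a sketch under that assumption: `ensure_repository` is a hypothetical helper, and it expects a boto3 ECR client to be passed in.

```python
def ensure_repository(ecr_client, repository_name: str) -> str:
    """Return the repository URI, creating the repository only if it is missing."""
    try:
        response = ecr_client.create_repository(repositoryName=repository_name)
        return response["repository"]["repositoryUri"]
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # The repository is already there, so look it up instead of failing.
        response = ecr_client.describe_repositories(repositoryNames=[repository_name])
        return response["repositories"][0]["repositoryUri"]


# Usage with real credentials (not run here):
#   import boto3
#   uri = ensure_repository(boto3.client("ecr"), ecr_repository_name)
```

Passing the client in as an argument keeps the function easy to exercise with a stub and leaves region and credential handling to the caller.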


Now that the ECR repository exists, we can push our Docker image to it using the tag name we assigned.

In [20]:
!docker push $tag_name

The push refers to repository [826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai]

latest: digest: sha256:a4a8e34b3a7b245d6b549b9e2686e9f6ae3f10c5759fcf3ed1554be4786abd9a size: 4711


The URI of the uploaded Docker image in ECR is built from the account ID, region, and repository name, and it matches the tag we pushed:

In [21]:
container_image_uri = "{0}.dkr.ecr.{1}.amazonaws.com/{2}:latest".format(
    account_id, region, ecr_repository_name
)
print(container_image_uri)

826659556017.dkr.ecr.us-east-1.amazonaws.com/sagemaker-training-containers/script-mode-container-fastai:latest


#### Call your custom container to train the model

In the cell below, replace **"your-unique-bucket-name"** with the name of the bucket you created in the data-prep notebook.

In [None]:
%%time
import sagemaker
import json

#bucket = "your-unique-bucket-name"
bucket = "myagm-dcsum"

# JSON encode hyperparameters
def json_encode_hyperparameters(hyperparameters):
    return {str(k): json.dumps(v) for (k, v) in hyperparameters.items()}


hyperparameters = json_encode_hyperparameters({"lr":1e-03})

est = sagemaker.estimator.Estimator(
    container_image_uri,
    role,
    instance_count=1,
    # instance_type="local",  # alternative: local mode runs the container on this notebook instance
    instance_type='ml.m5.12xlarge',
    base_job_name=prefix,
    hyperparameters=hyperparameters,
)

train_config = sagemaker.inputs.TrainingInput(f's3://{bucket}/train')

est.fit({"train": train_config})
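
The `json_encode_hyperparameters` helper converts every value to a JSON string because SageMaker passes hyperparameters to the container as strings. The SageMaker training toolkit installed in the image then hands them to the script named by `SAGEMAKER_PROGRAM` as command-line arguments, and exposes input channels through environment variables such as `SM_CHANNEL_TRAIN`. The workshop's `train.py` is not shown here, so the following is only a sketch of how such a script might read those values. The function `parse_training_args` and its argument names are assumptions.

```python
import argparse
import json
import os


def json_encode_hyperparameters(hyperparameters):
    """Same helper as in the training cell: JSON-encode every value."""
    return {str(k): json.dumps(v) for (k, v) in hyperparameters.items()}


def parse_training_args(argv=None):
    """Hypothetical argument parsing for a training script such as train.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=1e-3)
    # Fall back to the standard container paths when the variables are unset.
    parser.add_argument(
        "--train", default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train")
    )
    parser.add_argument(
        "--model-dir", default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    )
    # Ignore any arguments this sketch does not know about.
    args, _unknown = parser.parse_known_args(argv)
    return args


encoded = json_encode_hyperparameters({"lr": 1e-03})
print(encoded)

# Simulate the command line the script would receive for this job.
args = parse_training_args(["--lr", encoded["lr"]])
print(args.lr, args.train, args.model_dir)
```

Because the `"train"` key passed to `est.fit` names the channel, the data from `s3://{bucket}/train` is made available inside the container under the directory that `SM_CHANNEL_TRAIN` points to.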