### Bring your own Container

In this notebook, we cover how to bring our own container, packaging either a framework or an algorithm, to train a model on SageMaker.

We will use scikit-learn in this case and build a container image with the custom training code integrated into it. The other option is script mode, which mainly requires changing the entrypoint.
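Whichever option is used, the training code inside the container finds its inputs and writes its outputs at fixed locations under `/opt/ml`. The sketch below only illustrates that layout; the helper names `resolve_training_paths` and `list_training_files` are hypothetical and are not part of this notebook's `train.py`. The channel name `train` is assumed to match the key later passed to `fit`.

```python
from pathlib import Path


def resolve_training_paths(base_dir="/opt/ml", channel="train"):
    """Return the locations a SageMaker training container reads and writes."""
    base = Path(base_dir)
    return {
        # Hyperparameters passed to the Estimator, as a JSON object of strings
        "hyperparameters": base / "input" / "config" / "hyperparameters.json",
        # Data for the named input channel
        "train_data": base / "input" / "data" / channel,
        # Anything saved here is packaged into model.tar.gz after training
        "model_dir": base / "model",
        # Writing an error message to this file marks the job as failed
        "failure_file": base / "output" / "failure",
    }


def list_training_files(paths):
    """List the files present in the channel directory (empty if it is missing)."""
    data_dir = paths["train_data"]
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.iterdir() if p.is_file())


paths = resolve_training_paths()
print(paths["train_data"].as_posix())
print(list_training_files(paths))
```

Passing `base_dir` explicitly makes it possible to exercise the same code against a local directory before building the image.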


#### Container Image
Let's start by building a container image locally and then pushing it to ECR (Elastic Container Registry).

In [None]:
%cd docker

In [None]:
!docker build -t am-scikit .

In [None]:
!docker images

#### Set the ECR details and tags
Let's set a few parameters here, such as the ECR namespace and the tag name.
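As a rough sketch of how these values fit together, the account ID can be read from the execution role ARN and combined with the region and repository name into the image URI. The function names and the example ARN are made up for illustration, and the `amazonaws.com` suffix is an assumption that does not hold in every AWS partition.

```python
def account_id_from_role_arn(role_arn):
    """Extract the account ID from an IAM role ARN."""
    parts = role_arn.split(":")
    # ARN layout: arn:partition:service:region:account-id:resource
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {role_arn!r}")
    return parts[4]


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Compose the full ECR image URI used for tagging and pushing."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Made-up ARN, used only to show the shape of the result
example_arn = "arn:aws:iam::123456789012:role/service-role/ExampleRole"
uri = ecr_image_uri(
    account_id_from_role_arn(example_arn),
    "us-east-1",
    "sagemaker-training-containers/scikit-training",
)
print(uri)
```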

In [None]:
from sagemaker import get_execution_role
import boto3
ecr_namespace = "sagemaker-training-containers/"
prefix = "scikit-training"

ecr_repository_name = ecr_namespace + prefix

role = get_execution_role()
account_id = role.split(":")[4]
region = boto3.Session().region_name
tag_name = account_id+'.dkr.ecr.'+region+'.amazonaws.com/'+ecr_repository_name+':latest'

In [None]:
tag_name

In [None]:
!docker tag am-scikit $tag_name

### ECR Repository and push steps

All of these steps can be scripted, but they are laid out one at a time here so that each one is visible and easy to follow.
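For reference, one way to script the sequence is to generate the shell commands from the same parameters. This is only a sketch: `ecr_push_commands` is a hypothetical helper, the account ID in the usage example is made up, and the login step assumes the `aws ecr get-login-password` form of the CLI rather than `get-login`.

```python
import shlex


def ecr_push_commands(account_id, region, repository, local_image, tag="latest"):
    """Return the shell commands, in order, that tag, log in, create the repo, and push."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    remote = f"{registry}/{repository}:{tag}"
    return [
        f"docker tag {shlex.quote(local_image)} {shlex.quote(remote)}",
        f"aws ecr get-login-password --region {shlex.quote(region)}"
        f" | docker login --username AWS --password-stdin {shlex.quote(registry)}",
        f"aws ecr create-repository --repository-name {shlex.quote(repository)}",
        f"docker push {shlex.quote(remote)}",
    ]


for command in ecr_push_commands(
    "123456789012",  # made-up account ID
    "us-east-1",
    "sagemaker-training-containers/scikit-training",
    "am-scikit",
):
    print(command)
```

The commands are only printed here; running them (for example with `subprocess.run(command, shell=True, check=True)`) requires Docker and AWS credentials, and `create-repository` returns an error if the repository already exists.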

In [None]:
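# `aws ecr get-login` was removed in AWS CLI v2; `get-login-password` is its replacement
!aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com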

In [None]:
!aws ecr create-repository --repository-name $ecr_repository_name

In [None]:
!docker push $tag_name

In [None]:
container_image_uri = "{0}.dkr.ecr.{1}.amazonaws.com/{2}:latest".format(
    account_id, region, ecr_repository_name
)
print(container_image_uri)

#### Call your custom container to train the model
Our custom Docker image is now built and uploaded to ECR (Elastic Container Registry).  
Our code can now reference the custom Docker container to run our 'train.py' script.  
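The training cell that follows JSON-encodes each hyperparameter value before handing it to the Estimator. SageMaker delivers hyperparameters to the container as strings, so the training script has to decode them again. The sketch below shows the round trip; `json_decode_hyperparameters` is a hypothetical counterpart and is not taken from this notebook's `train.py`.

```python
import json


def json_encode_hyperparameters(hyperparameters):
    """Encode every value as a JSON string, as done before calling the Estimator."""
    return {str(k): json.dumps(v) for k, v in hyperparameters.items()}


def json_decode_hyperparameters(encoded):
    """Reverse the encoding inside the container to recover the original types."""
    return {k: json.loads(v) for k, v in encoded.items()}


encoded = json_encode_hyperparameters({"min-samples-leaf": 2, "n-estimators": 500})
decoded = json_decode_hyperparameters(encoded)
print(encoded)  # {'min-samples-leaf': '2', 'n-estimators': '500'}
print(decoded)  # {'min-samples-leaf': 2, 'n-estimators': 500}
```

Encoding with JSON rather than `str` keeps integers, floats, booleans, and lists distinguishable once they are decoded.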

In [None]:
import sagemaker
import json

# JSON encode hyperparameters
def json_encode_hyperparameters(hyperparameters):
    return {str(k): json.dumps(v) for (k, v) in hyperparameters.items()}

hyperparameters = json_encode_hyperparameters({'min-samples-leaf':2, 'n-estimators':500})

# now we will call the generic SageMaker Estimator
est = sagemaker.estimator.Estimator(
    container_image_uri,
    role,
    instance_count=1,
    # instance_type="local",  # alternative: local mode, which runs the container on this machine
    instance_type='ml.m5.4xlarge',
    base_job_name=prefix,
    hyperparameters=hyperparameters,
)

# S3 URI of the preprocessed training data that we created in the BYOM lab
#preprocessed_training_data = 's3://sagemaker-us-east-1-662559257807/sagemaker-scikit-learn-2021-08-20-22-37-42-314/output/train/'
preprocessed_training_data = 'your-S3-URI-goes-here'
# TrainingInput is defined in sagemaker.inputs
train_config = sagemaker.inputs.TrainingInput(preprocessed_training_data)

In [None]:
%%time
est.fit({"train": train_config})

In [None]:
training_job_description = est.jobs[-1].describe()
model_data_s3_uri = "{}{}/{}".format(
    training_job_description["OutputDataConfig"]["S3OutputPath"],
    training_job_description["TrainingJobName"],
    "output/model.tar.gz",
)
print(training_job_description["TrainingJobName"])
print(model_data_s3_uri)
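The format string above only produces a valid URI when `S3OutputPath` ends with a slash. A slightly more defensive sketch, using a hypothetical helper name and made-up bucket and job names, normalizes the path first:

```python
def model_artifact_uri(output_path, job_name):
    """Join the output path and job name into the model.tar.gz URI.

    Works whether or not output_path ends with a trailing slash.
    """
    return "{}/{}/output/model.tar.gz".format(output_path.rstrip("/"), job_name)


# Made-up bucket and job name, with and without a trailing slash
print(model_artifact_uri("s3://example-bucket/", "scikit-training-0001"))
print(model_artifact_uri("s3://example-bucket", "scikit-training-0001"))
```

After a successful `fit`, the Estimator's `model_data` attribute should report the same location without any string handling.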

#### Evaluate the trained model
Now that we have used our custom Docker container to train a Scikit-learn 0.24 model, let's see how well it performs.  

In [None]:
# `est` is the Estimator fitted above
training_job_description = est.jobs[-1].describe()

model_data_s3_uri = "{}{}/{}".format(
    training_job_description["OutputDataConfig"]["S3OutputPath"],
    training_job_description["TrainingJobName"],
    "output/model.tar.gz",
)
print(training_job_description["TrainingJobName"])
print(model_data_s3_uri)

In [None]:
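from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.sklearn.processing import SKLearnProcessor

# S3 URI of the preprocessed test data; it is not defined earlier in this notebook,
# so replace this placeholder before running
preprocessed_test_data = 'your-S3-URI-goes-here'

sklearn_processor = SKLearnProcessor(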
    framework_version='0.23-1',
    role=role,
    instance_type='ml.m5.xlarge',
    instance_count=1
)

sklearn_processor.run(
    code="code/evaluation.py",
    inputs=[
        ProcessingInput(source=model_data_s3_uri, destination="/opt/ml/processing/model"),
        # ProcessingInput(source=preprocessed_training_data, destination="/opt/ml/processing/train"),
        ProcessingInput(source=preprocessed_test_data, destination="/opt/ml/processing/test"),
    ],
    outputs=[ProcessingOutput(output_name="evaluation", source="/opt/ml/processing/evaluation")],
)
evaluation_job_description = sklearn_processor.jobs[-1].describe()
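The contents of `code/evaluation.py` are not shown in this notebook. As a minimal sketch of the file handling such a script would need, assuming the model input arrives as `model.tar.gz` in the model directory and that results are written as JSON to the evaluation directory, the helpers below use only the standard library. The function names and the report filename `evaluation.json` are assumptions, and loading the model and computing metrics are left out.

```python
import json
import tarfile
from pathlib import Path


def extract_model(model_dir="/opt/ml/processing/model"):
    """Unpack model.tar.gz inside model_dir and return the archived file names."""
    model_dir = Path(model_dir)
    with tarfile.open(model_dir / "model.tar.gz", "r:gz") as tar:
        names = tar.getnames()
        # The archive comes from our own training job, so it is treated as trusted
        tar.extractall(path=model_dir)
    return sorted(names)


def write_report(metrics, output_dir="/opt/ml/processing/evaluation",
                 filename="evaluation.json"):
    """Write the metrics dictionary as JSON into the processing output directory."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    report_path = output_dir / filename
    report_path.write_text(json.dumps(metrics, indent=2))
    return report_path
```

Files written under `/opt/ml/processing/evaluation` are uploaded to S3 when the job finishes, because that path is the `source` of the `ProcessingOutput` declared above.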