# Angry Ferret Detector

## Our Imports and Declarations

In [None]:
import os
import io
import subprocess

import PIL.Image  # import the submodule explicitly; PIL.Image.open is used below

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.pytorch import PyTorch, PyTorchModel
from sagemaker.predictor import RealTimePredictor, json_deserializer

from fastai.vision import *

In [None]:
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

bucket = 'angryferrets'
prefix = 'OwnModel/data'

inputs = 's3://'+ bucket + '/' + prefix
print('input spec (in this case, just an S3 path): {}'.format(inputs))
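
String concatenation like the above produces a malformed path if the bucket or prefix carries a stray slash. A minimal sketch of a more forgiving join is shown here; the helper name `s3_uri` is hypothetical and not part of the SageMaker SDK.

```python
def s3_uri(bucket, prefix=""):
    """Join a bucket name and key prefix into an s3:// URI without doubled slashes."""
    bucket = bucket.strip("/")
    prefix = prefix.strip("/")
    if prefix:
        return "s3://{}/{}".format(bucket, prefix)
    return "s3://{}".format(bucket)


# Same bucket and prefix as in the cell above.
inputs = s3_uri("angryferrets", "OwnModel/data")
print("input spec (in this case, just an S3 path): {}".format(inputs))
```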

## Our Training Code

If you are preparing a model for publication in AWS Marketplace, keep a few things in mind:
- At present, you can only submit models to AWS Marketplace from the ```us-east-2``` region.
- When you submit your model, it must pass an automated validation process. Validation requires your model to perform batch inference, so be sure your inference code can handle it.
- During validation, your model runs in a container without internet access, so it has no ability to download models or code from the internet. This matters because the default implementations of some frameworks try to download pretrained models from model zoos. You need to either stage those files inside the container itself or stop the model from trying to download pretrained elements.

On this last point: you'll notice that the training code below uses a pretrained model. That is fine for training, because we're training with regular SageMaker. We then take the trained model and create a model package from that training job. The model we publish to AWS Marketplace only performs inference, so it doesn't need to download anything, which allows it to be validated in a container without internet access.
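
One way to catch an accidental download before submitting is to run your inference code once with outbound connections blocked. The sketch below is an assumption-laden test aid, not part of AWS validation: the `no_network` context manager is a hypothetical helper that patches `socket.socket.connect` so that any attempt to open a connection raises `OSError`. It does not cover every path (for example, `connect_ex` is left alone), so treat it as a quick smoke test.

```python
import socket
from contextlib import contextmanager


@contextmanager
def no_network():
    """Make outbound socket connections fail while the block is running."""
    original_connect = socket.socket.connect

    def blocked_connect(self, address):
        # Raised instead of connecting, so a hidden download fails loudly.
        raise OSError("network access is disabled for this test")

    socket.socket.connect = blocked_connect
    try:
        yield
    finally:
        # Always restore the real connect, even if the block raised.
        socket.socket.connect = original_connect


# Usage sketch: load the model and run one prediction inside the block.
# with no_network():
#     run_inference_once()   # hypothetical function wrapping your inference code
```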

In [None]:
!pygmentize container/src/ferrets.py

## Create our Docker Container for Training and Inference
For this example, we are creating a customized container based on one provided by AWS for PyTorch.  We pull a specific version from the public repo and then add in the additional parts we need.  In this case, the version of FastAI is a little old so we want to update with a more recent version.  (Notice ```RUN pip install --no-cache-dir fastai==1.0.54 --upgrade```)

In [None]:
!cat container/Dockerfile 

In [None]:
!cat container/build_docker.sh

Use a bash script to build the Docker image and push it to our Elastic Container Registry (ECR) repository. I recommend you execute this script in a Terminal session so you can see what's going on.

```
cd container
chmod +x build_docker.sh
./build_docker.sh <<containername>>
```

## Train and Test Locally
Now we have our customized Docker image both locally and in our remote repo. Let's test it by training and inferring locally. This saves a lot of time as you iterate and troubleshoot, compared with launching instances on SageMaker for every attempt.

I highly recommend springing for a P2 or P3 instance for training. Otherwise, it will take quite a while.

The Docker images we've built target GPU instances, but they will also run on CPU instances. In fact, we're going to use a CPU instance for inference in our eventual deployment.

In [None]:
# Train locally
instance_type = 'local_gpu'
data_location = 'file://./data/'

print(data_location)
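
If you move between GPU and CPU notebook instances, the local instance type can be chosen at run time instead of hardcoded. The sketch below is a heuristic under a stated assumption: it treats the presence of `nvidia-smi` on the `PATH` as a sign that a GPU is available, which does not guarantee that Docker is configured to use it. The function name is hypothetical; `local` and `local_gpu` are the two local mode instance types.

```python
import shutil


def pick_local_instance_type():
    """Return 'local_gpu' when nvidia-smi is on PATH, otherwise 'local'."""
    return "local_gpu" if shutil.which("nvidia-smi") else "local"


instance_type = pick_local_instance_type()
print(instance_type)
```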

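Before training locally, we need to update the Docker daemon configuration. The command below copies our ```daemon.json``` into ```/etc/docker/``` and sends ```dockerd``` a SIGHUP so that it reloads its configuration.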

In [None]:
! sudo cp ./container/daemon.json /etc/docker/daemon.json && sudo pkill -SIGHUP dockerd
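
Since the command above overwrites the system-wide Docker configuration, it can be worth confirming that the file parses before copying it. This is a minimal sketch; `check_daemon_config` is a hypothetical helper, and it makes no assumption about which keys the file contains.

```python
import json


def check_daemon_config(path):
    """Parse a Docker daemon config file and return its top-level keys, sorted."""
    with open(path) as f:
        try:
            config = json.load(f)
        except json.JSONDecodeError as err:
            raise ValueError("{} is not valid JSON: {}".format(path, err))
    if not isinstance(config, dict):
        raise ValueError("{} must contain a JSON object".format(path))
    return sorted(config)


# Usage sketch, run before the copy:
# print(check_daemon_config("./container/daemon.json"))
```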

Now we can start the training process using the ```Estimator``` class.

In [None]:
estimator = Estimator(role=role,
                      train_instance_count=1,
                      train_instance_type=instance_type,
                      image_name='angryferrets:latest')

estimator.fit(data_location)

After a while, you'll see that our training process has finished.  We can now create a local endpoint to test our model.

In [None]:
predictor = estimator.deploy(1, instance_type)

Let's send some test images to the endpoint.

In [None]:
filename = './images/test/angry_test1.jpg'
#filename = './images/test/nice_test2.jpg'
#filename = './images/test/angry_test2.jpg'
#filename = './images/test/nice_test1.jpg'

img = PIL.Image.open(filename, mode='r')
img

In [None]:
predictor.content_type = 'image/jpeg'
# Read the image inside a with-block so the file handle is closed afterwards
with open(filename, 'rb') as f:
    response = predictor.predict(f.read())
response
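
Rather than editing the commented-out filenames by hand, all of the test images can be sent in one pass. The sketch below assumes the predictor accepts raw image bytes and that its `content_type` has already been set to `image/jpeg` as above; `predict_files` is a hypothetical helper, not part of the SageMaker SDK.

```python
def predict_files(predictor, filenames):
    """Send each file's raw bytes to the endpoint and collect responses by filename."""
    results = {}
    for name in filenames:
        with open(name, "rb") as f:
            results[name] = predictor.predict(f.read())
    return results


# Usage sketch with the test images from this notebook:
# test_images = ['./images/test/angry_test1.jpg', './images/test/angry_test2.jpg',
#                './images/test/nice_test1.jpg', './images/test/nice_test2.jpg']
# for name, response in predict_files(predictor, test_images).items():
#     print(name, response)
```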

When we're satisfied, we can delete our local endpoint.

In [None]:
predictor.delete_endpoint()

## Train and Host on SageMaker
This process is much the same as training and deploying locally, but notice that we're now using our S3 bucket for ```data_location``` and that ```image_name``` is the path to the ECR image we pushed.
It will take quite a bit longer than local training did because SageMaker has to provision new instances and containers.

In [None]:
data_location = inputs
instance_type = 'ml.p3.2xlarge'

# ECR path from our bash script above that pushed our image to our repo
ecr_path = '<<insert the ECR path to your container>>'

print(data_location)
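
For reference, an image path in a private ECR repository follows a fixed pattern in the standard commercial regions: account ID, region, repository name, and tag. The sketch below only formats that string. The helper name is hypothetical, and the account ID is a placeholder, as are the region and repository name, which assume you pushed from ```us-east-2``` with the container name used earlier.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Format the registry path of an image in a private ECR repository."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        account_id, region, repository, tag)


# Placeholder values; substitute your own account ID, region, and repository.
print(ecr_image_uri("123456789012", "us-east-2", "angryferrets"))
```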

In [None]:
estimator = Estimator(role=role,
                      train_instance_count=1,
                      train_instance_type=instance_type,
                      image_name=ecr_path)

estimator.fit(data_location)

Now we can deploy it on SageMaker. Notice that I'm using a CPU instance here, which is perfectly fine. Training usually benefits greatly from a GPU instance because it is so computationally intensive. For inference, you can usually get by with a smaller instance class and save some money.

In [None]:
instance_type = 'ml.c5.large'

predictor = estimator.deploy(initial_instance_count=1,
                             instance_type=instance_type)

Once we see ```--!```, we know the endpoint is deployed. If you get ```---*``` instead, something has gone wrong, and you'll need to check the logs in CloudWatch to see what happened.
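
The endpoint's state can also be checked programmatically instead of reading the progress characters. The sketch below wraps the SageMaker `describe_endpoint` API call, whose response includes an `EndpointStatus` field and, on failure, a `FailureReason`. The helper name is hypothetical, and the client is passed in so the function can be exercised without AWS credentials.

```python
def endpoint_status(sm_client, endpoint_name):
    """Return (status, failure_reason); failure_reason is None unless one is reported."""
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"], description.get("FailureReason")


# Usage sketch (requires boto3 and AWS credentials; assumes the predictor
# exposes the endpoint name as predictor.endpoint):
# import boto3
# status, reason = endpoint_status(boto3.client("sagemaker"), predictor.endpoint)
# print(status, reason)
```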

In [None]:
#filename = './images/test/angry_test1.jpg'
#filename = './images/test/nice_test2.jpg'
#filename = './images/test/angry_test2.jpg'
filename = './images/test/nice_test1.jpg'

img = PIL.Image.open(filename, mode='r')
img


In [None]:
predictor.content_type = 'image/jpeg'
# Read the image inside a with-block so the file handle is closed afterwards
with open(filename, 'rb') as f:
    response = predictor.predict(f.read())
response

In [None]:
predictor.delete_endpoint()
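
A hosted endpoint keeps running, and keeps billing, until it is deleted, so it is easy to leave one behind if a cell fails partway through. One possible pattern is to tie the endpoint's lifetime to a `with` block. This is a sketch under the assumption that `estimator.deploy` and `predictor.delete_endpoint` behave as used in this notebook; `temporary_endpoint` is a hypothetical helper.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(estimator, instance_type, initial_instance_count=1):
    """Deploy an endpoint, yield its predictor, and delete the endpoint on exit."""
    predictor = estimator.deploy(initial_instance_count=initial_instance_count,
                                 instance_type=instance_type)
    try:
        yield predictor
    finally:
        # Runs even if the body raises, so the endpoint is not left running.
        predictor.delete_endpoint()


# Usage sketch:
# with temporary_endpoint(estimator, 'ml.c5.large') as predictor:
#     predictor.content_type = 'image/jpeg'
#     with open(filename, 'rb') as f:
#         print(predictor.predict(f.read()))
```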