Load the required libraries below.

In [1]:
# data 
import pandas as pd 
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

%matplotlib inline

## SageMaker Resources

The cell below stores the SageMaker session and execution role (both needed to create estimators and models) and creates a default S3 bucket. Once the bucket exists, locally stored data can be uploaded to S3.

In [3]:
# sagemaker
import boto3
import sagemaker
from sagemaker import get_execution_role

In [4]:
# SageMaker session and role
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

# default S3 bucket
bucket = sagemaker_session.default_bucket()
prefix='cnn-wendy'

Here we download the image dataset and upload it to S3.


In [9]:
!wget -nc https://da-youtube-ml.s3.eu-central-1.amazonaws.com/wendy-cnn/frames/wendy_cnn_frames_data.zip
!unzip -qq -n wendy_cnn_frames_data.zip -d wendy_cnn_frames_data 
!rm wendy_cnn_frames_data.zip


# upload to S3
input_data = sagemaker_session.upload_data(path='wendy_cnn_frames_data', bucket=bucket, key_prefix=prefix)
print(input_data)

--2020-10-18 21:09:16--  https://da-youtube-ml.s3.eu-central-1.amazonaws.com/wendy-cnn/frames/wendy_cnn_frames_data.zip
Resolving da-youtube-ml.s3.eu-central-1.amazonaws.com (da-youtube-ml.s3.eu-central-1.amazonaws.com)... 52.219.75.180
Connecting to da-youtube-ml.s3.eu-central-1.amazonaws.com (da-youtube-ml.s3.eu-central-1.amazonaws.com)|52.219.75.180|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3406848357 (3.2G) [application/zip]
Saving to: ‘wendy_cnn_frames_data.zip’


2020-10-18 21:09:50 (95.0 MB/s) - ‘wendy_cnn_frames_data.zip’ saved [3406848357/3406848357]

s3://sagemaker-eu-central-1-283211002347/cnn-wendy
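
To sanity-check the local copy of the dataset that was uploaded, it can help to count the files in each top-level folder. The following is a minimal sketch, assuming the archive unpacks into one subdirectory per class; that layout is not shown in this notebook, and `count_files_per_class` is a hypothetical helper, not part of any library.

```python
import os


def count_files_per_class(root):
    """Return {subdirectory name: number of files beneath it} for each
    top-level subdirectory of root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files at the top level
        # count every file below this subdirectory, at any depth
        counts[entry] = sum(len(files) for _, _, files in os.walk(class_dir))
    return counts


if __name__ == "__main__":
    data_dir = "wendy_cnn_frames_data"  # directory created by the unzip step
    if os.path.isdir(data_dir):
        for name, n in count_files_per_class(data_dir).items():
            print("{}: {} files".format(name, n))
```

A strongly unbalanced count, or an unexpected folder name, is easier to catch here than after a training job has started.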


In [5]:
# the location of the input data can be set here directly, if known
input_data = 's3://sagemaker-eu-central-1-283211002347/cnn-wendy'
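
The URI above is simply the bucket name followed by the key prefix. As a small illustration, the sketch below splits such a URI back into its two parts using only the standard library; `split_s3_uri` is a hypothetical helper name, not a SageMaker function.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key/prefix' into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: {!r}".format(uri))
    # netloc holds the bucket; the path holds the key prefix with a leading slash
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, key_prefix = split_s3_uri(
    "s3://sagemaker-eu-central-1-283211002347/cnn-wendy"
)
print(bucket_name)
print(key_prefix)
```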

After uploading the images to S3, we can define and train the estimator.


In [16]:
# import a PyTorch wrapper
from sagemaker.pytorch import PyTorch

# specify an output path
# prefix is specified above
output_path = 's3://{}/{}'.format(bucket, prefix)

# instantiate a pytorch estimator
estimator = PyTorch(entry_point='train.py',
                    source_dir='letsplay_classifier',
                    role=role,
                    framework_version='1.6',
                    train_instance_count=1,
                    train_instance_type='ml.p2.xlarge',
                    train_volume_size = 10,
                    output_path=output_path,
                    sagemaker_session=sagemaker_session,
                    hyperparameters={
                        'img-width': 64,
                        'img-height': 36,
                        'batch-size': 32,
                        'layer-cfg': 'A',
                        'epochs': 2
                    })
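
SageMaker passes each entry of `hyperparameters` to the entry-point script as a command-line argument (`--img-width 64` and so on), and exposes the model directory and the `train` channel's data directory through the `SM_MODEL_DIR` and `SM_CHANNEL_TRAIN` environment variables. The actual `train.py` is not shown in this notebook; the sketch below only illustrates how such a script might read these values. The defaults, the `--model-dir` and `--data-dir` argument names, and the local fallback paths are assumptions.

```python
import argparse
import os


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # hyperparameters, named as in the estimator definition
    parser.add_argument("--img-width", type=int, default=64)
    parser.add_argument("--img-height", type=int, default=36)
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--layer-cfg", type=str, default="A")
    parser.add_argument("--epochs", type=int, default=2)
    # locations set by the training container; fallbacks are for local runs only
    parser.add_argument("--model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "model"))
    parser.add_argument("--data-dir", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "data"))
    return parser.parse_args(argv)


# simulate the arguments the container would pass for some of the hyperparameters
args = parse_args(["--img-width", "64", "--img-height", "36", "--epochs", "2"])
print(args.img_width, args.img_height, args.batch_size, args.layer_cfg, args.epochs)
```

Note that `argparse` turns the dashes in option names into underscores, so `--img-width` is read back as `args.img_width`.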

## Train the Estimator

After instantiating the estimator, we train it with a call to `.fit()`. 

In [None]:
%%time 
# train the estimator on S3 training data
estimator.fit({'train': input_data})

'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.
's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


2020-10-19 22:49:26 Starting - Starting the training job..

In [None]:
print(estimator.model_data)
model_data = estimator.model_data
# model can be set here, if known
# model_data = 's3://sagemaker-eu-central-1-283211002347/cnn-wendy/pytorch-training-2020-10-19-14-17-38-926/output/model.tar.gz'
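
The commented-out URI above follows the pattern `<output_path>/<training job name>/output/model.tar.gz`. If the training job name is known, the location can be rebuilt without rerunning training. A small sketch follows; `model_artifact_uri` is a hypothetical helper, not part of the SageMaker SDK.

```python
def model_artifact_uri(output_path, job_name):
    """Build the S3 URI where a training job's model archive is written."""
    return "{}/{}/output/model.tar.gz".format(output_path.rstrip("/"), job_name)


print(model_artifact_uri(
    "s3://sagemaker-eu-central-1-283211002347/cnn-wendy",
    "pytorch-training-2020-10-19-14-17-38-926",
))
```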


We set up a model that can predict the class of an image.

### Deploy the trained model

We deploy our model to create a predictor. We'll use this to make predictions on our data and evaluate the model.

In [6]:
# importing PyTorchModel
from sagemaker.pytorch import PyTorchModel

# Create a model from the trained estimator data
# And point to the prediction script
model = PyTorchModel(model_data=model_data,
                     role = role,
                     framework_version='1.6',
                     entry_point='predict.py',
                     source_dir='letsplay_classifier')

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.


In [7]:
%%time
# deploy and create a predictor
predictor = model.deploy(initial_instance_count=1, instance_type='ml.p2.xlarge')


'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


-------------------!CPU times: user 39.6 s, sys: 6.16 s, total: 45.8 s
Wall time: 10min 16s


In [None]:
# the endpoint where the predictor is located
endpoint_name = predictor.endpoint


Now that the model is deployed, we check how the predictor performs on our full dataset,
ensuring that the predictions make sense.


In [None]:
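import os
from letsplay_classifier.endpoint import evaluate

data_dir = 'wendy_cnn_frames_data'  # local directory unzipped above (assumed to be what evaluate expects)
image_total = sum(len(files) for _, _, files in os.walk(data_dir))  # assumes the directory holds only images
avg_acc, avg_loss, count = evaluate(endpoint_name, data_dir, 0.05)
print("{} processed of {}".format(count, image_total))
print("Avg loss : {:.4f}".format(avg_loss))
print("Avg acc : {:.4f}".format(avg_acc))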
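
The internals of `evaluate` live in `letsplay_classifier/endpoint.py` and are not shown here. Purely as an illustration of how an average accuracy and an average loss can be derived from per-image class probabilities, here is a standard-library sketch. The function name `average_metrics`, the input format (one list of probabilities per image, integer labels), and the choice of cross-entropy as the loss are all assumptions, not a description of the real implementation.

```python
import math


def average_metrics(probabilities, labels):
    """Return (avg_acc, avg_loss, count) for per-sample class probabilities
    and integer labels. Loss is the negative log-likelihood of the true class."""
    if len(probabilities) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty inputs")
    correct = 0
    total_loss = 0.0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax
        correct += int(predicted == label)
        total_loss += -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)
    count = len(labels)
    return correct / count, total_loss / count, count


acc, loss, n = average_metrics(
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],  # predicted probabilities per image
    [0, 1, 1],                             # true class indices
)
print("{} processed".format(n))
print("Avg loss : {:.4f}".format(loss))
print("Avg acc : {:.4f}".format(acc))
```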

## Delete the Endpoint

Finally, I've added a convenience function to delete prediction endpoints after we're done with them. If you're done evaluating the model, you should delete your model endpoint!

In [24]:
from botocore.exceptions import ClientError

# Accepts a predictor as input
# and deletes its endpoint by name
def delete_endpoint(predictor):
    try:
        boto3.client('sagemaker').delete_endpoint(EndpointName=predictor.endpoint)
        print('Deleted {}'.format(predictor.endpoint))
    except ClientError as e:
        # typically raised when the endpoint no longer exists
        print('Could not delete {} (it may already be deleted): {}'.format(predictor.endpoint, e))

In [25]:
# delete the predictor endpoint 
delete_endpoint(predictor)

Deleted pytorch-inference-2020-10-19-12-53-25-499
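
Deleting an endpoint does not remove the endpoint configuration it was created from. The sketch below looks up the configuration name with `describe_endpoint` and then removes both resources. The client is passed in as an argument, so the function itself has no AWS dependency; `delete_endpoint_and_config` is a hypothetical name, and the function assumes the endpoint still exists when it is called.

```python
def delete_endpoint_and_config(client, endpoint_name):
    """Delete an endpoint and the endpoint configuration it was created from.

    `client` is expected to be a boto3 SageMaker client,
    e.g. boto3.client('sagemaker').
    """
    # look up the configuration name before the endpoint is deleted
    description = client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name
```

In this notebook it could be called as `delete_endpoint_and_config(boto3.client('sagemaker'), predictor.endpoint)` in place of `delete_endpoint(predictor)`.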
