Load the required libraries below.

In [36]:
# data 
import pandas as pd 
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

%matplotlib inline

## SageMaker Resources

The cells below store the SageMaker session and role (used to create estimators and models) and create a default S3 bucket. Once the bucket exists, you can upload any locally stored data to S3.

In [37]:
# sagemaker
import boto3
import sagemaker
from sagemaker import get_execution_role

In [38]:
# SageMaker session and role
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

# default S3 bucket
bucket = sagemaker_session.default_bucket()

In [9]:
!wget -nc https://da-youtube-ml.s3.eu-central-1.amazonaws.com/wendy-cnn/frames/wendy_cnn_frames_data.zip
!unzip -qq -n wendy_cnn_frames_data.zip -d wendy_cnn_frames_data 
!rm wendy_cnn_frames_data.zip
prefix='cnn-wendy'

# upload to S3
input_data = sagemaker_session.upload_data(path='wendy_cnn_frames_data', bucket=bucket, key_prefix=prefix)
print(input_data)

--2020-10-18 21:09:16--  https://da-youtube-ml.s3.eu-central-1.amazonaws.com/wendy-cnn/frames/wendy_cnn_frames_data.zip
Resolving da-youtube-ml.s3.eu-central-1.amazonaws.com (da-youtube-ml.s3.eu-central-1.amazonaws.com)... 52.219.75.180
Connecting to da-youtube-ml.s3.eu-central-1.amazonaws.com (da-youtube-ml.s3.eu-central-1.amazonaws.com)|52.219.75.180|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3406848357 (3.2G) [application/zip]
Saving to: ‘wendy_cnn_frames_data.zip’


2020-10-18 21:09:50 (95.0 MB/s) - ‘wendy_cnn_frames_data.zip’ saved [3406848357/3406848357]

s3://sagemaker-eu-central-1-283211002347/cnn-wendy
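
To see which S3 keys a directory upload should produce, the mapping can be sketched locally. This is a minimal sketch, assuming `upload_data` places each file under `key_prefix` followed by its path relative to the uploaded directory. The helper name `expected_s3_uris` is hypothetical, and nothing here contacts AWS.

```python
import os

def expected_s3_uris(local_dir, bucket, key_prefix):
    """List the S3 URIs a recursive upload of local_dir would be expected to create.

    Hypothetical helper for illustration: it only walks the local directory.
    """
    uris = []
    for dirpath, _dirnames, filenames in os.walk(local_dir):
        rel_dir = os.path.relpath(dirpath, start=local_dir)
        # files in the top-level directory sit directly under the prefix
        parts = [key_prefix] if rel_dir == "." else [key_prefix] + rel_dir.split(os.sep)
        for name in filenames:
            key = "/".join(parts + [name])
            uris.append("s3://{}/{}".format(bucket, key))
    return sorted(uris)
```

Comparing this list against the actual bucket contents is one way to confirm that every frame arrived under the expected class folder.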


After uploading the images to S3, we can define and train the estimator.


In [41]:
# import a PyTorch wrapper
from sagemaker.pytorch import PyTorch

# specify an output path
# prefix is specified above
output_path = 's3://{}/{}'.format(bucket, prefix)

# instantiate a pytorch estimator
estimator = PyTorch(entry_point='train.py',
                    source_dir='letsplay_classifier',
                    role=role,
                    framework_version='1.6',
                    train_instance_count=1,
                    train_instance_type='ml.p2.xlarge',
                    train_volume_size = 10,
                    output_path=output_path,
                    sagemaker_session=sagemaker_session,
                    hyperparameters={
                        'img-width': 128,
                        'img-height': 72,
                        'batch-size': 16,
                        'epochs': 10
                    })
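
In SageMaker script mode, the `hyperparameters` dictionary is passed to the entry point as command-line arguments (for example `--img-width 128`), and locations such as the model directory and a channel named `train` are exposed through the `SM_MODEL_DIR` and `SM_CHANNEL_TRAIN` environment variables. The actual `train.py` is not shown in this notebook, so the following is only a sketch of how such a script might read its inputs. The argument names beyond the hyperparameter keys and all default values are assumptions.

```python
import argparse
import os

def parse_args(argv=None):
    """Parse hyperparameters and data locations for a training script (illustrative)."""
    parser = argparse.ArgumentParser()
    # hyperparameters arrive as CLI flags named after the dictionary keys
    parser.add_argument('--img-width', type=int, default=128)
    parser.add_argument('--img-height', type=int, default=72)
    parser.add_argument('--batch-size', type=int, default=16)
    parser.add_argument('--epochs', type=int, default=10)
    # SageMaker sets these environment variables inside the training container;
    # the fallbacks are placeholders for running the script locally
    parser.add_argument('--model-dir', type=str,
                        default=os.environ.get('SM_MODEL_DIR', 'model'))
    parser.add_argument('--data-dir', type=str,
                        default=os.environ.get('SM_CHANNEL_TRAIN', 'data'))
    return parser.parse_args(argv)
```

Passing `argv` explicitly makes the parser easy to exercise in a notebook, for example `parse_args(['--epochs', '2'])`.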

## Train the Estimator

After instantiating the estimator, we train it with a call to `.fit()`. 

In [None]:
%%time 
# train the estimator on S3 training data
estimator.fit({'train': input_data})

'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.
's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.
'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.


2020-10-19 00:23:13 Starting - Starting the training job...
2020-10-19 00:23:14 Starting - Launching requested ML instances......
2020-10-19 00:24:17 Starting - Preparing the instances for training......
2020-10-19 00:25:29 Downloading - Downloading input data.................................
2020-10-19 00:31:05 Training - Downloading the training image..
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-10-19 00:31:29,028 sagemaker-training-toolkit INFO     Imported framework sagemaker_pytorch_container.training
2020-10-19 00:31:29,051 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.

2020-10-19 00:31:27 Training - Training image download completed. Training in progress.
2020-10-19 00:31:35,273 sagemaker_pytorch_container.training INFO     Invoking user training script.
2020-10-19 00:31:35,778 sagemaker-training-toolkit INFO     Inst

Next, we set up a model that can predict the class of an image.

In [14]:
# importing PyTorchModel
from sagemaker.pytorch import PyTorchModel

# Create a model from the trained estimator data
# And point to the prediction script
model = PyTorchModel(model_data=estimator.model_data,
                     role=role,
                     framework_version='1.6',  # match the version used for training
                     entry_point='predict_fromfile.py',
                     source_dir='letsplay_classifier')


### Deploy the trained model

We deploy our model to create a predictor. We'll use this to make predictions on our test data and evaluate the model.

In [15]:
%%time
# deploy and create a predictor
predictor = model.deploy(initial_instance_count=1, instance_type='ml.t2.large')

INFO:sagemaker:Creating model with name: sagemaker-pytorch-2019-03-12-03-24-09-384
INFO:sagemaker:Creating endpoint with name sagemaker-pytorch-2019-03-12-03-24-09-384


---------------------------------------------------------------------------------------!CPU times: user 591 ms, sys: 62.9 ms, total: 654 ms
Wall time: 7min 21s


Now that the model is deployed, we check how it performs on our full dataset,
ensuring that the predictions make sense.


In [None]:
import os
import torch
import torch.nn as nn
from PIL import Image

# assumes the endpoint returns one raw score per class for each image
criterion = nn.CrossEntropyLoss()

loss_test = 0.0
acc_test = 0
count = 0
cur_data_dir = 'wendy_cnn_frames_data'
# each subdirectory holds one class; its position in the sorted list is the label
class_dirs = sorted(os.listdir(cur_data_dir))
for index, class_dir in enumerate(class_dirs):
    labels = torch.tensor([index], dtype=torch.long)
    print(labels)
    curr_img_dir = os.path.join(cur_data_dir, class_dir)
    for image in os.listdir(curr_img_dir):
        curr_img = os.path.join(curr_img_dir, image)

        with open(curr_img, 'rb') as f:
            image_data = Image.open(f)
            prediction_from_ep = predictor.predict(image_data)

        # add a batch dimension so the shape is (1, num_classes)
        prediction = torch.FloatTensor(prediction_from_ep).unsqueeze(0)
        _, preds = torch.max(prediction, 1)
        loss = criterion(prediction, labels)

        # accumulate running totals and report the running averages
        loss_test += loss.item()
        acc_test += torch.sum(preds == labels).item()
        count += 1
        print("{} processed".format(count))
        print("Avg loss (test): {:.4f}".format(loss_test / count))
        print("Avg acc (test): {:.4f}".format(acc_test / count))
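
For a single image, the loss and accuracy accumulated in the loop above come down to a softmax over the class scores followed by a negative log-likelihood, plus an argmax comparison. The following is a minimal pure-Python sketch of that arithmetic, assuming the criterion is a cross-entropy loss and the endpoint returns one raw score per class. The function names and the example scores are illustrative only.

```python
import math

def cross_entropy(scores, label):
    """Cross-entropy of raw class scores against an integer class label."""
    # subtract the max score for numerical stability before exponentiating
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[label]

def predicted_class(scores):
    """Index of the highest score (argmax)."""
    return max(range(len(scores)), key=lambda i: scores[i])

# running averages over a few made-up (scores, label) pairs
samples = [([2.0, 0.5], 0), ([0.1, 1.2], 1), ([0.3, 0.2], 1)]
total_loss = sum(cross_entropy(s, y) for s, y in samples)
correct = sum(predicted_class(s) == y for s, y in samples)
print("Avg loss: {:.4f}".format(total_loss / len(samples)))
print("Avg acc:  {:.4f}".format(correct / len(samples)))
```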

---
## Evaluating Your Model

Once your model is deployed, you can see how it performs when applied to the test data.

The function below takes a deployed predictor and a dictionary of data loaders, iterates over the validation batches, and prints its progress. It does not yet compute metrics such as false negatives and positives, recall, precision, or accuracy.

In [16]:

# the package name matches the `source_dir` used for the estimator
from letsplay_classifier.train import get_data_loaders

def eval_predictor(predictor, dataloaders):
    """Iterate over the validation batches and report progress."""
    test_batches = len(dataloaders['val'])
    print("Evaluating model")
    print('-' * 10)

    for i, data in enumerate(dataloaders['val']):
        if i % 100 == 0:
            print("\rTest batch {}/{}".format(i, test_batches), end='', flush=True)
    print()


# a keyword argument cannot precede positional ones, so all are passed positionally
dataloaders, dataset_sizes, class_names = get_data_loaders(
    'wendy-cnn-data', 256, 256, 8)

eval_predictor(predictor, dataloaders)


### Test Results

The cell below runs the `evaluate` function. 

The code assumes that `predictor`, `X_test`, and `Y_test` are defined in previously run cells.

In [17]:
# get metrics for custom predictor
metrics = evaluate(predictor, X_test, Y_test, True)

predictions  0.0  1.0
actuals              
0             53   18
1             11   68

Recall:     0.861
Precision:  0.791
Accuracy:   0.807
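
The `evaluate` helper called above is not defined in this notebook. The following is a minimal sketch of such a function for binary labels, assuming `predictor.predict` returns a flat sequence with one score per sample that can be rounded to 0 or 1. The signature mirrors the call above, but the body is an assumption rather than the original implementation.

```python
def evaluate(predictor, test_features, test_labels, verbose=True):
    """Compute binary classification metrics for an object with a .predict method."""
    # round each predicted score to the nearest class (0 or 1)
    preds = [int(round(float(p))) for p in predictor.predict(test_features)]
    labels = [int(y) for y in test_labels]

    # confusion-matrix counts
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)

    # guard against division by zero when a class never occurs
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(labels) if labels else 0.0

    if verbose:
        print("Recall:     {:.3f}".format(recall))
        print("Precision:  {:.3f}".format(precision))
        print("Accuracy:   {:.3f}".format(accuracy))

    return {'TP': tp, 'FP': fp, 'TN': tn, 'FN': fn,
            'Recall': recall, 'Precision': precision, 'Accuracy': accuracy}
```

With the counts shown in the table above (68 true positives, 11 false negatives, 18 false positives, 53 true negatives), these formulas give recall 68/79, precision 68/86, and accuracy 121/150, which match the printed values.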



## Delete the Endpoint

Finally, I've added a convenience function to delete prediction endpoints once we're done with them. When you have finished evaluating the model, delete its endpoint so that it stops incurring charges.

In [18]:
# Accepts a predictor endpoint as input
# And deletes the endpoint by name
def delete_endpoint(predictor):
    try:
        boto3.client('sagemaker').delete_endpoint(EndpointName=predictor.endpoint)
        print('Deleted {}'.format(predictor.endpoint))
    except Exception:
        # most commonly raised when the endpoint no longer exists
        print('Already deleted: {}'.format(predictor.endpoint))

In [19]:
# delete the predictor endpoint 
delete_endpoint(predictor)

Deleted sagemaker-pytorch-2019-03-12-03-24-09-384


## Final Cleanup!

* Double check that you have deleted all your endpoints.
* I'd also suggest manually deleting your S3 bucket, models, and endpoint configurations directly from your AWS console.

You can find thorough cleanup instructions [in the documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html).
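
The remaining resources can also be removed programmatically. The following is a minimal sketch, assuming you know the names of the endpoint, endpoint configuration, and model. The helper `cleanup_sagemaker_resources` is hypothetical, while `delete_endpoint`, `delete_endpoint_config`, and `delete_model` are operations of the boto3 SageMaker client.

```python
def cleanup_sagemaker_resources(client, endpoint_name, endpoint_config_name, model_name):
    """Delete an endpoint, its configuration, and its model; return what was deleted."""
    steps = [
        ('endpoint', client.delete_endpoint,
         {'EndpointName': endpoint_name}),
        ('endpoint config', client.delete_endpoint_config,
         {'EndpointConfigName': endpoint_config_name}),
        ('model', client.delete_model,
         {'ModelName': model_name}),
    ]
    deleted = []
    for label, call, kwargs in steps:
        try:
            call(**kwargs)
            deleted.append(label)
        except Exception as err:  # for example, the resource no longer exists
            print('Could not delete {}: {}'.format(label, err))
    return deleted

# Usage (requires AWS credentials; the names are placeholders):
# import boto3
# cleanup_sagemaker_resources(boto3.client('sagemaker'),
#                             'my-endpoint', 'my-endpoint-config', 'my-model')
```

Taking the client as a parameter keeps the function independent of credentials, so it can be tried against a stand-in object before it is pointed at a real account.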

---
# Conclusion

In this notebook, you saw how to train and deploy a custom PyTorch model in SageMaker. SageMaker has many built-in models that are useful for common clustering and classification tasks, but it is also useful to know how to create custom deep learning models that are flexible enough to learn from a variety of data.