## Step 1: Create custom container using SageMaker PyTorch Deep Learning Framework

Update `role` with the ARN of your SageMaker execution role.

In [3]:
!pip --version

pip 20.1 from /Users/yihyap/anaconda3/envs/sandbox36/lib/python3.6/site-packages/pip (python 3.6)


In [4]:
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch
import warnings
warnings.filterwarnings('ignore')

ecr_namespace = 'sagemaker-training-containers/'
prefix = 'pytorch-training'
ecr_repository_name = ecr_namespace + prefix


role = "arn:aws:iam::342474125894:role/service-role/AmazonSageMaker-ExecutionRole-20190405T234154"
account_id = role.split(':')[4]
region = boto3.Session().region_name
sagemaker_session = sagemaker.session.Session()
bucket = sagemaker_session.default_bucket()

print('Account: {}'.format(account_id))
print('Region: {}'.format(region))
print('Role: {}'.format(role))
print('S3 Bucket: {}'.format(bucket))
print('Repo: {}'.format(ecr_repository_name))

Account: 342474125894
Region: ap-southeast-1
Role: arn:aws:iam::342474125894:role/service-role/AmazonSageMaker-ExecutionRole-20190405T234154
S3 Bucket: sagemaker-ap-southeast-1-342474125894
Repo: sagemaker-training-containers/pytorch-training
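
The account ID above is obtained by splitting the role ARN on colons. As a minimal sketch, assuming the standard ARN layout `arn:partition:service:region:account-id:resource`, the helper below makes that step explicit and fails early on a malformed value. The name `parse_arn` is hypothetical and is not part of the SageMaker SDK.

```python
def parse_arn(arn):
    """Split an ARN into named fields (hypothetical helper, not an SDK function)."""
    # The resource part may itself contain colons, so split at most 5 times.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("Not a valid ARN: {!r}".format(arn))
    keys = ("prefix", "partition", "service", "region", "account_id", "resource")
    return dict(zip(keys, parts))


role_arn = "arn:aws:iam::342474125894:role/service-role/AmazonSageMaker-ExecutionRole-20190405T234154"
fields = parse_arn(role_arn)
print(fields["account_id"])  # same value as role.split(':')[4]
```

IAM ARNs leave the region field empty, which is why the notebook reads the region from `boto3.Session()` rather than from the role.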


### Build training container

Next, run a script that builds the custom container image and pushes it to Amazon ECR. The ECR repository must be in the same region as the training job.

In [2]:
# ./build_and_push.sh 342474125894 ap-southeast-1 sagemaker-training-containers/pytorch-training
! ../scripts/build_and_push.sh $account_id $region $ecr_repository_name

Sending build context to Docker daemon  18.43kB
Step 1/16 : FROM ubuntu:16.04
 ---> 13c9f1285025
Step 2/16 : LABEL maintainer="Giuseppe A. Porcelli"
 ---> Using cache
 ---> 6bbf3d07c68d
Step 3/16 : ARG PYTHON=python3
 ---> Using cache
 ---> 8e254b9ef0a0
Step 4/16 : ARG PYTHON_PIP=python3-pip
 ---> Using cache
 ---> 84c928b11bb3
Step 5/16 : ARG PIP=pip3
 ---> Using cache
 ---> 65e780b1f9d7
Step 6/16 : ARG PYTHON_VERSION=3.6.6
 ---> Using cache
 ---> 03bab72f170e
Step 7/16 : RUN apt-get update && apt-get install -y --no-install-recommends software-properties-common &&     add-apt-repository ppa:deadsnakes/ppa -y &&     apt-get update && apt-get install -y --no-install-recommends         build-essential         ca-certificates         curl         wget         git         libopencv-dev         openssh-client         openssh-server         vim         zlib1g-dev &&     rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 0b3f66ca4c73
Step 8/16 : RUN wget https://www.python.org/ftp/python/$P

In [3]:
train_image_uri = '{0}.dkr.ecr.{1}.amazonaws.com/{2}:latest'.format(account_id, region, ecr_repository_name)
print('ECR training image URI: {}'.format(train_image_uri))

ECR training image URI: 342474125894.dkr.ecr.ap-southeast-1.amazonaws.com/sagemaker-training-containers/pytorch-training:latest
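
Because the image has to be in the same region as the training job, it can help to check the region embedded in the image reference before launching the job. The following is a minimal sketch, assuming the reference follows the `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` pattern built in the cell above. `ecr_region` is a hypothetical helper name, and the pattern does not cover partitions with a different domain suffix.

```python
import re

# Assumed layout: <account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]
ECR_URI_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+)(?::(?P<tag>.+))?$"
)


def ecr_region(image_uri):
    """Return the region encoded in an ECR image reference (hypothetical helper)."""
    match = ECR_URI_PATTERN.match(image_uri)
    if match is None:
        raise ValueError("Unexpected ECR image reference: {!r}".format(image_uri))
    return match.group("region")


image_uri = (
    "342474125894.dkr.ecr.ap-southeast-1.amazonaws.com/"
    "sagemaker-training-containers/pytorch-training:latest"
)
job_region = "ap-southeast-1"  # in the notebook this would be boto3.Session().region_name
assert ecr_region(image_uri) == job_region, "Image and training job are in different regions"
```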


The Docker image is now in ECR. The next section shows how to train an acoustic classification model using the custom container.

## Step 2: Training on SageMaker PyTorch custom container

In [4]:
import sagemaker
import json

hyperparameters = {
    "seed": "1",
    "epochs": 50,
}

est = sagemaker.estimator.Estimator(train_image_uri,
                                    role,
                                    train_instance_count=1,
                                    # train_instance_type='local',  # use this line instead of the next one to run in local mode
                                    train_instance_type='ml.m5.xlarge',
                                    base_job_name=prefix,
                                    hyperparameters=hyperparameters)


est.fit()

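# To train on data stored in S3, pass input channels to fit() instead, for example:
# (TrainingInput is the SageMaker Python SDK v2 name; SDK v1 calls it s3_input)
#train_config = sagemaker.inputs.TrainingInput('s3://{0}/{1}/train/'.format(bucket, prefix), content_type='text/csv')
#val_config = sagemaker.inputs.TrainingInput('s3://{0}/{1}/val/'.format(bucket, prefix), content_type='text/csv')
#est.fit({'train': train_config, 'validation': val_config })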

2020-08-11 14:16:09 Starting - Starting the training job...
2020-08-11 14:16:11 Starting - Launching requested ML instances......
2020-08-11 14:17:23 Starting - Preparing the instances for training...
2020-08-11 14:18:05 Downloading - Downloading input data
2020-08-11 14:18:05 Training - Downloading the training image......
2020-08-11 14:19:09 Uploading - Uploading generated training model.
2020-08-11 14:19:04,441 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-08-11 14:19:04,459 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-08-11 14:19:04,472 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-08-11 14:19:04,482 sagemaker-containers INFO     Invoking user script

Training Env:

{
    "additional_framework_parameters": {},
    "channel_input_dirs": {},
    "current_host": "algo-1",
    "framework_module": null,
    "hosts": [
        "algo

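The log above shows `sagemaker-containers` invoking the user script. With that toolkit, the hyperparameters passed to the estimator reach the script as command-line arguments. The training script itself is not shown in this notebook, so the following is only a sketch of how such a script might read `seed` and `epochs`. The function name and the default values are assumptions.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse hyperparameters passed as --name value pairs (illustrative sketch)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=1)     # default is an assumption
    parser.add_argument("--epochs", type=int, default=10)  # default is an assumption
    # Ignore any extra arguments the script does not know about.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the arguments that correspond to the hyperparameters dict above.
args = parse_hyperparameters(["--seed", "1", "--epochs", "50"])
print(args.seed, args.epochs)
```

Declaring `type=int` means it does not matter that `seed` was given as the string `"1"` and `epochs` as the integer `50` in the estimator.
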
### Retrieve model location

In [10]:
model_location = est.model_data
print(model_location)

s3://sagemaker-ap-southeast-1-342474125894/pytorch-training-2020-08-11-14-16-28-241/output/model.tar.gz


## Step 3: Inference

For inference, we use the default SageMaker PyTorch inference image. The mandatory `model_fn` function is implemented in `inference.py`. `PyTorchModel` is used to deploy the custom model that we trained previously.
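
The contents of `inference.py` are not shown in this notebook. As a rough sketch of what a `model_fn` can look like, the example below loads a saved `state_dict` from the model directory and returns the model in evaluation mode. The network `TinyAudioNet` and the file name `model.pth` are assumptions made for illustration; the real script must build the same architecture and read the same file that the training script saved.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyAudioNet(nn.Module):
    """Hypothetical stand-in for the real network, which this notebook does not show."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv1d(1, 4, kernel_size=9, stride=4)
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(4, num_classes)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = self.pool(x).squeeze(-1)
        return self.fc(x)


def model_fn(model_dir):
    """Load the model from the directory where model.tar.gz was extracted."""
    model = TinyAudioNet()
    # Assumption: the training script saved a state_dict named "model.pth".
    state_dict = torch.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.load_state_dict(state_dict)
    model.eval()
    return model


# Local smoke test: save random weights, then load them through model_fn.
with tempfile.TemporaryDirectory() as tmp_dir:
    torch.save(TinyAudioNet().state_dict(), os.path.join(tmp_dir, "model.pth"))
    loaded = model_fn(tmp_dir)
    with torch.no_grad():
        scores = loaded(torch.zeros(2, 1, 32000))
    print(scores.shape)
```

Running `model_fn` locally against a temporary directory like this is a cheap way to catch loading errors before deploying an endpoint.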

### Deploy model

In [7]:
!pip show sagemaker

Name: sagemaker
Version: 1.51.3
Summary: Open source library for training and deploying models on Amazon SageMaker.
Home-page: https://github.com/aws/sagemaker-python-sdk/
Author: Amazon Web Services
Author-email: None
License: Apache License 2.0
Location: /Users/yihyap/anaconda3/envs/sandbox36/lib/python3.6/site-packages
Requires: boto3, packaging, protobuf, importlib-metadata, protobuf3-to-dict, smdebug-rulesconfig, numpy, scipy
Required-by: 


In [8]:
from sagemaker.pytorch import PyTorchModel

pytorch_model = PyTorchModel(model_data=model_location,  # S3 URI of model.tar.gz retrieved above
                             role=role,
                             entry_point='inference.py',
                             source_dir='../docker/code',
                             py_version='py3',
                             framework_version='1.5.1')


In [9]:
predictor = pytorch_model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', wait=True)


ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
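
The `HeadObject` 404 above indicates that an S3 lookup made during deployment did not find the object or bucket it expected. One possible cause is a `model_data` URI that does not point to an existing `model.tar.gz`. The following is a minimal sketch for checking this before calling `deploy`, assuming a boto3 S3 client is passed in. The helper names are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Not an S3 URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


def s3_object_exists(s3_client, uri):
    """Return True if the object exists, False on a 'not found' error."""
    bucket, key = split_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # botocore's ClientError carries the error code in exc.response.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise  # permissions problems and other errors are not hidden
    return True


print(split_s3_uri("s3://my-bucket/path/to/model.tar.gz"))

# Usage in the notebook (requires AWS credentials):
#   import boto3
#   s3_object_exists(boto3.client("s3"), model_location)
```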

In [13]:
pytorch_model.endpoint_name

### Install Python packages

Install the Python packages needed to load the sample test data.

In [13]:
!pip install -q librosa==0.7.2 numba==0.48

### Perform inference on sample test data

Create a data loader to run inference in batches.

In [14]:
from torch.utils.data import Dataset
import numpy as np
import librosa
from pathlib import Path
from typing import Iterable
import pandas as pd
import torch

class UrbanSoundDataset(Dataset):
    def __init__(
        self, csv_path: Path, file_path: Path, folderList: Iterable[int], new_sr=8000, audio_len=20, sampling_ratio=5
    ):
        """Dataset of UrbanSound8K clips, returned as fixed-length, downsampled mono audio.

        Args:
            csv_path (Path): Path to dataset metadata csv
            file_path (Path): Path to data folders
            folderList (Iterable[int]): Data folders to be included in dataset
            new_sr (int, optional): New sampling rate. Defaults to 8000.
            audio_len (int, optional): Audio length based on new sampling rate (sec). Defaults to 20.
            sampling_ratio (int, optional): Additional downsampling ratio. Defaults to 5.
        """

        df = pd.read_csv(csv_path)
        self.file_names = []
        self.labels = []
        self.folders = []
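        for i in range(0, len(df)):
            # Column 5 holds the fold number, column 6 the label, column 0 the file name
            if df.iloc[i, 5] in list(folderList):
                self.labels.append(df.iloc[i, 6])
                self.folders.append(df.iloc[i, 5])
                temp = "fold" + str(df.iloc[i, 5]) + "/" + str(df.iloc[i, 0])
                # Wrap in Path so that a plain string file_path also works
                temp = Path(file_path) / temp
                self.file_names.append(temp)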

        self.file_path = Path(file_path)
        self.folderList = folderList
        self.new_sr = new_sr
        self.audio_len = audio_len
        self.sampling_ratio = sampling_ratio

    def __getitem__(self, index):
        # format the file path and load the file
        path = self.file_names[index]
        sound, sr = librosa.core.load(str(path), mono=False, sr=None)
        if sound.ndim < 2:
            sound = np.expand_dims(sound, axis=0)
        # Convert into single channel format
        sound = sound.mean(axis=0, keepdims=True)
        # Downsampling
        sound = librosa.core.resample(sound, orig_sr=sr, target_sr=self.new_sr)

        # Zero padding to keep desired audio length in seconds
        const_len = self.new_sr * self.audio_len
        tempData = np.zeros([1, const_len])
        if sound.shape[1] < const_len:
            tempData[0, : sound.shape[1]] = sound[:]
        else:
            tempData[0, :] = sound[0, :const_len]
        sound = tempData
        # Additional downsampling: keep every `sampling_ratio`-th sample
        new_const_len = const_len // self.sampling_ratio
        soundFormatted = torch.zeros([1, new_const_len])
        soundFormatted[0, :] = torch.tensor(sound[0, :: self.sampling_ratio][:new_const_len], dtype=torch.float32)

        return soundFormatted, self.labels[index]

    def __len__(self):
        return len(self.file_names)
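
The fixed-length step in `__getitem__` can be tried in isolation. The NumPy-only sketch below pads or truncates a clip to `new_sr * audio_len` samples and then keeps every `ratio`-th sample, using the default values from the class above. The function name is hypothetical and the input clips are synthetic.

```python
import numpy as np


def fix_length_and_decimate(sound, const_len, ratio):
    """Pad or truncate a (1, n_samples) array, then keep every ratio-th sample."""
    fixed = np.zeros((1, const_len), dtype=np.float32)
    n = min(sound.shape[1], const_len)
    fixed[0, :n] = sound[0, :n]  # zero padding remains after position n
    return fixed[:, ::ratio][:, : const_len // ratio]


const_len = 8000 * 20  # new_sr * audio_len with the class defaults
short_clip = np.ones((1, 100_000), dtype=np.float32)  # shorter than const_len
long_clip = np.ones((1, 200_000), dtype=np.float32)   # longer than const_len

print(fix_length_and_decimate(short_clip, const_len, 5).shape)  # (1, 32000)
print(fix_length_and_decimate(long_clip, const_len, 5).shape)   # (1, 32000)
```

Both clips end up with 32000 samples, which is the last dimension of each batch produced by the data loader.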


The following are the class labels.

```
0 = air_conditioner
1 = car_horn
2 = children_playing
3 = dog_bark
4 = drilling
5 = engine_idling
6 = gun_shot
7 = jackhammer
8 = siren
9 = street_music
```

In [15]:
test_folder = [10]
datapath = Path("../data/UrbanSound8K")
csvpath = datapath / "UrbanSound8K.csv"

test_set = UrbanSoundDataset(csvpath, datapath, test_folder)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=5, shuffle=True)

In [16]:
X, y = next(iter(test_loader))
print(X.shape, y)

torch.Size([5, 1, 32000]) tensor([4, 8, 0, 8, 7])


In [17]:
response = predictor.predict(X.numpy())
response = np.transpose(response, (1, 0, 2))
prediction = response[0].argmax(axis=1)
print(prediction)

[4 2 0 7 7]
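
To put a number on this batch, the predictions can be compared with the true labels. The sketch below hard-codes the two arrays printed above; in the notebook, `y.numpy()` and `prediction` would be used instead. Because the data loader shuffles, a rerun will draw a different batch.

```python
import numpy as np

labels = np.array([4, 8, 0, 8, 7])       # y from the batch above
predictions = np.array([4, 2, 0, 7, 7])  # argmax of the endpoint response

correct = predictions == labels
accuracy = correct.mean()
print(accuracy)  # 0.6
```

A single batch of five clips is far too small to judge the model; averaging over the whole test fold gives a more meaningful figure.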


## Step 4: Optional Cleanup

When you're done with the endpoint, delete it so that it stops incurring charges.

All of the training jobs, models, and endpoints we created can be viewed in the SageMaker console of your AWS account.

In [18]:
predictor.delete_endpoint()