## Training and Deploying with AWS SageMaker
To take the model from an idea to production, AWS SageMaker provides managed infrastructure for both training and hosting. Training and deploying with SageMaker follows an engineering workflow: upload the data to S3, train an estimator, deploy an endpoint, and run inference against it.
### Importing the Necessary Libraries

In [1]:
import pandas as pd
import boto3
import sagemaker
import os

### Getting the SageMaker Session, Role and Bucket
The SageMaker session created here is used throughout this notebook to obtain the default S3 bucket and the execution role, which carries the IAM permissions and privileges that SageMaker uses on behalf of the current user.

In [2]:
session = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = session.default_bucket()
bucket

'sagemaker-us-west-2-782510500637'

### Uploading the Datasets to S3
This may take a significant amount of time because of the large size of the data. Clean up the bucket mentioned above after training the model to avoid additional charges on the AWS bill, or to avoid exceeding the free tier or any credits applied to the AWS account.
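
As a rough sketch of the cleanup mentioned above, the helper below deletes every object under a key prefix. The function name `delete_prefix` and the argument name `bucket_resource` are hypothetical; the sketch assumes it is handed a boto3 S3 `Bucket` resource, whose `objects.filter(Prefix=...)` collection supports a batch `delete()`.

```python
def delete_prefix(bucket_resource, prefix):
    """Delete all objects in the bucket whose keys start with `prefix`.

    `bucket_resource` is assumed to be a boto3 S3 Bucket resource, e.g.
    boto3.resource('s3').Bucket(bucket).
    """
    # objects.filter(...) is lazy; delete() removes the matching objects in batches
    return bucket_resource.objects.filter(Prefix=prefix).delete()


# Usage in this notebook (commented out so nothing is deleted by accident):
# import boto3
# delete_prefix(boto3.resource('s3').Bucket(bucket), 'train_chest_xray')
# delete_prefix(boto3.resource('s3').Bucket(bucket), 'test_chest_xray')
```

Passing the bucket resource in, rather than creating it inside the function, keeps the helper easy to try against a stand-in object before pointing it at real data.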

In [4]:
%%time
data_dir = 'data'
train_prefix = 'train_chest_xray/train'
test_prefix = 'test_chest_xray/test'
# upload the training and test directories to S3 for use with SageMaker:
train_data = session.upload_data(os.path.join(data_dir, 'workdir'), key_prefix = train_prefix)
test_data = session.upload_data(os.path.join(data_dir, 'testdir'), key_prefix = test_prefix)

In [None]:
empty_check = []
for obj in boto3.resource('s3').Bucket(bucket).objects.all():
    empty_check.append(obj.key)
    print(obj.key)

assert len(empty_check) != 0, 'S3 bucket is empty.'

In [6]:
print(train_data)
print(test_data)

s3://sagemaker-us-west-2-782510500637/train_chest_xray
s3://sagemaker-us-west-2-782510500637/test_chest_xray


In [3]:
from sagemaker.pytorch import PyTorch
model_prefix = 'chest_xray_model'
chest_xray_pyt = PyTorch(role = role, 
                         entry_point='train.py',
                         source_dir='sagemaker_scripts', 
                         train_instance_count=1,
                         train_instance_type = 'ml.p2.xlarge', 
                         sagemaker_session = session, 
                         framework_version='0.4.0'
                        )                        

In [30]:
%%time
chest_xray_pyt.fit({'train':train_data})

2020-04-09 21:15:35 Starting - Starting the training job...
2020-04-09 21:15:37 Starting - Launching requested ML instances......
2020-04-09 21:16:36 Starting - Preparing the instances for training......
2020-04-09 21:17:48 Downloading - Downloading input data......
2020-04-09 21:19:03 Training - Downloading the training image...
2020-04-09 21:19:24 Training - Training image download completed. Training in progress.
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-04-09 21:19:24,696 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
2020-04-09 21:19:24,722 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2020-04-09 21:19:24,941 sagemaker_pytorch_container.training INFO     Invoking user training script.
2020-04-09 21:19:25,166 sagemaker-containers INFO     Module train does not provide a setup.py.

In [4]:
# If the estimator has no associated training job (for example, in a fresh kernel session),
# attach it to the completed training job by name. `attach` is a classmethod of the estimator.
chest_xray_pyt = PyTorch.attach(training_job_name='sagemaker-pytorch-2020-04-09-21-15-35-146', sagemaker_session=session)

2020-04-09 22:55:19 Starting - Preparing the instances for training
2020-04-09 22:55:19 Downloading - Downloading input data
2020-04-09 22:55:19 Training - Training image download completed. Training in progress.
2020-04-09 22:55:19 Uploading - Uploading generated training model
2020-04-09 22:55:19 Completed - Training job completed
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-04-09 21:19:24,696 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
2020-04-09 21:19:24,722 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2020-04-09 21:19:24,941 sagemaker_pytorch_container.training INFO     Invoking user training script.
2020-04-09 21:19:25,166 sagemaker-containers INFO     Module train does not provide a setup.py.
Generating setup.py
2020-04-09 21:19:25,166 sagemaker-containers I

In [54]:
# Set up the S3 locations and configuration for automatic data capture:
from time import gmtime, strftime
prefix = 'auto_data_capture'
data_capture_prefix = '{}/datacapture'.format(prefix)
s3_capture_upload_path = 's3://{}/{}'.format(bucket, data_capture_prefix)
reports_prefix = '{}/reports'.format(prefix)
s3_report_path = 's3://{}/{}'.format(bucket,reports_prefix)

from sagemaker.model_monitor import DataCaptureConfig
endpoint_name = 'chest-xray-with-data-capt-'+strftime("%Y-%m-%d-%H-%M-%S", gmtime())

data_capture_config = DataCaptureConfig(enable_capture=True, 
                                        sampling_percentage=70, 
                                        destination_s3_uri=s3_capture_upload_path)
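
To make the resulting S3 layout concrete, here is a small sketch that rebuilds the same two URIs from a bucket name and a prefix. `capture_paths` is a hypothetical helper, not part of the SageMaker SDK; the bucket name is the one printed earlier in this notebook.

```python
def capture_paths(bucket, prefix='auto_data_capture'):
    """Return (capture_uri, report_uri) using the same layout as the cell above."""
    capture_uri = 's3://{}/{}/datacapture'.format(bucket, prefix)
    report_uri = 's3://{}/{}/reports'.format(bucket, prefix)
    return capture_uri, report_uri


capture_uri, report_uri = capture_paths('sagemaker-us-west-2-782510500637')
print(capture_uri)
print(report_uri)
```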

In [10]:
predictor = chest_xray_pyt.deploy(initial_instance_count=1,
                                  instance_type='ml.m4.xlarge', 
                                  endpoint_name = endpoint_name, 
                                  data_capture_config=data_capture_config)

Using already existing model: sagemaker-pytorch-2020-04-09-21-15-35-146


---------------!

In [3]:
image = os.listdir(os.path.join('data/testdir', 'bacterial'))[4]

In [75]:
from torchvision import transforms, datasets
import torch
test_dir = 'data/testdir'
image_transformer = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
test_data = datasets.ImageFolder(test_dir, transform=image_transformer)
batch_size=20
num_workers=0
test_loader = torch.utils.data.DataLoader(test_data,
                                          batch_size=batch_size,
                                          num_workers=num_workers, 
                                          shuffle=True)
dataiter = iter(test_loader)
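images, labels = next(dataiter)  # take one batch of test images and labels
# send each image to the endpoint individually and collect the predicted class indices
predictions = []
labels_target = []
# note: range(len(images)-1) skips the last image, so a batch of 20 yields 19 predictions
for i in range(len(images)-1):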
    pred = predictor.predict(images[i].unsqueeze_(0))
    pred = pred.argmax()
    predictions.append(pred)
    targ = labels.data[i].item()
    labels_target.append(targ)
print("Predicted Labels are: ")
print(predictions)
print("Original Labels are: ")
print(labels_target)

Predicted Labels are: 
[0, 2, 1, 1, 2, 2, 1, 0, 2, 2, 2, 0, 0, 0, 1, 1, 2, 1, 1]
Original Labels are: 
[1, 0, 0, 2, 0, 1, 1, 1, 0, 0, 0, 1, 2, 1, 0, 0, 1, 0, 1]


In [77]:
from sklearn.metrics import accuracy_score
# scikit-learn expects (y_true, y_pred); accuracy is the same either way
accuracy_score(labels_target, predictions)

0.10526315789473684
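
The score can be cross-checked by hand from the two lists printed above. The sketch below uses only the standard library, and its variable names are illustrative. It counts exact matches and tallies (actual, predicted) pairs as a simple confusion count.

```python
from collections import Counter

# Lists copied from the notebook output above
predicted = [0, 2, 1, 1, 2, 2, 1, 0, 2, 2, 2, 0, 0, 0, 1, 1, 2, 1, 1]
actual = [1, 0, 0, 2, 0, 1, 1, 1, 0, 0, 0, 1, 2, 1, 0, 0, 1, 0, 1]

# Number of positions where prediction and label agree
correct = sum(p == a for p, a in zip(predicted, actual))
accuracy = correct / len(actual)

# Count how often each (actual, predicted) pair occurs
confusion = Counter(zip(actual, predicted))

print('correct:', correct, 'of', len(actual))
print('accuracy:', accuracy)
for (a, p), n in sorted(confusion.items()):
    print('actual', a, 'predicted', p, 'count', n)
```

Only 2 of the 19 predictions match, which is below what uniform random guessing over three classes would give on average. One possible explanation, offered only as a hypothesis, is a mismatch between the preprocessing or class-index ordering used at training time and the one used by this test loader.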

In [16]:
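import sys
import numpy as np
# NOTE: this cell evaluates a local model. `vgg19`, `criterion` and `train_on_gpu`
# are not defined in this section and must already exist in the session.
test_loss = 0.0
class_correct = list(0. for i in range(5))
class_total = list(0. for i in range(5))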

# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = vgg19(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update  test loss 
    test_loss += loss.item()*data.size(0)
    # convert output probabilities to predicted class
    _, pred = torch.max(output, 1)    
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = np.squeeze(correct_tensor.numpy()) if not train_on_gpu else np.squeeze(correct_tensor.cpu().numpy())
    # calculate test accuracy for each object class
    for i in range(batch_size):
        if i < len(target.data):
            label = target.data[i]
            class_correct[label] += correct[i].item()
            class_total[label] += 1
test_loss = test_loss/len(test_loader.dataset)
sys.stderr.write('Test Loss: {:.6f}\n'.format(test_loss))
sys.stderr.write('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))

2

In [55]:
import cv2
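im = cv2.imread(os.path.join('data/testdir/bacterial', os.listdir(os.path.join('data/testdir', 'bacterial'))[0]))
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR; convert to RGB before ToPILImage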
image_transformer = transforms.Compose([transforms.ToPILImage(), transforms.Resize((224, 224)), transforms.ToTensor()])
im = image_transformer(im)
pred = predictor.predict(im.unsqueeze_(0))
pred.argmax()

2

In [None]:
from sagemaker.predictor import RealTimePredictor
from sagemaker.pytorch import PyTorchModel

class StringPredictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(StringPredictor, self).__init__(endpoint_name, sagemaker_session, content_type='text/plain')

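# `chest_xray_pyt` is the trained estimator from above (`estimator` is not defined in this notebook)
model = PyTorchModel(model_data=chest_xray_pyt.model_data,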
                     role = role,
                     framework_version='0.4.0',
                     entry_point='predict.py',
                     source_dir='serve',
                     predictor_cls=StringPredictor)
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

In [7]:
predictor.delete_endpoint()

NameError: name 'predictor' is not defined
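
The `NameError` above means no `predictor` variable exists in the current kernel session, presumably because the deployment cell was not run in this session. A defensive sketch, using the hypothetical helper name `delete_endpoint_if_defined`, checks a namespace before calling `delete_endpoint()`. Note that the endpoint itself may still be running in AWS, in which case it would need to be removed by other means, for example from the SageMaker console.

```python
def delete_endpoint_if_defined(namespace, name='predictor'):
    """Call delete_endpoint() on namespace[name] if it exists; return True if called."""
    obj = namespace.get(name)
    if obj is None:
        print("'{}' is not defined in this session; nothing deleted.".format(name))
        return False
    obj.delete_endpoint()
    return True


# In the notebook: delete_endpoint_if_defined(globals())
```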