## Task 3: Create and train a convolutional neural network model using ResNet-20

In this task, you will train a ResNet neural network with CIFAR-10 training data to classify an image into 10 known categories. The code is written in MXNet.

[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

![](cifar-10.png)

The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
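
As a quick sanity check, the counts quoted above are consistent with each other. The following sketch only restates the published numbers as arithmetic; the variable names are illustrative and do not come from the lab code.

```python
# Dataset layout as described in the text above
num_classes = 10
images_per_class = 6_000
train_images = 50_000
test_images = 10_000

# Total size follows from classes x images per class
total_images = num_classes * images_per_class

# Both splits are balanced, so the per-class counts are simple divisions
train_per_class = train_images // num_classes
test_per_class = test_images // num_classes

print(total_images, train_per_class, test_per_class)
```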

The dataset is divided into five training batches and one test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5,000 images from each class.

The following are general ways to work with image datasets:

- Classification
- Localization
- Segmentation
- Scene classification
- [Scene parsing](http://sceneparsing.csail.mit.edu/) to segment and parse an image into different image regions associated with semantic categories, such as sky, road, person, and bed

If you want to learn more about deep learning on images, here is a good lecture: [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/slides/2016/winter1516_lecture8.pdf)


Run each cell in this notebook by pressing **SHIFT + ENTER**. When the cell finishes running, the text to the left of the cell changes from **In [*]:** to **In [1]**.

In [None]:
import os, sys
import argparse
import logging
import mxnet as mx
import random
from mxnet.io import DataBatch, DataIter
import numpy as np
import time
import subprocess
import errno
import sagemaker

In [None]:
!pip install gluoncv

In [None]:
#from __future__ import division
import argparse, time, logging, random, math

import numpy as np
import mxnet as mx

from mxnet import gluon, nd
from mxnet import autograd as ag
from mxnet.gluon import nn
from mxnet.gluon.data.vision import transforms

from gluoncv.model_zoo import get_model
from gluoncv.utils import makedirs, TrainingHistory

In [None]:
# Number of cpus to use
num_cpus = 1
ctx = [mx.cpu(i) for i in range(num_cpus)]

In [None]:
transform_train = transforms.Compose([
    # Randomly flip the image horizontally
    transforms.RandomFlipLeftRight(),
    # Randomly jitter the brightness, contrast, and saturation of the image
    transforms.RandomColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    # Randomly add noise to the image
    transforms.RandomLighting(0.1),
    # Transpose the image from height*width*num_channels to num_channels*height*width
    # and map values from [0, 255] to [0,1]
    transforms.ToTensor(),
    # Normalize the image with mean and standard deviation calculated across all images
    transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])
])

In [None]:
transform_test = transforms.Compose([
    # Transpose the image from height*width*num_channels to num_channels*height*width
    # and map values from [0, 255] to [0,1]
    transforms.ToTensor(),
    # Normalize the image with mean and standard deviation calculated across all images
    transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])
])
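
To make the last two transforms concrete, here is a minimal plain-Python sketch of what `ToTensor` and `Normalize` are described as doing in the comments above: reorder height x width x channels to channels x height x width, scale to [0, 1], then standardize each channel. The helper names `to_tensor` and `normalize` and the tiny two-pixel image are hypothetical; the real transforms operate on MXNet arrays.

```python
# Per-channel statistics copied from the Normalize call above
MEAN = [0.4914, 0.4822, 0.4465]
STD = [0.2023, 0.1994, 0.2010]

def to_tensor(image_hwc):
    """Reorder a nested-list image from H x W x C to C x H x W and scale to [0, 1]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [[[image_hwc[y][x][c] / 255.0 for x in range(width)]
             for y in range(height)]
            for c in range(channels)]

def normalize(image_chw, mean=MEAN, std=STD):
    """Subtract the channel mean and divide by the channel standard deviation."""
    return [[[(value - mean[c]) / std[c] for value in row]
             for row in channel]
            for c, channel in enumerate(image_chw)]

# A hypothetical 1 x 2 RGB image: one black pixel and one white pixel
tiny = [[[0, 0, 0], [255, 255, 255]]]
tensor = to_tensor(tiny)
normalized = normalize(tensor)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # channels, height, width
```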

In [None]:
# Batch size for each cpu
per_device_batch_size = 128
# Number of data loader workers
num_workers = 8
# Calculate effective total batch size
batch_size = per_device_batch_size * num_cpus

# Set train=True for training data
# Set shuffle=True to shuffle the training data
train_data = gluon.data.DataLoader(
    gluon.data.vision.CIFAR10(train=True).transform_first(transform_train),
    batch_size=batch_size, shuffle=True, last_batch='discard', num_workers=num_workers)

# Set train=False for validation data
val_data = gluon.data.DataLoader(
    gluon.data.vision.CIFAR10(train=False).transform_first(transform_test),
    batch_size=batch_size, shuffle=False, num_workers=num_workers)
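
The loader settings above determine how many batches each epoch processes. The sketch below works this out with plain arithmetic, assuming the CIFAR-10 split sizes from the introduction and assuming the validation loader keeps its final partial batch (it does not pass `last_batch='discard'`).

```python
import math

train_samples = 50_000
val_samples = 10_000
batch_size = 128  # per_device_batch_size * num_cpus with one CPU

# last_batch='discard' drops the final partial training batch
train_batches = train_samples // batch_size
dropped_per_epoch = train_samples % batch_size

# The validation loader keeps the final partial batch, so round up
val_batches = math.ceil(val_samples / batch_size)

print(train_batches, dropped_per_epoch, val_batches)
```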

In [None]:
# Get the model CIFAR_ResNet20_v1, with 10 output classes, without pretrained weights
net = get_model('cifar_resnet20_v1', classes=10, pretrained=False)
net.initialize(mx.init.Xavier(), ctx = ctx)

In [None]:
# Using stochastic gradient descent
optimizer = 'sgd'

# Set parameters
optimizer_params = {'learning_rate': 0.01, 'wd': 0.0001, 'momentum': 0.9}

# Define the trainer for net
trainer = gluon.Trainer(net.collect_params(), optimizer, optimizer_params)
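
For intuition about the three optimizer parameters, the sketch below applies one common formulation of SGD with momentum and weight decay to a single scalar weight. It is an illustration under assumptions, not MXNet's exact implementation: the function name `sgd_momentum_step` is hypothetical, and the division by `batch_size` mirrors the gradient normalization that `trainer.step(batch_size)` performs.

```python
def sgd_momentum_step(weight, grad, velocity, batch_size,
                      learning_rate=0.01, wd=0.0001, momentum=0.9):
    """Apply one SGD update with momentum and weight decay to a scalar weight."""
    # Normalize the summed gradient by the batch size, then add the weight decay term
    rescaled = grad / batch_size + wd * weight
    # Momentum keeps a running, decayed sum of previous update directions
    velocity = momentum * velocity - learning_rate * rescaled
    return weight + velocity, velocity

# Two consecutive updates with the same summed gradient
w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, grad=128.0, velocity=v, batch_size=128)
first_w, first_v = w, v
w, v = sgd_momentum_step(w, grad=128.0, velocity=v, batch_size=128)
print(first_w, first_v, w, v)
```

Because of the momentum term, the second update moves the weight further than the first even though the gradient is unchanged.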

In [None]:
# Softmaxcrossentropy loss function
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

# Use accuracy as the training metric
train_metric = mx.metric.Accuracy()
train_history = TrainingHistory(['training-acc', 'validation-acc'])
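
The loss and metric above can be illustrated without any framework. This is a minimal sketch of softmax cross-entropy and accuracy on a made-up batch of three samples with three classes; the helper names and the numbers are hypothetical, and the Gluon classes operate on batched MXNet arrays instead.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])

def accuracy(batch_logits, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    hits = 0
    for logits, label in zip(batch_logits, labels):
        predicted = max(range(len(logits)), key=logits.__getitem__)
        hits += int(predicted == label)
    return hits / len(labels)

# Hypothetical scores for three samples and three classes
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]
losses = [cross_entropy(scores, y) for scores, y in zip(batch_logits, labels)]
acc = accuracy(batch_logits, labels)
print(losses, acc)
```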

In [None]:
def test(net, ctx, val_data):
    '''
    Evaluate the network on validation data to check its accuracy on unseen data.
    Params:
        net: Network to evaluate
        ctx: List of contexts (device type and ID) on which computation is carried out
        val_data: Validation data loader
    Returns:
        metrics: Metric name and accuracy
    '''
    metric = mx.metric.Accuracy()
    for i, batch in enumerate(val_data):
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)
        outputs = [net(X) for X in data]
        metric.update(label, outputs)
    return metric.get()

In [None]:
epochs = 10
lr_decay_count = 0

for epoch in range(epochs):
    tic = time.time()
    train_metric.reset()
    train_loss = 0

    # Loop through each batch of training data
    for i, batch in enumerate(train_data):
        # Extract data and label
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)

        # AutoGrad
        with ag.record():
            output = [net(X) for X in data]
            loss = [loss_fn(yhat, y) for yhat, y in zip(output, label)]

        # Backpropagation
        for l in loss:
            l.backward()

        # Optimize
        trainer.step(batch_size)

        # Update metrics
        train_loss += sum([l.sum().asscalar() for l in loss])
        train_metric.update(label, output)

    name, acc = train_metric.get()
    # Evaluate on validation data
    name, val_acc = test(net, ctx, val_data)

    # Update history and print metrics
    train_history.update([acc, val_acc])
    print('[Epoch %d] train=%f val=%f loss=%f time: %f' %
        (epoch, acc, val_acc, train_loss, time.time()-tic))

# Plot the metric scores
train_history.plot()

Now you should have close to 80% validation accuracy after 10 epochs. But how does your model compare to other models out there?

## Task 4: Compare different ResNet models

In this task, you will compare four validation accuracies between two different models, ResNet-20 and ResNet-56, using the flags `pretrained=True` and `pretrained=False`. During a machine learning project, you can compare different models using a metric like accuracy, precision, or recall. In this case, use the accuracy metric on the validation data only.
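
The three metrics named above differ in what they count. The following sketch computes accuracy over all classes plus precision and recall for one chosen class, using made-up labels; the function name `class_metrics` and the sample data are hypothetical and are not part of the lab code.

```python
def class_metrics(y_true, y_pred, positive):
    """Return overall accuracy plus precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical labels using CIFAR-10 indices (1 = automobile, 3 = cat, 5 = dog)
y_true = [3, 3, 5, 5, 3, 1]
y_pred = [3, 5, 5, 3, 3, 1]
acc, cat_precision, cat_recall = class_metrics(y_true, y_pred, positive=3)
print(acc, cat_precision, cat_recall)
```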

To start, use code from the previous task but wrap the code in the function `model_training_job()` so that you can call it using multiple models.

In [None]:
num_cpus = 1
ctx = [mx.cpu(i) for i in range(num_cpus)]

def model_training_job(model, epochs=10):
    '''
    The function describes the model training job with the specified model using the variable "model".
    The function includes ingesting the data, creating the transforms, and defining the hyperparams
    before you start your training loop.
    Params:
        model: initialized machine learning algorithm you are training
        epochs: number of epochs to train the algorithm; default is 10
    Returns:
        training_history: history of metrics per epoch
    '''
    num_epochs = epochs
    
    transform_train = transforms.Compose([
    # Randomly flip the image horizontally
    transforms.RandomFlipLeftRight(),
    # Randomly jitter the brightness, contrast, and saturation of the image
    transforms.RandomColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    # Randomly add noise to the image
    transforms.RandomLighting(0.1),
    # Transpose the image from height*width*num_channels to num_channels*height*width
    # and map values from [0, 255] to [0,1]
    transforms.ToTensor(),
    # Normalize the image with mean and standard deviation calculated across all images
    transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])
    ])
    
    transform_test = transforms.Compose([
    #transforms.Resize(32),
    transforms.ToTensor(),
    transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])
    ])
    
    # Batch size for each cpu
    per_device_batch_size = 128
    # Number of data loader workers
    num_workers = 8
    # Calculate effective total batch size
    batch_size = per_device_batch_size * num_cpus

    # Set train=True for training data
    # Set shuffle=True to shuffle the training data
    train_data = gluon.data.DataLoader(
        gluon.data.vision.CIFAR10(train=True).transform_first(transform_train),
        batch_size=batch_size, shuffle=True, last_batch='discard', num_workers=num_workers)

    # Set train=False for validation data
    val_data = gluon.data.DataLoader(
        gluon.data.vision.CIFAR10(train=False).transform_first(transform_test),
        batch_size=batch_size, shuffle=False, num_workers=num_workers)
    
    # Learning rate decay factor
    lr_decay = 0.0001
    # Epochs where learning rate decays
    lr_decay_epoch = [80, 160, np.inf]
    lr_decay_count = 0

    # Using stochastic gradient descent
    optimizer = 'sgd'
    # Set parameters
    optimizer_params = {'learning_rate': 0.01, 'wd': 0.0001, 'momentum': 0.9}

    # Define the trainer for net
    trainer = gluon.Trainer(model.collect_params(), optimizer, optimizer_params)
    
    # Define the loss function
    loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
    
    # Define the training metric "accuracy" using mx.metric.Accuracy()
    train_metric = mx.metric.Accuracy()
    train_history = TrainingHistory(['training-acc', 'validation-acc'])
    
    print("Starting Training")
    for epoch in range(epochs):
        tic = time.time()
        train_metric.reset()
        train_loss = 0

        # Loop through each batch of training data
        for i, batch in enumerate(train_data):
            #print(f'Epoch: {epoch} Batch: {i}')
            # Extract data and label
            data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
            label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)

            # AutoGrad
            with ag.record():
                output = [model(X) for X in data]
                loss = [loss_fn(yhat, y) for yhat, y in zip(output, label)]

            # Backpropagation
            for l in loss:
                l.backward()

            # Optimize
            trainer.step(batch_size)

            # Update metrics
            train_loss += sum([l.sum().asscalar() for l in loss])
            train_metric.update(label, output)

        name, acc = train_metric.get()
        # Evaluate on Validation data
        name, val_acc = test(model,ctx, val_data)

        # Update history and print metrics
        train_history.update([acc, val_acc])
        print('[Epoch %d] train=%f val=%f loss=%f time: %f' %
            (epoch, acc, val_acc, train_loss, time.time()-tic))

    # Plot the metric scores
    train_history.plot()
    return train_history.history

In [None]:
trains = {}

print('Training cifar_resnet20_v2 without pretrain')
net_20_f = get_model('cifar_resnet20_v2', classes=10, pretrained=False, ctx=ctx)
net_20_f.initialize(mx.init.Xavier(), ctx = ctx)
trains['cifar_resnet20_v2_f'] = model_training_job(net_20_f,3)

print('Training cifar_resnet56_v2 without pretrain')
net_56_f = get_model('cifar_resnet56_v2', classes=10, pretrained=False, ctx=ctx)
net_56_f.initialize(mx.init.Xavier(), ctx = ctx)
trains['cifar_resnet56_v2_f'] = model_training_job(net_56_f,3)

print('Training cifar_resnet20_v2 with pretrain')
net_20_t = get_model('cifar_resnet20_v2', classes=10, pretrained=True, ctx=ctx)
#net_20_t.initialize(mx.init.Xavier(), ctx = ctx)
trains['cifar_resnet20_v2_t'] = model_training_job(net_20_t,3)

print('Training cifar_resnet56_v2 with pretrain')
net_56_t = get_model('cifar_resnet56_v2', classes=10, pretrained=True, ctx=ctx)
#net_56_t.initialize(mx.init.Xavier(), ctx = ctx)
trains['cifar_resnet56_v2_t'] = model_training_job(net_56_t,3)

To compare the models, use the `bokeh` library to plot their validation accuracy curves on the same chart.

In [None]:
import bokeh
from bokeh.plotting import figure, output_file, show,output_notebook
output_notebook()
def model_comparison(data_type):
    p = figure(plot_width=800, 
               plot_height=400,
               x_axis_label='Number of epochs',
               y_axis_label=f'{data_type} Accuracy',
               toolbar_location='above')
    x = list(range(len(trains['cifar_resnet20_v2_f']['training-acc'])))
    colors = ['green', 'orange', 'blue','red']
    color = colors[:len(trains.keys())]

    for keys,col in zip(trains.keys(),colors):
        print(keys,col)
        acc = trains[keys][f'{data_type}-acc']
        p.line(x,acc, line_width=2,legend_label=keys,color=col)
        p.circle(x,acc, line_width=2,color=col)
        #show(p)

    p.legend.location = 'bottom_right'
    p.xaxis[0].ticker.desired_num_ticks = len(x)
    show(p)    

model_comparison('validation')

Now look at the training data as well.

In [None]:
model_comparison('training')

In the plot, `cifar_resnet20_v2_f` and `cifar_resnet56_v2_f` are very close to each other but aren't close to `cifar_resnet20_v2_t` and `cifar_resnet56_v2_t`. One difference to notice is that you added the flag `pretrained=True` to the models that are giving much higher accuracy than the other two models. 

### Question: Why do the models with the `pretrained=True` flag give higher accuracy? What is pretraining, and what does the pretraining flag do?

**Answer**: A pretrained convolutional neural network (CNN) model is a CNN model that has already been trained for you on a larger dataset, sometimes for a longer time (more epochs). This lab's pretrained models were trained on the CIFAR-10 dataset, and the weights that they learned are used as your initial weights. So you start your training with the features and the weights that the pretrained model learned instead of learning from scratch. This is also called *incremental training*.

In most cases, the problem you are working on may not be exactly the same problem as one of these datasets. For example, what if the classes that you are trying to predict are not in the CIFAR-10 dataset? In such a case, you can still use a pretrained model from a related large-scale problem such as ImageNet for other visual recognition tasks without the need to train the first few layers. In this case, the weights of the first few layers are fixed (frozen) while you train the model to recognize the images for your problem. This is called *fine tuning*.

The upper layers are trained or fine tuned to match your problem at hand. This transfer of knowledge from one problem to another problem is called *transfer learning* because you are using a CNN model that was trained on a different but correlated problem. This is normally done to speed up learning and reduce the need for very large training datasets.
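
The idea of keeping early layers fixed while training the upper layers can be sketched with a toy update step. This is a framework-free illustration under assumptions: the layer names, values, and the `sgd_step` helper are hypothetical. In Gluon, a comparable effect is typically obtained by setting a parameter's `grad_req` to `'null'` so that no gradient is computed for it.

```python
def sgd_step(params, grads, learning_rate, frozen=()):
    """Return updated parameters, leaving any layer named in `frozen` untouched."""
    return {
        name: value if name in frozen else value - learning_rate * grads[name]
        for name, value in params.items()
    }

# Hypothetical pretrained weights, one scalar per layer for brevity
pretrained = {"conv1": 0.50, "conv2": -0.20, "output": 0.10}
grads = {"conv1": 1.0, "conv2": 1.0, "output": 1.0}

# Fine tuning: freeze the early layers and update only the output layer
fine_tuned = sgd_step(pretrained, grads, learning_rate=0.1,
                      frozen={"conv1", "conv2"})
print(fine_tuned)
```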

## Task 5: Use Amazon SageMaker built-in algorithms to train your model incrementally

Now take the model you trained and use the Amazon SageMaker image classification algorithm to train the model. This algorithm is a supervised learning algorithm that supports multi-label classification. It takes an image as input and outputs one or more labels assigned to that image. The algorithm uses a CNN (ResNet) that can be trained from scratch or trained using transfer learning when a large number of training images are not available.

First, save the parameters of the model that you created in Task 3 **(CIFAR_ResNet20_v1)**.

In [None]:
net.save_parameters('cifar10_resnet20_v2_f.params')
#net.summary

**Note:** You will see the **cifar10_resnet20_v2_f.params** file created in the notebook instance.

In some cases, you may want to save the model params as well as the model architecture. If your network is hybrid, you can even save the network architecture into files, and you won’t need the network definition in a Python file to load the network.


Here, you use the model saved from Task 3 **(CIFAR_ResNet20_v1)** and retrain it for 5 epochs.

In [None]:
net.hybridize()
model_training_job(net, 5)
net.export('cifar10_resnet20_v2_f')

### Training job on SageMaker Instances

The Amazon SageMaker image classification algorithm supports both **RecordIO** (`application/x-recordio`) and **image** (`image/png`, `image/jpeg`, and `application/x-image`) content types for training in file mode and supports the **RecordIO** (`application/x-recordio`) content type for training in pipe mode. However, you can also train in pipe mode using the image files (`image/png`, `image/jpeg`, and `application/x-image`) without creating RecordIO files by using the augmented manifest format. The algorithm supports `image/png`, `image/jpeg`, and `application/x-image` for inference. You will use the RecordIO format that is already provided on this notebook instance in this lab.

**Note:** For this lab, an Amazon Simple Storage Service (Amazon S3) bucket has been preconfigured with the training and validation data so the Amazon SageMaker training job can access it.

Now, get the right container image for image training.

In [None]:
# Get the right container image for image training
import boto3
import sagemaker
import re
from sagemaker import get_execution_role
import logging
from sagemaker import image_uris
from botocore.exceptions import ClientError

training_image = image_uris.retrieve('image-classification',boto3.Session().region_name)
print("Training Image: ", training_image)

Now that the setup is complete, you are ready to train the image classifier. To begin, create a `sagemaker.estimator.Estimator` object. This estimator launches the training job.

You need to set two kinds of parameters for training. The first are the parameters for the training job. These include:
- **Training instance count**: Number of instances on which to run the training. When the number of instances is greater than one,  the image classification algorithm runs in distributed settings.
- **Training instance type**: Type of machine on which to run the training. This lab specifies a CPU instance type (`ml.m4.xlarge`).
- **Output path**: Amazon S3 folder in which the training output is stored

Run the training using the Amazon SageMaker CreateTrainingJob API.

**Note:** In the code below, replace the `<LabDataBucket>` value with the bucket name value from the left side of the lab instructions.

In [None]:
import boto3
import sagemaker
import re
from sagemaker import get_execution_role
import logging
from botocore.exceptions import ClientError

sess = sagemaker.Session()

role = get_execution_role()



s3_output_location = 's3://{}/{}/output'.format('<LabDataBucket>', 'image-classification-full-training/output/image-classification')
cifar = sagemaker.estimator.Estimator(training_image,
                                         role, 
                                         instance_count=1, 
                                         instance_type='ml.m4.xlarge',
                                         volume_size = 30,
                                         max_run = 360000,
                                         input_mode= 'File',
                                         output_path=s3_output_location,
                                         sagemaker_session=sess)

There are also hyperparameters that are specific to the algorithm. These are:
- **num_layers**: Number of layers (depth) for the network. This lab uses 20, matching ResNet-20; other depths are available depending on the image size.
- **image_shape**: Input image dimensions, `num_channels, height, width`, for the network. It should be no larger than the actual image size. The number of channels should be the same as in the actual image. This lab uses `3,32,32`.
- **num_classes**: Number of output classes for the dataset. This lab uses 10 for CIFAR-10. ImageNet was trained with 1,000 output classes, but the number of output classes can be changed for fine-tuning. For example, 257 is used for the Caltech 256 dataset because it has 256 object categories + 1 clutter class.
- **num_training_samples**: Total number of training samples. This lab uses 50,000 for CIFAR-10. For comparison, it is set to 15,240 for the Caltech 256 dataset with its sample split.
- **mini_batch_size**: Number of training samples used for each mini batch. In distributed training, the number of training samples used per batch is N * mini_batch_size where N is the number of hosts on which training is run.
- **epochs**: Number of training epochs
- **learning_rate**: Learning rate for training
- **top_k**: Report the top-k accuracy during training
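
To illustrate what a top-k accuracy reports, the sketch below counts a prediction as correct when the true label is among the k highest-scoring classes. It is a plain-Python illustration with made-up scores, not the algorithm's internal implementation; the function name `top_k_accuracy` is hypothetical.

```python
def top_k_accuracy(batch_scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for scores, label in zip(batch_scores, labels):
        # Class indices ordered from highest to lowest score
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Hypothetical class scores for three samples and three classes
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [2, 0, 0]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-k accuracy can never be lower than top-1 accuracy, because the top-1 class is always included in the top k.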

In [None]:
cifar.set_hyperparameters(num_layers=20, 
                             image_shape = "3,32,32",
                             num_classes=10,
                             num_training_samples=50000,
                             mini_batch_size=128,
                             epochs=10,
                             learning_rate=0.1,
                             top_k=2)

Here, you will see how to create a definition for input data used by an Amazon SageMaker training job.

**Note:** In the code below, for the **s3_train** and **s3_validation** variables, replace the `<LabDataBucket>` value with the bucket name value from the left side of the lab instructions.

In [None]:
# Get the path of training data that is uploaded to the S3 bucket.
s3_train = 's3://<LabDataBucket>/image-classification-full-training/train/'

# Get the path of validation data that is uploaded to the S3 bucket.
s3_validation = 's3://<LabDataBucket>/image-classification-full-training/validation/'

train_data = sagemaker.inputs.TrainingInput(s3_train, distribution='FullyReplicated', 
                            content_type='application/x-recordio', s3_data_type='S3Prefix')
validation_data = sagemaker.inputs.TrainingInput(s3_validation, distribution='FullyReplicated', 
                            content_type='application/x-recordio', s3_data_type='S3Prefix')

data_channels = {'train': train_data, 'validation': validation_data}

To start model training, you call the estimator's `fit` method with the training and validation data channels.

To learn more about this, select this link: [Estimator fit method for training](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html#sagemaker.estimator.EstimatorBase.fit)
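
A minimal sketch of that call is shown below. It assumes the `cifar` estimator and the `data_channels` dictionary from the earlier cells; the wrapper function `start_training` is hypothetical and exists only so the call can be shown without launching a job. Running it in the lab is not required.

```python
def start_training(estimator, channels, wait=True):
    """Launch a training job by passing the named data channels to `fit`."""
    # `inputs` maps channel names ('train', 'validation') to their data sources
    estimator.fit(inputs=channels, wait=wait)
    return estimator

# In the notebook, this would be called as:
#     start_training(cifar, data_channels)
```

With `wait=True`, the call blocks until the training job finishes.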

**Note:** For this lab, the model artifact (model.tar.gz) from the training job has already been created and saved to an Amazon Simple Storage Service (Amazon S3) bucket.


Run the code below to see the created model.
Replace the `<LabDataBucket>` value with the bucket name value from the left side of the lab instructions.

In [None]:
!aws s3 ls s3://<LabDataBucket>/image-classification-full-training/output/image-classification/output/

## Task 6: Prepare your model for inference using an Amazon SageMaker endpoint

Now you can use the trained model to perform inference. For this example, that means predicting the 10 classes in the CIFAR-10 dataset. You can deploy the created model by using the deploy method in the estimator. This creates a new Amazon SageMaker endpoint. You can deploy it to get predictions in one of two ways:
- To set up a persistent endpoint to get one prediction at a time, use Amazon SageMaker hosting services.
- To get predictions for an entire dataset, use Amazon SageMaker batch transform.

In this task, you will use Amazon SageMaker hosting services to set up a persistent endpoint to get a single prediction per call.

Deploying a model using Amazon SageMaker hosting services is a three-step process:

1. **Create a model in Amazon SageMaker**: By creating a model, you tell Amazon SageMaker where it can find the model components. This includes the Amazon S3 path where the model artifacts are stored and the Docker registry path for the image that contains the inference code. In subsequent deployment steps, you specify the model by name.

2. **Create an endpoint configuration for an HTTPS endpoint**: You specify the name of one or more models in production variants and the ML compute instances that you want Amazon SageMaker to launch to host each production variant.

3. **Create an HTTPS endpoint**: Provide the endpoint configuration to Amazon SageMaker. The service launches the ML compute instances and deploys the model or models as specified in the configuration. For more information, see the CreateEndpoint API. To get inferences from the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For more information about the API, see the InvokeEndpoint API.
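
The three steps above can be sketched with the low-level boto3 SageMaker client. This is an illustration under assumptions: the function name `deploy_in_three_steps`, the reuse of one name for all three resources, and the variant name `AllTraffic` are choices made here, not part of the lab. The `Model.deploy` call used in this lab performs these steps for you.

```python
def deploy_in_three_steps(sm_client, name, image_uri, model_data_url, role_arn,
                          instance_type="ml.m4.xlarge"):
    """Create a model, an endpoint configuration, and an endpoint, in that order."""
    # Step 1: tell SageMaker where the model artifact and inference image are
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # Step 2: describe which model to host and on what compute instances
    sm_client.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    )
    # Step 3: launch the HTTPS endpoint from that configuration
    sm_client.create_endpoint(EndpointName=name, EndpointConfigName=name)

# In the notebook, `sm_client` would be boto3.client('sagemaker').
```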

**Note:** In the code below, replace the `<LabDataBucket>` value with the bucket name value from the left side of the lab instructions.

In [None]:
from sagemaker.model import Model

# Get SageMaker execution role 
sagemaker_role = get_execution_role()

# Get the model location path from the S3 bucket
model_url='s3://<LabDataBucket>/image-classification-full-training/output/image-classification/output/model.tar.gz'

model = Model(image_uri=training_image, 
            model_data=model_url, 
            role=sagemaker_role)

cifar_classifier = model.deploy(initial_instance_count=1,
                            instance_type='ml.m4.xlarge',
                            endpoint_name='cifar-image-classification')

If you have a currently deployed endpoint, you can update the endpoint with the following command. To do this, uncomment the command, and replace `<endpoint_name>` with the name of your currently running endpoint. Wait until the endpoint is updated before running the next code cell. 

In [None]:
#cifar_classifier = cifar.deploy(endpoint_name = <endpoint_name>, 
#                                update_endpoint=True, 
#                                initial_instance_count = 1, 
#                                instance_type = 'ml.m4.xlarge')

To check whether the endpoint has been created or updated, use **boto3** to call `DescribeEndpoint`. Do not continue until the status changes to **InService**.

In [None]:
sm = boto3.client('sagemaker')

In [None]:
describe_endpoint = sm.describe_endpoint(EndpointName='cifar-image-classification')
print(f"The status of the endpoint is {describe_endpoint['EndpointStatus']} ")
describe_endpoint
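
Instead of rerunning the status check by hand, you could poll until the endpoint is ready. The sketch below is one way to do that, assuming a boto3 SageMaker client is passed in; the function name `wait_until_in_service` and its polling defaults are hypothetical. boto3 also offers a built-in waiter named `endpoint_in_service` that serves a similar purpose.

```python
import time

def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=60):
    """Poll DescribeEndpoint until the endpoint reports InService."""
    for _ in range(max_polls):
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        # Still creating or updating, so wait before asking again
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} was not ready in time")

# In the notebook, this would be called as:
#     wait_until_in_service(sm, 'cifar-image-classification')
```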

Now, you already have the validation data in the variable `val_data`. Import the raw image data `img_data` from Gluon, which you will use for prediction. Use this endpoint in two different ways to predict:
1. Predict with raw Gluon CIFAR-10 validation image data `img_data` when you are developing your model.
2. Predict with a URL when you deploy your model in your app and your app gets an image URL.

### Predict with raw Gluon CIFAR-10 validation image data

In [None]:
from mxnet import autograd, gluon, image, init, nd
from matplotlib.pylab import imshow

img_data = gluon.data.vision.CIFAR10(train=False)

label_dict = {0:"airplane", 1:"automobile", 2:"bird", 3:"cat", 4:"deer",
              5:"dog", 6:"frog", 7:"horse", 8:"ship", 9:"truck"
             }

The Amazon SageMaker endpoint predicts one image at a time. Choose the first image.

In [None]:
sample = img_data[0]
data = sample[0]
label = sample[1]

imshow(data.asnumpy())

Predict the image.

In [None]:
import cv2
import json
from sagemaker.predictor import Predictor

runtime= boto3.client('runtime.sagemaker')
endpoint = 'cifar-image-classification'

sample_imgs, sample_labels = img_data[:10]

for img, label in zip(sample_imgs, sample_labels):
    # Gluon returns RGB images, but OpenCV encodes assuming BGR channel order
    bgr = cv2.cvtColor(img.asnumpy(), cv2.COLOR_RGB2BGR)
    payload = cv2.imencode('.jpeg', bgr)[1].tobytes()
    response = runtime.invoke_endpoint(EndpointName=endpoint,
                                       ContentType='application/x-image',
                                       Body=payload)
    # The response body is a JSON list with one probability per class
    pred = int(np.argmax(json.loads(response['Body'].read())))
    label = int(label)

    print(f"Prediction: {pred}-{label_dict[pred]}, True Label: {label}-{label_dict[label]}")

### Predict with a URL

While you are developing your application, a predictor's `predict` method is convenient. However, when you deploy your model in your application, you need to use the boto3 `sagemaker-runtime` client and call the `invoke_endpoint` API to get the predictions.

**Note:** `response['Body'].read()` can only be called once each time you call the `invoke_endpoint` API.
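
The read-once behavior is typical of streams, so a common pattern is to read the body a single time and keep the parsed result. The sketch below demonstrates this with an in-memory `io.BytesIO` stream standing in for the real response body; the helper `top_prediction`, the shortened label dictionary, and the probabilities are hypothetical.

```python
import io
import json

def top_prediction(body, labels):
    """Read the stream once and return (class index, class name, probability)."""
    probabilities = json.loads(body.read())
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best, labels[best], probabilities[best]

# Stand-in for response['Body']: a stream holding a JSON list of probabilities
labels = {0: "airplane", 1: "automobile", 2: "bird"}
fake_body = io.BytesIO(json.dumps([0.1, 0.2, 0.7]).encode("utf-8"))

result = top_prediction(fake_body, labels)
leftover = fake_body.read()  # Empty, because the stream was already consumed
print(result, leftover)
```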

In [None]:
import requests
from IPython.display import Image

urls = 'https://cdn.pixabay.com/photo/2013/06/08/04/17/ferry-boat-123059__340.jpg'

display(Image(requests.get(urls).content))


payload = requests.get(urls).content

ENDPOINT_NAME = 'cifar-image-classification'
runtime= boto3.client('runtime.sagemaker')
response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='application/x-image',
                                       Body=payload)

pred = np.argmax(json.loads(response['Body'].read()))
label_dict[pred]

## Lab complete

Congratulations! You have completed this lab. To clean up your lab environment, do the following:

- Close this notebook file.
- Log out of Jupyter Notebook by clicking **Quit**. Then, close the tab.
- Log out of the AWS Management Console by clicking the user name at the top of the console, and then clicking **Sign Out**.
- End the lab session by clicking **End Lab**.