# Part 2: Deploy a model trained using SageMaker distributed data parallel

To deploy the model you previously trained, you need to create a SageMaker endpoint, a hosted prediction service that you can use to perform inference.

## Finding the model

This notebook uses a stored model if one exists. If you recently ran a training example that uses the `%store` magic, the model location is restored in the cells below.

Otherwise, the notebook downloads a trained model artifact from a public bucket and uploads it to the default S3 bucket for the AWS Region in which you run this notebook.

In [1]:
import boto3
import os
import sagemaker


def copy_model_from_public_bucket():
    """Copy a trained model artifact to your default bucket and return its S3 URI."""
    s3 = boto3.client("s3")
    public_bucket = "sagemaker-sample-files"
    key = "datasets/image/MNIST/model/model.tar.gz"
    with open(os.path.join("/tmp", "model.tar.gz"), "wb") as f:
        s3.download_fileobj(public_bucket, key, f)

    # upload to your default bucket
    default_bucket = sagemaker.Session().default_bucket()
    with open(os.path.join("/tmp", "model.tar.gz"), "rb") as f:
        s3.upload_fileobj(f, default_bucket, key)
    return "s3://" + default_bucket + "/" + key

In [2]:
# Retrieve a saved model from a previous notebook run's stored variable
%store -r model_data

try:
    model_data
except NameError:
    # If no model was found, set it manually here.
    model_data = copy_model_from_public_bucket()

print("Using this model: {}".format(model_data))

Using this model: s3://sagemaker-us-west-2-688520471316/datasets/image/MNIST/model/model.tar.gz


In [12]:
!aws s3 cp s3://sagemaker-us-west-2-688520471316/datasets/image/MNIST/model/model.tar.gz .

download: s3://sagemaker-us-west-2-688520471316/datasets/image/MNIST/model/model.tar.gz to ./model.tar.gz


In [14]:
!tar -xf model.tar.gz

tar: Removing leading `/' from member names
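
The `tar` warning above indicates that the archive stores its members under absolute paths. When an endpoint later fails its health check, the archive layout is one thing worth ruling out. The sketch below lists the archive's members and reports which directories contain a `saved_model.pb`. It assumes, without confirmation from this notebook, that the serving container looks for a numeric version directory holding `saved_model.pb`; the helper names `list_archive_members`, `find_saved_models`, and `has_absolute_members` are hypothetical and not part of any SDK.

```python
import io
import tarfile


def list_archive_members(fileobj):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


def find_saved_models(member_names):
    """Return the directories that directly contain a saved_model.pb file."""
    return sorted(
        name.rsplit("/", 1)[0] if "/" in name else ""
        for name in member_names
        if name.rsplit("/", 1)[-1] == "saved_model.pb"
    )


def has_absolute_members(member_names):
    """Return True if any member name starts with a leading slash."""
    return any(name.startswith("/") for name in member_names)


# Example usage against the archive downloaded above (not executed here):
# with open("model.tar.gz", "rb") as f:
#     names = list_archive_members(f)
# print(find_saved_models(names), has_absolute_members(names))
```

If `find_saved_models` returns a directory that does not end in a version number, or `has_absolute_members` returns `True`, repackaging the archive with relative paths may be worth trying before redeploying.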


## Create a model object

You define the model object by using the SageMaker SDK's `TensorFlowModel` and pass in the S3 location of the model artifact (`model_data`), the execution `role`, and the `framework_version`. No `entry_point` script is passed here, so the container serves the model with its default request handling.

In [11]:
import sagemaker
from sagemaker.tensorflow import TensorFlowModel

role = sagemaker.get_execution_role()

model = TensorFlowModel(model_data=model_data, role=role, framework_version="2.3")

### Deploy the model on an endpoint

You create a `predictor` by calling `model.deploy`. You can optionally change both the instance count and the instance type.

In [7]:
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


---------------------------------------------*

UnexpectedStatusException: Error hosting endpoint tensorflow-inference-2021-05-28-21-18-37-294: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..

In [10]:
model.endpoint_name

'tensorflow-inference-2021-05-28-21-18-37-294'
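
The error above points to the CloudWatch logs for the endpoint. As a minimal sketch, the logs can be read with `boto3` using the endpoint name shown above. SageMaker writes endpoint logs to the log group `/aws/sagemaker/Endpoints/<endpoint name>`. The helper names `endpoint_log_group` and `tail_endpoint_logs` are hypothetical, and the `logs_client` parameter exists only so that a client can be injected.

```python
def endpoint_log_group(endpoint_name):
    """Build the CloudWatch log group name used for a SageMaker endpoint."""
    return "/aws/sagemaker/Endpoints/" + endpoint_name


def tail_endpoint_logs(endpoint_name, logs_client=None, limit=50):
    """Return recent log messages from every log stream of the endpoint."""
    if logs_client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        logs_client = boto3.client("logs")

    group = endpoint_log_group(endpoint_name)
    streams = logs_client.describe_log_streams(logGroupName=group)["logStreams"]

    messages = []
    for stream in streams:
        # With the default startFromHead=False, the latest events are returned.
        events = logs_client.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],
            limit=limit,
        )["events"]
        messages.extend(event["message"] for event in events)
    return messages


# Example usage in this notebook (requires AWS credentials, not executed here):
# for line in tail_endpoint_logs(model.endpoint_name):
#     print(line)
```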

## Cleanup

If you don't intend to try out inference or to do anything else with the endpoint, you should delete the endpoint.

In [None]:
# predictor.delete_endpoint()
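
Because `model.deploy` raised an exception in the run above, `predictor` may never have been assigned, in which case the call above would fail with a `NameError`. A minimal sketch of an alternative is to delete the endpoint by name with the `boto3` SageMaker client. The helper name `delete_endpoint_by_name` is hypothetical, and the `sm_client` parameter exists only so that a client can be injected.

```python
def delete_endpoint_by_name(endpoint_name, sm_client=None):
    """Delete an endpoint and its endpoint configuration, identified by name."""
    if sm_client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        sm_client = boto3.client("sagemaker")

    # Look up the configuration before the endpoint itself is removed.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name


# Example usage in this notebook (requires AWS credentials, not executed here):
# delete_endpoint_by_name(model.endpoint_name)
```

This sketch leaves the SageMaker model resource in place; remove it separately if you no longer need it.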