# SageMaker endpoint

To deploy the model you previously trained, you need to create a SageMaker endpoint. This is a hosted prediction service that you can use to perform inference.

## Finding the model

This notebook uses a stored model if one exists. If you recently ran a training example that uses the `%store` magic, the model location is restored in a cell below.

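Otherwise, you can set the `model_data` variable to the S3 URI of the model artifact (a `.tar.gz` file). If no stored value is found, the notebook falls back to copying a sample MNIST model from a public bucket into your default bucket.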
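
As a rough sketch of what a valid `model_data` value looks like, the helper below splits an S3 URI into bucket and key and checks for the `.tar.gz` suffix. The function name `split_model_uri` and the example bucket name are hypothetical; neither is part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_model_uri(uri):
    """Return (bucket, key) for an S3 model URI, or raise ValueError.

    Hypothetical helper for illustration; not part of the SageMaker SDK.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("Expected an s3://bucket/key URI, got: {}".format(uri))
    key = parsed.path.lstrip("/")
    if not key.endswith(".tar.gz"):
        raise ValueError("Model artifact should be a .tar.gz file: {}".format(uri))
    return parsed.netloc, key


# Example with a made-up bucket name
bucket, key = split_model_uri("s3://my-example-bucket/models/mnist/model.tar.gz")
print(bucket, key)
```

A check like this can catch a mistyped URI before the slower deployment step fails on it.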

In [None]:
import boto3
import os
import sagemaker

def copy_model_from_public_bucket():
    """Copy a trained model artifact to your default bucket and return its S3 URI."""
    s3 = boto3.client('s3')
    public_bucket = "sagemaker-sample-files"
    key = "datasets/image/MNIST/model/model.tar.gz"
    local_path = os.path.join('/tmp', 'model.tar.gz')

    # Download the sample artifact from the public bucket
    with open(local_path, 'wb') as f:
        s3.download_fileobj(public_bucket, key, f)

    # Upload it to your default bucket under the same key
    default_bucket = sagemaker.Session().default_bucket()
    with open(local_path, 'rb') as f:
        s3.upload_fileobj(f, default_bucket, key)
    return 's3://' + default_bucket + '/' + key

In [None]:
# Retrieve a saved model from a previous notebook run's stored variable
%store -r model_data

try:
    model_data
except NameError:
    # If no model was found, set it manually here.
    model_data = copy_model_from_public_bucket()

print("Using this model: {}".format(model_data))

## Create a model object

Define the model object with the SageMaker SDK's `TensorFlowModel`, passing the location of the model artifact (`model_data`), the execution `role`, and the TensorFlow `framework_version` used by the serving container.

In [None]:
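import sagemaker
from sagemaker.tensorflow import TensorFlowModel

# IAM role that SageMaker assumes to access the model artifact and host the endpoint
role = sagemaker.get_execution_role()

# Wrap the trained artifact in a deployable model object
model = TensorFlowModel(model_data=model_data, role=role, framework_version='2.3')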

## Deploy the model on an endpoint

You create a `predictor` by calling the `model.deploy` method. You can optionally change the instance count and the instance type.

In [None]:
predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
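
Once the endpoint is in service, you can send it a request. The sketch below only builds and checks a request body in the TensorFlow Serving row format (`{"instances": [...]}`). The 28x28x1 input shape is an assumption based on the MNIST sample model, and `build_request` is a hypothetical helper. The call to `predictor.predict` is left commented out because it needs the live endpoint.

```python
import json
import random


def build_request(num_images=1, height=28, width=28, channels=1):
    """Build a TensorFlow Serving style request with random pixel values.

    Hypothetical helper; the input shape is an assumption about the model.
    """
    images = [
        [[[random.random() for _ in range(channels)] for _ in range(width)]
         for _ in range(height)]
        for _ in range(num_images)
    ]
    return {"instances": images}


request = build_request()
body = json.dumps(request)  # confirm the payload is JSON-serializable

# With a deployed endpoint (assumes `predictor` from the cell above):
# response = predictor.predict(request)
# print(response["predictions"])
```

If your model expects a different input shape, adjust the `height`, `width`, and `channels` arguments to match.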

## Cleanup

If you don't intend to try out inference or do anything else with the endpoint, delete it so that it stops incurring charges.

In [None]:
predictor.delete_endpoint()
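
One way to make sure the endpoint is always deleted, even if inference code raises an exception, is to wrap deployment in a context manager. This is a sketch rather than an SDK feature: `temporary_endpoint` is a hypothetical helper that relies only on `model.deploy` and `predictor.delete_endpoint` as used above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(model, **deploy_kwargs):
    """Deploy `model`, yield the predictor, and delete the endpoint on exit.

    Hypothetical helper; not part of the SageMaker SDK.
    """
    predictor = model.deploy(**deploy_kwargs)
    try:
        yield predictor
    finally:
        # Runs whether or not the body of the `with` block raised
        predictor.delete_endpoint()


# Usage (assumes `model` from the cells above):
# with temporary_endpoint(model, initial_instance_count=1,
#                         instance_type='ml.m4.xlarge') as predictor:
#     ...  # run inference here
```

This pattern suits short experiments. For an endpoint that should stay up between sessions, the explicit `predictor.delete_endpoint()` call is the simpler choice.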