# Deploy an MLflow model with SageMaker

## Intro

We are finally ready to deploy an MLflow model to a SageMaker hosted endpoint, where it can be consumed for online predictions.

## Install dependencies

In [None]:
!pip install -q --upgrade pip
!pip install -q --upgrade sagemaker==2.63.1
!pip install -q --upgrade mlflow==1.18.0

## Setup environment

In [None]:
import os
import pandas as pd
import json
import random

import sagemaker
from sagemaker.tuner import IntegerParameter, HyperparameterTuner
from sagemaker.sklearn.estimator import SKLearn
import boto3

import mlflow
import mlflow.sagemaker
from mlflow.tracking.client import MlflowClient

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = sess.default_bucket()
region = sess.boto_region_name
account = role.split("::")[1].split(":")[0]
tracking_uri = os.environ['MLFLOWSERVER']
experiment_name = 'california-housing'
model_name = 'california-housing-model'

print('SageMaker role name: {}'.format(role.split("/")[-1]))
print('Account: {}'.format(account))
print('bucket: {}'.format(bucket))
print("Using AWS Region: {}".format(region))
print("MLflow server: {}".format(tracking_uri))
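
The setup cell derives the account ID from the execution role ARN by splitting on `::`. As a minimal sketch of the same idea, the helper below reads the fifth colon-separated field of the ARN and fails loudly on malformed input. The function name `account_id_from_arn` and the example ARN are hypothetical and used only for illustration.

```python
def account_id_from_arn(arn: str) -> str:
    """Return the account ID embedded in an IAM ARN.

    ARNs follow the layout arn:partition:service:region:account-id:resource,
    so the account ID is the fifth colon-separated field.
    """
    parts = arn.split(":")
    if len(parts) < 6 or not parts[4].isdigit():
        raise ValueError(f"Not a valid IAM ARN: {arn!r}")
    return parts[4]


# Hypothetical role ARN, for illustration only
example_role = "arn:aws:iam::123456789012:role/service-role/ExampleSageMakerRole"
print(account_id_from_arn(example_role))
```

In the notebook this could stand in for the `role.split(...)` expression, for example `account = account_id_from_arn(role)`.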

## Build MLflow docker image to serve the model with SageMaker

We first need to build a new MLflow SageMaker image, assign it a name, and push it to ECR.

The `mlflow sagemaker build-and-push-container` command does exactly that. It first builds an MLflow Docker image locally, which requires Docker to be running. It then pushes the image to ECR in the currently active AWS account and region. More information on this command can be found in the official [MLflow CLI documentation for SageMaker](https://www.mlflow.org/docs/latest/cli.html#mlflow-sagemaker).

In [None]:
!mlflow sagemaker build-and-push-container

In [None]:
# URL of the ECR-hosted Docker image the model should be deployed into: make sure to include the tag 1.18.0
image_uri = "{}.dkr.ecr.{}.amazonaws.com/mlflow-pyfunc:1.18.0".format(account, region)
print("image URI: {}".format(image_uri))
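
The image URI above is assembled from the account ID, the region, a repository name, and a tag. The sketch below wraps that in a small helper so the pieces are explicit. The helper name `build_image_uri` is hypothetical, and the defaults assume the repository is `mlflow-pyfunc` and the tag matches the pinned MLflow version (1.18.0), as in this notebook.

```python
def build_image_uri(account: str, region: str,
                    repository: str = "mlflow-pyfunc",
                    tag: str = "1.18.0") -> str:
    """Assemble the ECR image URI from its components.

    The defaults mirror the values used in this notebook; adjust the tag
    if a different MLflow version was used to build the container.
    """
    return f"{account}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Hypothetical account ID and region, for illustration only
print(build_image_uri("123456789012", "eu-west-1"))
```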

## Deploy a SageMaker endpoint with our scikit-learn model

We first need to find the best-performing model stored in MLflow. Once it has been identified, we register it in the Model Registry and then deploy it to a SageMaker managed endpoint via the MLflow SDK. More information can be found in the [`mlflow.sagemaker` API documentation](https://www.mlflow.org/docs/latest/python_api/mlflow.sagemaker.html).

In [None]:
mlflow.set_tracking_uri(tracking_uri)
mlflow.set_experiment(experiment_name)
client = MlflowClient()

experiment = mlflow.get_experiment_by_name(experiment_name)
experiment_id = experiment.experiment_id

# Get the best run according to the objective metric:
# sort ascending by the median absolute error and keep only the first result
run = client.search_runs(
    experiment_ids=[experiment_id],
    filter_string="",
    max_results=1,
    order_by=["metrics.`AE-at-50th-percentile` ASC"]
)[0]

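from mlflow.exceptions import MlflowException

try:
    client.create_registered_model(model_name)
except MlflowException:
    # create_registered_model raises if the name is already taken
    print("Registered model already exists")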

model_version = client.create_model_version(
    name=model_name,
    source="{}/model".format(run.info.artifact_uri),
    run_id=run.info.run_id
)

print("model_version: {}".format(model_version))
model_uri = model_version.source
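
To make the selection logic concrete: ordering by the metric in ascending order and keeping one result amounts to taking the run with the smallest `AE-at-50th-percentile`. The sketch below reproduces that on plain dictionaries, without an MLflow server. The run IDs and metric values are made up, and skipping runs that lack the metric is a simplifying assumption of this sketch, not necessarily how `search_runs` treats them.

```python
def best_run(runs, metric, ascending=True):
    """Return the run with the lowest (or highest) value of `metric`.

    Runs that did not log the metric are skipped (an assumption of this sketch).
    """
    scored = [r for r in runs if metric in r["metrics"]]
    if not scored:
        raise ValueError(f"No run logged the metric {metric!r}")
    pick = min if ascending else max
    return pick(scored, key=lambda r: r["metrics"][metric])


# Made-up runs, for illustration only
example_runs = [
    {"run_id": "a1", "metrics": {"AE-at-50th-percentile": 0.41}},
    {"run_id": "b2", "metrics": {"AE-at-50th-percentile": 0.33}},
    {"run_id": "c3", "metrics": {}},
]
print(best_run(example_runs, "AE-at-50th-percentile")["run_id"])
```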

In [None]:
endpoint_name = 'california-housing'

mlflow.sagemaker.deploy(
    mode='create',
    app_name=endpoint_name,
    model_uri=model_uri,
    image_url=image_uri,
    execution_role_arn=role,
    instance_type='ml.m5.xlarge',
    instance_count=1,
    region_name=region
)
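
Depending on its `synchronous` argument, `mlflow.sagemaker.deploy` may already block until the endpoint has been created. If you want to confirm the endpoint state yourself before sending traffic, one option is to poll `describe_endpoint` on a SageMaker client. The sketch below is one way to do that. The function name `wait_for_endpoint`, the polling interval, and the timeout are assumptions chosen for illustration; the client is passed in so the function can be exercised without AWS access.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name,
                      poll_seconds=30, timeout_seconds=1800,
                      sleep=time.sleep):
    """Poll the endpoint until it is InService, fails, or the timeout elapses."""
    waited = 0
    while True:
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Endpoint {endpoint_name} still {status} after {waited}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds


# Possible usage in this notebook (requires AWS credentials):
# wait_for_endpoint(boto3.client("sagemaker", region_name=region), endpoint_name)
```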

## Predict

We are now ready to make predictions against the endpoint.

In [None]:
# Load the California housing test set
data = pd.read_csv('./california_test.csv')
df_y = data[['target']]
# Drop the target and the columns that are not sent to the model
df = data.drop(['Latitude', 'Longitude', 'target'], axis=1)
runtime = boto3.client('sagemaker-runtime')

for _ in range(10):
    # Randomly pick a row to test the prediction
    index = random.randrange(len(df_y))
    # Serialize the row in the pandas split-oriented JSON format
    payload = df.iloc[[index]].to_json(orient="split")
    y = df_y['target'].iloc[index]
    runtime_response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=payload
    )
    prediction = json.loads(runtime_response['Body'].read().decode())
    print(f'Payload: {payload}')
    print(f'Actual value: {y}')
    print(f'Prediction: {prediction[0]}')
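
The request body sent to the endpoint is the split-oriented JSON produced by `to_json(orient="split")`: an object with `columns`, `index`, and `data` keys. The sketch below builds the same shape with the standard library and parses a response body the way the loop above does. The helper names, the feature names, and the numeric values are illustrative assumptions; the real columns come from `california_test.csv`.

```python
import json


def to_split_payload(columns, row, index=0):
    """Build a single-row payload in the pandas split-oriented JSON layout."""
    if len(columns) != len(row):
        raise ValueError("columns and row must have the same length")
    return json.dumps(
        {"columns": list(columns), "index": [index], "data": [list(row)]}
    )


def parse_prediction(body: bytes):
    """Decode a JSON list of predictions and return the first element."""
    return json.loads(body.decode("utf-8"))[0]


# Assumed feature names and made-up values, for illustration only
features = ["MedInc", "HouseAge", "AveRooms", "AveBedrms", "Population", "AveOccup"]
row = [3.5, 25.0, 5.2, 1.1, 980.0, 2.9]
print(to_split_payload(features, row))
```

Building the payload by hand like this can be handy when calling the endpoint from a client that does not have pandas installed.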

## Delete endpoint

To avoid unwanted costs, delete the endpoint once you have finished testing it.

In [None]:
mlflow.sagemaker.delete(app_name=endpoint_name, region_name=region)