# Ray Serve - Integration with Model Registry MLflow

© 2019-2022, Anyscale. All Rights Reserved

This tutorial example shows how to deploy models saved in a model registry such as MLflow to Ray Serve, using the simple Ray Serve deployment APIs. 

### Learning Objective:
In this tutorial, you will learn how to:

 * Integrate with model registeries like [MLflow](https://mlflow.org/)
 * Train a scikit-learn classification model
 * Use MLflow `autolog()` method to automatically log all metrics, parameters, artifacts, and the model
 * Create a deployment class and deploy the model for serving from MLflow model artifacts
 * Deploy and serve the model
 
<img src="images/serve_mlflow.png" height="50%" width="100%">

You can peruse the saved models' metrics, parameters, and artifacts in MLflow ui.



In [3]:
import json
import numpy as np
import pandas as pd
import requests
import os
import tempfile

from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from mlflow.tracking import MlflowClient

from ray import serve
import mlflow
import matplotlib

Define a utility function:
 * create Iris data set
 * use a classifier
 * train and fit model
 * track all experiments using MLflow `autolog(...)` method

### Step 1 & 2: Train the scikit-learn model and log to MLflow

In [4]:
def create_and_save_model():
    # load Iris data
    iris_data = load_iris()
    data, target, target_names = (iris_data['data'],
                                  iris_data['target'],
                                  iris_data['target_names'])

    # Instantiate a model
    model = GradientBoostingClassifier()

    # Training and validation split
    np.random.shuffle(data), np.random.shuffle(target)
    train_x, train_y = data[:100], target[:100]
    val_x, val_y = data[100:], target[100:]

    # Create labels list as file
    LABEL_PATH = os.path.join(tempfile.gettempdir(), "iris_labels.json")
    with open(LABEL_PATH, "w") as f:
        json.dump(target_names.tolist(), f)

    # Train the model and save our label list as an MLflow artifact
    # mlflow.sklearn.autolog automatically logs all parameters and metrics during
    # the training.
    mlflow.sklearn.autolog()
    with mlflow.start_run() as run:
        model.fit(train_x, train_y)
        # Log label list as a artifact
        mlflow.log_artifact(LABEL_PATH, artifact_path="labels")
    return run.info.run_id

### Step 3: Create our Ray Serve deployment class and deploy it

In [5]:
@serve.deployment(route_prefix="/regressor")
class BoostingModel:
    def __init__(self, uri):
        # Load the model and label artifact from the local
        # Mlflow model registry as a PyFunc Model
        self.model = mlflow.pyfunc.load_model(model_uri=uri)

        # Download the artifact list of labels
        local_dir = tempfile.mkdtemp()
        client = MlflowClient()
        local_path = f"{client.download_artifacts(run_id, 'labels', local_dir)}/iris_labels.json"
        with open(local_path, "r") as f:
            self.label_list = json.load(f)

    async def __call__(self, starlette_request):
        payload = await starlette_request.json()
        print(f"Worker: received Starlette request with data: {payload}")

        # Get the input vector from the payload
        input_vector = [
            payload["sepal length"],
            payload["sepal width"],
            payload["petal length"],
            payload["petal width"],
        ]

        # Convert the input vector to a Pandas DataFrame for prediction since
        # an MLflow PythonFunc model, model.predict(...), takes pandas DataFrame
        prediction = self.model.predict(pd.DataFrame([input_vector]))[0]
        human_name = self.label_list[prediction]
        return {"result": human_name}


Train and save the model artifacts in MLflow.
Here our MLflow model registry is the local file directory `./mlruns`

In [6]:
run_id = create_and_save_model()
# Construct model uri to load the model from our model registry
uri = f"runs:/{run_id}/model"
print(uri)



runs:/6d0ef9f4095446058ea82c7007a2c8ae/model


In [7]:
# Start the Ray Serve instance
serve.start()

2022-06-21 14:27:48,080	INFO services.py:1470 -- View the Ray dashboard at [1m[32mhttp://127.0.0.1:8265[39m[22m
[2m[36m(ServeController pid=63301)[0m INFO 2022-06-21 14:27:52,269 controller 63301 checkpoint_path.py:17 - Using RayInternalKVStore for controller checkpoint and recovery.
[2m[36m(ServeController pid=63301)[0m INFO 2022-06-21 14:27:52,377 controller 63301 http_state.py:112 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:kEcuWS:SERVE_PROXY_ACTOR-node:127.0.0.1-0' on node 'node:127.0.0.1-0' listening on '127.0.0.1:8000'


<ray.serve.client.ServeControllerClient at 0x7f8a912ca9a0>

### Step 4: Deploy and serve the model

In [10]:
# Deploy our model.
print(uri)
BoostingModel.deploy(uri)

runs:/6d0ef9f4095446058ea82c7007a2c8ae/model


[2m[36m(ServeController pid=63301)[0m INFO 2022-06-21 14:28:21,078 controller 63301 deployment_state.py:1175 - Stopping 1 replicas of deployment 'BoostingModel' with outdated versions.
[2m[36m(ServeController pid=63301)[0m INFO 2022-06-21 14:28:23,261 controller 63301 deployment_state.py:1216 - Adding 1 replicas to deployment 'BoostingModel'.


Send requests

In [11]:
# Send in a request for labels types virginica, setosa, versicolor
sample_request_inputs = [{
    "sepal length": 6.3,
    "sepal width": 3.3,
    "petal length": 6.0,
    "petal width": 2.5
    }
]

In [12]:
for input_request in sample_request_inputs:
    response = requests.get("http://localhost:8000/regressor",
                            json=input_request)
    print(response.text)

{
  "result": "versicolor"
}
[2m[36m(BoostingModel pid=63395)[0m Worker: received Starlette request with data: {'sepal length': 6.3, 'sepal width': 3.3, 'petal length': 6.0, 'petal width': 2.5}


[2m[36m(HTTPProxyActor pid=63306)[0m INFO 2022-06-21 14:28:31,745 http_proxy 127.0.0.1 http_proxy.py:310 - GET /regressor 200 5.6ms
[2m[36m(BoostingModel pid=63395)[0m INFO 2022-06-21 14:28:31,744 BoostingModel BoostingModel#nDZvax replica.py:478 - HANDLE __call__ OK 1.7ms


### Launch the MLflow UI to see the metrics 

In [None]:
!mlflow ui 

### Exercise

1. Increase the number of replicas to 2 or 3
2. Add more samples to `sample_request_inputs`
2. Send requests and observe which replica is serving them. You should see each being used.

### Next

We will learn how you can compose complex model using [ServerHandle APIs](https://docs.ray.io/en/latest/serve/ml-models.html#model-ensemble).