# Ray Serve - Integration with Model Registry MLflow

Â© 2019-2022, Anyscale. All Rights Reserved

![Anyscale Academy](../images/AnyscaleAcademyLogo.png)

This tutorial example shows how to deploy models saved in a model registry such as MLflow to Ray Serve, using the simple Ray Serve deployment APIs. 

<img src="../images/serve_mlflow.png" height="50%" width="100%">

You can peruse the saved models' metrics, parameters, and artifacts in MLflow ui.

We are going to follow three simple steps:

1. Train a scikit-learn classification model
2. Use MLflow `autolog()` method to automatically logs all metrics, parameters, artifacts, and the model
3. Create a deployment class and deploy the model for serving

In [1]:
!pip install mlflow



In [2]:
import json
import numpy as np
import pandas as pd
import requests
import os
import tempfile

from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from mlflow.tracking import MlflowClient

from ray import serve
import mlflow

Define a utility function:
 * create Iris data set
 * use a classifier
 * train and fit model
 * track all experiments using MLflow `autolog(..)` method

In [6]:
def create_and_save_model():
    # load Iris data
    iris_data = load_iris()
    data, target, target_names = (iris_data['data'],
                                  iris_data['target'],
                                  iris_data['target_names'])

    # Instantiate a model
    model = GradientBoostingClassifier()

    # Training and validation split
    np.random.shuffle(data), np.random.shuffle(target)
    train_x, train_y = data[:100], target[:100]
    val_x, val_y = data[100:], target[100:]

    # Create labels list as file
    LABEL_PATH = os.path.join(tempfile.gettempdir(), "iris_labels.json")
    with open(LABEL_PATH, "w") as f:
        json.dump(target_names.tolist(), f)

    # Train the model and save our label list as an MLflow artifact
    # mlflow.sklearn.autolog automatically logs all parameters and metrics during
    # the training.
    mlflow.sklearn.autolog()
    with mlflow.start_run() as run:
        model.fit(train_x, train_y)
        # Log label list as a artifact
        mlflow.log_artifact(LABEL_PATH, artifact_path="labels")
    return run.info.run_id

Create our Ray Serve deployment class

In [7]:
@serve.deployment(route_prefix="/regressor")
class BoostingModel:
    def __init__(self, uri):
        # Load the model and label artifact from the local
        # Mlflow model registry as a PyFunc Model
        self.model = mlflow.pyfunc.load_model(model_uri=uri)

        # Download the artifact list of labels
        local_dir = "/tmp/artifact_downloads"
        if not os.path.exists(local_dir):
            os.mkdir(local_dir)
        client = MlflowClient()
        local_path = f"{client.download_artifacts(run_id, 'labels', local_dir)}/iris_labels.json"
        with open(local_path, "r") as f:
            self.label_list = json.load(f)

    async def __call__(self, starlette_request):
        payload = await starlette_request.json()
        print(f"Worker: received Starlette request with data: {payload}")

        # Get the input vector from the payload
        input_vector = [
            payload["sepal length"],
            payload["sepal width"],
            payload["petal length"],
            payload["petal width"],
        ]

        # Convert the input vector in a Pandas DataFrame for prediction since
        # an MLflow PythonFunc model, model.predict(...), takes pandas DataFrame
        prediction = self.model.predict(pd.DataFrame([input_vector]))[0]
        human_name = self.label_list[prediction]
        return {"result": human_name}


Train and save the model artifacts in MLflow.
Here our MLflow model registry is local file directory `./mlruns`

In [8]:
run_id = create_and_save_model()
# Construct model uri to load the model from our model registry
uri = f"runs:/{run_id}/model"



In [9]:
# Start the Ray Serve instance
serve.start()

2022-02-28 19:10:06,601	INFO services.py:1374 -- View the Ray dashboard at [1m[32mhttp://127.0.0.1:8266[39m[22m
[2m[36m(ServeController pid=35687)[0m 2022-02-28 19:10:09,225	INFO checkpoint_path.py:16 -- Using RayInternalKVStore for controller checkpoint and recovery.
[2m[36m(ServeController pid=35687)[0m 2022-02-28 19:10:09,332	INFO http_state.py:98 -- Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:HfbbXC:SERVE_PROXY_ACTOR-node:127.0.0.1-0' on node 'node:127.0.0.1-0' listening on '127.0.0.1:8000'
2022-02-28 19:10:09,581	INFO api.py:475 -- Started Serve instance in namespace 'serve'.


<ray.serve.api.Client at 0x7fe2e2533670>

[2m[36m(HTTPProxyActor pid=35684)[0m INFO:     Started server process [35684]


In [10]:
# Deploy our model.
BoostingModel.deploy(uri)

2022-02-28 19:10:16,666	INFO api.py:249 -- Updating deployment 'BoostingModel'. component=serve deployment=BoostingModel
[2m[36m(ServeController pid=35687)[0m 2022-02-28 19:10:16,730	INFO deployment_state.py:920 -- Adding 1 replicas to deployment 'BoostingModel'. component=serve deployment=BoostingModel
2022-02-28 19:10:18,955	INFO api.py:261 -- Deployment 'BoostingModel' is ready at `http://127.0.0.1:8000/regressor`. component=serve deployment=BoostingModel


In [11]:
# Send in a request for labels types virginica, setosa, versicolor
sample_request_inputs = [{
    "sepal length": 6.3,
    "sepal width": 3.3,
    "petal length": 6.0,
    "petal width": 2.5
    },
    {
    "sepal length": 5.1,
    "sepal width": 3.5,
    "petal length": 1.4,
    "petal width": 0.2
    },
    {
    "sepal length": 6.4,
    "sepal width": 3.2,
    "petal length": 4.5,
    "petal width": 1.5},
]

In [12]:
for input_request in sample_request_inputs:
    response = requests.get("http://localhost:8000/regressor",
                            json=input_request)
    print(response.text)

{
  "result": "versicolor"
}
[2m[36m(BoostingModel pid=35686)[0m Worker: received Starlette request with data: {'sepal length': 6.3, 'sepal width': 3.3, 'petal length': 6.0, 'petal width': 2.5}
[2m[36m(BoostingModel pid=35686)[0m Worker: received Starlette request with data: {'sepal length': 5.1, 'sepal width': 3.5, 'petal length': 1.4, 'petal width': 0.2}
{
  "result": "virginica"
}
{
  "result": "versicolor"
}
[2m[36m(BoostingModel pid=35686)[0m Worker: received Starlette request with data: {'sepal length': 6.4, 'sepal width': 3.2, 'petal length': 4.5, 'petal width': 1.5}


In [13]:
!mlflow ui 

[2022-02-28 19:11:59 -0800] [35991] [INFO] Starting gunicorn 20.1.0
[2022-02-28 19:11:59 -0800] [35991] [INFO] Listening at: http://127.0.0.1:5000 (35991)
[2022-02-28 19:11:59 -0800] [35991] [INFO] Using worker: sync
[2022-02-28 19:11:59 -0800] [35995] [INFO] Booting worker with pid: 35995
^C
[2022-02-28 19:13:50 -0800] [35991] [INFO] Handling signal: int
[2022-02-28 19:13:50 -0800] [35995] [INFO] Worker exiting (pid: 35995)


### Framework-Specific Tutorials

Ray Serve seamlessly integrates with popular Python ML libraries. Below are tutorials with some of these frameworks to help get you started.

 * [PyTorch Tutorial](https://docs.ray.io/en/latest/serve/tutorials/pytorch.html#serve-pytorch-tutorial)
 * [Scikit-Learn Tutorial](https://docs.ray.io/en/latest/serve/tutorials/sklearn.html#serve-sklearn-tutorial)
 * [Keras and Tensorflow Tutorial](https://docs.ray.io/en/latest/serve/tutorials/tensorflow.html#serve-tensorflow-tutorial)
 * [Ray Serve MLflow Deployment Pluggin](https://github.com/ray-project/mlflow-ray-serve)
