# Serve Ray AIR Predictor with `ModelWrapper`

[Ray Serve](rayserve.org) is the recommended tool for exposing Ray AIR checkpoints for live, interactive querying. 
The core concept is `ModelWrapper`, which takes a predictor class and a checkpoint and turns them into a live HTTP endpoint. 
Let's take a look at an example with a custom predictor. 
You can find end-to-end examples for specific frameworks on the [examples](air-examples-ref) page.

In this example, we will demonstrate:
- How to serve a predictor accepting array input.
- How to serve a predictor accepting dataframe input.
- How to serve a predictor accepting custom input that can be transformed to array or dataframe.
- How to configure micro-batching to enhance performance.

Let's first make sure Ray AIR is installed.

In [1]:
!pip install "ray[air]"



## Predictor accepting NumPy Array
We will use a simple predictor implementation that adds a scalar to the input array.

In [2]:
import numpy as np

from ray.ml.predictor import Predictor
from ray.ml.checkpoint import Checkpoint

class AdderPredictor(Predictor):
    def __init__(self, increment: int):
        self.increment = increment
    
    @classmethod
    def from_checkpoint(cls, ckpt: Checkpoint):
        return cls(ckpt.to_dict()["increment"])
    
    def predict(self, inp: np.ndarray) -> np.ndarray:
        return inp + self.increment

Let's first test it locally.

In [3]:
local_checkpoint = Checkpoint.from_dict({"increment": 2})
local_predictor = AdderPredictor.from_checkpoint(local_checkpoint)
assert local_predictor.predict(np.array([40])) == np.array([42])

It worked! Now let's serve it behind HTTP. For more about Ray Serve itself, check out
[its documentation](rayserve).

In [4]:
from ray import serve
from ray.serve.model_wrappers import ModelWrapperDeployment

# Create Ray Serve instance
serve.start()

# Deploy the model behind HTTP endpoint
ModelWrapperDeployment.options(name="Adder").deploy(
    predictor_cls=AdderPredictor,
    checkpoint=local_checkpoint
)

2022-05-19 19:36:25,720 INFO services.py:1483 -- View the Ray dashboard at http://127.0.0.1:8265
(ServeController pid=56600) INFO 2022-05-19 19:36:30,711 controller 56600 checkpoint_path.py:17 - Using RayInternalKVStore for controller checkpoint and recovery.
(ServeController pid=56600) INFO 2022-05-19 19:36:30,821 controller 56600 http_state.py:115 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:mFggjx:SERVE_PROXY_ACTOR-node:127.0.0.1-0' on node 'node:127.0.0.1-0' listening on '127.0.0.1:8000'
(HTTPProxyActor pid=56607) INFO:     Started server process [56607]
(ServeController pid=56600) INFO 2022-05-19 19:36:32,854 controller 56600 deployment_state.py:1217 - Adding 1 replicas to deployment 'Adder'.


As you can see, the core component is `ModelWrapperDeployment`. It requires two arguments to start:
- `predictor_cls (Type[Predictor] | str)`: The predictor Python class. Typically you can use a built-in integration from Ray AIR, such as `TorchPredictor`. Alternatively, you can pass the import path of the class as a string, such as `"ray.ml.integrations.torch.TorchPredictor"`.
- `checkpoint (Checkpoint | str)`: A checkpoint instance, or a URI to load the checkpoint from.
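
To make the string form of `predictor_cls` concrete, here is a minimal sketch of how a dotted class path can be resolved to a class with the standard library. The helper name `load_class` is hypothetical, and this is not Ray Serve's own import logic, which may differ. The example resolves a standard-library class so that it runs without Ray installed.

```python
import importlib


def load_class(class_path: str) -> type:
    """Resolve a dotted path such as "package.module.ClassName" to a class.

    Hypothetical helper for illustration only; not part of Ray Serve.
    """
    module_name, _, class_name = class_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.ClassName', got {class_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Resolve a standard-library class by its import path.
ordered_dict_cls = load_class("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Passing a string instead of the class object can be convenient when the deployment is described in configuration rather than in Python code.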

After the model has been deployed, let's send an HTTP request.

In [5]:
import requests
resp = requests.post("http://localhost:8000/Adder/", json={"array": [40]})
resp.raise_for_status()
resp.json()

[42.0]

That's it for arrays! You can specify a multi-dimensional array in the JSON payload, and use the `"dtype"` and `"shape"` fields to control how the payload is converted to an array. The schema for array input is available [here](serve-ndarray-schema).
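
As a sketch of such a payload, the snippet below builds a request body for a 2-by-3 array from a flat list. It assumes that `"array"` holds the values, that `"dtype"` is a NumPy dtype name, and that `"shape"` reshapes the values after they are loaded; check the linked schema for the authoritative field definitions. The helper `build_array_payload` is hypothetical and only validates that the shape matches the number of values.

```python
import json
import math
from typing import List, Optional


def build_array_payload(
    values: List[float],
    dtype: Optional[str] = None,
    shape: Optional[List[int]] = None,
) -> dict:
    """Build a JSON-serializable body with "array", "dtype" and "shape" fields.

    Hypothetical helper; the field semantics are assumptions (see the text above).
    """
    payload: dict = {"array": values}
    if dtype is not None:
        payload["dtype"] = dtype
    if shape is not None:
        # A reshape only makes sense if the element counts agree.
        if math.prod(shape) != len(values):
            raise ValueError(f"shape {shape} does not fit {len(values)} values")
        payload["shape"] = shape
    return payload


payload = build_array_payload([1, 2, 3, 4, 5, 6], dtype="float32", shape=[2, 3])
body = json.dumps(payload)
print(body)
```

With the `Adder` deployment running, this dictionary could be passed as `json=payload` to `requests.post`, in the same way as the single-element request earlier.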

## Predictor accepting Pandas DataFrame
Let's now take a look at a predictor accepting dataframe input. We will perform a simple column-wise transformation on the input data.

In [6]:
import pandas as pd


class DataFramePredictor(Predictor):
    def __init__(self, increment: int):
        self.increment = increment

    @classmethod
    def from_checkpoint(cls, ckpt: Checkpoint):
        return cls(ckpt.to_dict()["increment"])

    def predict(self, inp: pd.DataFrame) -> pd.DataFrame:
        inp["prediction"] = inp["base"] * inp["multipiler"] + self.increment
        return inp

local_df_predictor = DataFramePredictor.from_checkpoint(local_checkpoint)

[2m[36m(HTTPProxyActor pid=56607)[0m INFO 2022-05-19 19:36:35,950 http_proxy 127.0.0.1 http_proxy.py:320 - POST /Adder 200 15.9ms
[2m[36m(Adder pid=56615)[0m INFO 2022-05-19 19:36:35,949 Adder Adder#fCjrZL replica.py:483 - HANDLE __call__ OK 12.0ms


In [7]:
from ray.serve.http_adapters import pandas_read_json

ModelWrapperDeployment.options(name="DataFramePredictor").deploy(
    predictor_cls=DataFramePredictor,
    checkpoint=local_checkpoint,
    http_adapter=pandas_read_json
)

[2m[36m(ServeController pid=56600)[0m INFO 2022-05-19 19:36:36,175 controller 56600 deployment_state.py:1217 - Adding 1 replicas to deployment 'DataFramePredictor'.


In [10]:
resp = requests.post(
    "http://localhost:8000/DataFramePredictor/",
    json=[{"base": 1, "multipiler": 2}, {"base": 3, "multipiler": 4}],
    params={"orient": "records"},
)
resp.raise_for_status()
resp.text

'[{"base":1,"multipiler":2,"prediction":4},{"base":3,"multipiler":4,"prediction":14}]'

[2m[36m(HTTPProxyActor pid=56607)[0m INFO 2022-05-19 19:36:53,656 http_proxy 127.0.0.1 http_proxy.py:320 - POST /DataFramePredictor 200 9.9ms
[2m[36m(DataFramePredictor pid=56624)[0m INFO 2022-05-19 19:36:53,654 DataFramePredictor DataFramePredictor#VKWXkl replica.py:483 - HANDLE __call__ OK 7.0ms
