# Running an MLOps pipeline in MLServer

This example walks you through creating and serialising an [MLOps pipeline](https://github.com/SeldonIO/mlops), which can then be served through MLServer.
The pipeline can contain arbitrary custom Python code.

## Creating the pipeline

The first step will be to create our MLOps pipeline.

```python
from mlops.serve.serve import Model, SeldonRay
from mlops.seldon.servers import ServerType
import numpy as np

@SeldonRay(name="s1", endpoint="/endpoint")
class MyServer(object):
    def __init__(self):
        self.count = 0

    @Model(name="irissk",
           uri="gs://seldon-models/sklearn/iris",
           server_type=ServerType.sklearn_server,
           namespace="default")
    def iris_prediction_sklearn(self, data) -> np.array:
        return np.array(data)

    @Model(name="irisxgb",
           uri="gs://seldon-models/xgboost/iris",
           server_type=ServerType.xgboost_server,
           namespace="default")
    def iris_prediction_xgb(self, data) -> np.array:
        return np.array(data)

    def pipeline(self, request: np.array) -> np.array:
        res1 = self.iris_prediction_sklearn(request)
        if res1[0][0] > 0.7:
            return res1
        else:
            return self.iris_prediction_xgb(request)

pipeline = MyServer()

# Mock input data to downstream models
# pipeline.iris_prediction_sklearn.set_prediction(np.array([[0.9, 2, 3]]))
# pipeline.iris_prediction_xgb.set_prediction(np.array([[4, 5, 6]]))
```
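The routing rule inside `pipeline` can be tried out without deploying anything. The following is a minimal sketch, not part of the `mlops` library: `route` and the three stub functions are hypothetical names, and the stub outputs simply reuse the mock values from the commented-out lines above.

```python
from typing import Callable, List

Matrix = List[List[float]]


def route(request: Matrix,
          primary: Callable[[Matrix], Matrix],
          fallback: Callable[[Matrix], Matrix],
          threshold: float = 0.7) -> Matrix:
    """Keep the primary result if its first value exceeds the threshold,
    otherwise ask the fallback model (same rule as MyServer.pipeline)."""
    res1 = primary(request)
    if res1[0][0] > threshold:
        return res1
    return fallback(request)


# Hypothetical stubs standing in for the two deployed models.
def confident_sklearn(_request: Matrix) -> Matrix:
    return [[0.9, 2.0, 3.0]]


def unsure_sklearn(_request: Matrix) -> Matrix:
    return [[0.2, 2.0, 3.0]]


def xgb(_request: Matrix) -> Matrix:
    return [[4.0, 5.0, 6.0]]
```

Calling `route(x, confident_sklearn, xgb)` keeps the first model's answer, while `route(x, unsure_sklearn, xgb)` falls through to the second model.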

This pipeline can then be serialised using `cloudpickle`.

```python
import cloudpickle

with open("mlops-pipeline.pickle", 'wb') as pipeline_file:
    # Explicitly use Pickle's protocol 4 for compatibility with Python 3.7
    cloudpickle.dump(pipeline, pipeline_file, protocol=4)
```
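To confirm which protocol a pickle was written with, you can inspect its header: pickles using protocol 2 or later start with a `PROTO` opcode (`0x80`) followed by the protocol number. The sketch below uses the standard-library `pickle` module rather than `cloudpickle`, with a plain dictionary standing in for the pipeline; `pickle_protocol` is a hypothetical helper name.

```python
import io
import pickle


def pickle_protocol(payload: bytes) -> int:
    """Return the protocol version declared in a pickle header (protocol 2+)."""
    if len(payload) < 2 or payload[0] != 0x80:
        raise ValueError("no PROTO opcode: protocol 0 or 1, or not a pickle")
    return payload[1]


# A plain dict stands in for the pipeline object in this sketch.
buffer = io.BytesIO()
pickle.dump({"count": 0}, buffer, protocol=4)
payload = buffer.getvalue()
```

Here `pickle_protocol(payload)` reports the protocol that was requested in `pickle.dump`. The same check should apply to the bytes of `mlops-pipeline.pickle`, assuming it was written with `protocol=4` as above.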

## Serving the pipeline

Once our pipeline is created and serialised, we can create a `model-settings.json` file.
This file holds the settings specific to our MLOps pipeline: the model name, the runtime implementation, and the path to the serialised pipeline.

```python
%%writefile ./model-settings.json
{
    "name": "mlops-pipeline",
    "implementation": "mlserver_mlops.MLOpsModel",
    "parameters": {
        "uri": "./mlops-pipeline.pickle"
    }
}
```

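Outside a notebook, where the `%%writefile` magic is not available, the same file can be produced with the standard library. This is a minimal sketch: `write_model_settings` and `load_model_settings` are hypothetical helper names, and the check for the three keys reflects only the fields used in this example, not MLServer's full settings schema.

```python
import json
from pathlib import Path


def write_model_settings(folder: Path, name: str,
                         implementation: str, uri: str) -> Path:
    """Write a model-settings.json with the fields used in this example."""
    settings = {
        "name": name,
        "implementation": implementation,
        "parameters": {"uri": uri},
    }
    path = folder / "model-settings.json"
    path.write_text(json.dumps(settings, indent=4))
    return path


def load_model_settings(path: Path) -> dict:
    """Read the file back and check that the fields used here are present."""
    settings = json.loads(path.read_text())
    for key in ("name", "implementation", "parameters"):
        if key not in settings:
            raise KeyError(f"model-settings.json is missing '{key}'")
    return settings
```

For this example you would call `write_model_settings(Path("."), "mlops-pipeline", "mlserver_mlops.MLOpsModel", "./mlops-pipeline.pickle")`.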

### Start serving our model

Now that we have our config in place, we can start the server by running `mlserver start .`. This command needs to be run either from the directory that contains our config files or with the path to that directory as its argument.

```shell
mlserver start .
```

Since this command starts the server and blocks the terminal while it waits for requests, it needs to be run in the background or in a separate terminal.
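When the server runs in the background, it helps to wait until it is ready before sending requests. The sketch below polls the readiness endpoint defined by the V2 inference protocol (`/v2/health/ready`); `wait_until_ready` is a hypothetical helper, and the timeout and polling interval are arbitrary choices.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str, timeout_s: float = 30.0,
                     interval_s: float = 0.5) -> bool:
    """Poll <base_url>/v2/health/ready until it answers 200 or time runs out."""
    url = f"{base_url}/v2/health/ready"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not up yet (connection refused, timeout, or error status).
            pass
        time.sleep(interval_s)
    return False
```

With the port used later in this example, the call would be `wait_until_ready("http://localhost:8080")`.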

### Send test inference request

We now have our model being served by `mlserver`.
To make sure that everything is working as expected, let's send a request.

For that, we can use the Python types that `mlserver` provides out of the box, or we can build our request manually.

```python
import requests

x_0 = np.array([[0.1, 3.1, 1.5, 0.2]])
inference_request = {
    "inputs": [
        {
          "name": "predict",
          "shape": x_0.shape,
          "datatype": "FP32",
          "data": x_0.tolist()
        }
    ]
}

endpoint = "http://localhost:8080/v2/models/mlops-pipeline/infer"
response = requests.post(endpoint, json=inference_request)

response.json()
```

JSONDecodeError: Expecting value: line 1 column 3 (char 2)
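A `JSONDecodeError` like the one above means the response body was not valid JSON, which usually points to an error response rather than a prediction. One way to surface the underlying problem is to look at the status code and raw body before decoding. The helper below is a hypothetical sketch using only the standard library; it is not part of `mlserver` or `requests`.

```python
import json


def decode_inference_response(status_code: int, body: str) -> dict:
    """Decode an inference response, reporting the raw body when it is not usable."""
    if status_code != 200:
        raise RuntimeError(
            f"inference request failed with HTTP {status_code}: {body[:200]!r}"
        )
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"server returned a non-JSON body: {body[:200]!r}"
        ) from exc
```

With the `response` object from the request above, this would be called as `decode_inference_response(response.status_code, response.text)`.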