# Deploying Models

## Introduction

> Once a model has been saved, it can be deployed to a variety of targets:
- [locally](https://www.mlflow.org/docs/latest/models.html#deploy-mlflow-models) with a REST API (either inside a `docker` container or a `conda` environment).
- [Microsoft's Azure ML](https://www.mlflow.org/docs/latest/models.html#deploy-a-python-function-model-on-microsoft-azure-ml).
- [Amazon SageMaker](https://www.mlflow.org/docs/latest/models.html#deploy-a-python-function-model-on-amazon-sagemaker).
- [Apache Spark UDF](https://www.mlflow.org/docs/latest/models.html#export-a-python-function-model-as-an-apache-spark-udf).
- Other services, maintained by community-driven deployment plugins (e.g. `torchserve`). Check [here](https://www.mlflow.org/docs/latest/plugins.html#deployment-plugins) for more information.

The `mlflow models` commands are shown below.

In [None]:
!mlflow models --help

## The build-docker Subcommand

> This subcommand creates a docker image and places the model inside it.

Thereafter, the model can be served by running the created image. By default, the container exposes port `8080`, which can be mapped to any port on the host.

To view more information on this subcommand, run the following:

In [None]:
!mlflow models build-docker --help

__The default flavor is `python_function`, and it is compatible with every specific integration__ (more details are provided [here](https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html)).
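As a rough illustration of the build-then-run flow and the port mapping described above, the sketch below only assembles the two command lines and prints them; it does not execute anything. The model URI `models:/example-model/1`, the image name `example-model-image`, and the host port `5001` are hypothetical placeholders, not values from this tutorial.

```python
import shlex


def docker_commands(model_uri, image_name, host_port, container_port=8080):
    """Return the build and run command lines as argument lists."""
    # Build an image that contains the model (hypothetical URI and image name).
    build = ["mlflow", "models", "build-docker",
             "--model-uri", model_uri, "--name", image_name]
    # Map a host port onto the port the container exposes (8080 by default).
    run = ["docker", "run", "--rm",
           "-p", f"{host_port}:{container_port}", image_name]
    return build, run


build_cmd, run_cmd = docker_commands(
    "models:/example-model/1", "example-model-image", 5001
)
print(shlex.join(build_cmd))
print(shlex.join(run_cmd))
```

With this mapping, requests would be sent to port `5001` on the host rather than to `8080`.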

## The serve Subcommand

> This subcommand runs a basic web server (created via `flask`) which we can query (e.g. using `curl`).

The following can be specified, amongst other things:
- `--model-uri`: the model resource (mandatory).
- `--workers`: the number of parallel workers handling requests.
- `--port`: the port on which the server listens for requests.

In [None]:
!mlflow models serve --help
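To make the options listed above concrete, here is a minimal sketch of how a `serve` command line could be put together, leaving out the optional flags when they are not set. The helper `serve_command` and the model URI are hypothetical; the command is only built, not run.

```python
def serve_command(model_uri, port=None, workers=None):
    """Assemble an `mlflow models serve` command line as an argument list."""
    # --model-uri is mandatory; the URI used below is a placeholder.
    cmd = ["mlflow", "models", "serve", "--model-uri", model_uri]
    if port is not None:
        cmd += ["--port", str(port)]        # port the server listens on
    if workers is not None:
        cmd += ["--workers", str(workers)]  # parallel workers handling requests
    return cmd


serve_cmd = serve_command("models:/example-model/1", port=5000, workers=2)
print(" ".join(serve_cmd))
```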

## The predict Subcommand

> This subcommand allows us to query the model with a file (`.csv` or `.json`) (__useful for testing__).

To view the possibilities, run the following command:

In [None]:
!mlflow models predict --help
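For a quick local test, one option is to write a small input file and pass it to `predict`. The sketch below writes a split-oriented JSON file and assembles a matching command line without running it. The file name, the column names, and the model URI are hypothetical, and the exact JSON layout a given MLflow release accepts is an assumption worth checking against the `--help` output above.

```python
import json
import os
import tempfile


def write_split_input(columns, rows, path):
    """Write a split-oriented JSON document (columns + data) to `path`."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"columns": columns, "data": rows}, fh)
    return path


# Hypothetical test data written to a temporary directory.
input_path = os.path.join(tempfile.mkdtemp(), "input.json")
write_split_input(["a", "b", "c"], [[1, 2, 3], [4, 5, 6]], input_path)

# Command line that would query the (placeholder) model with that file.
predict_cmd = ["mlflow", "models", "predict",
               "--model-uri", "models:/example-model/1",
               "--input-path", input_path,
               "--content-type", "json"]
print(" ".join(predict_cmd))
```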

## Querying Deployed Models

Once a model has been deployed (via `docker` or the `flask` web server), it can be queried from `localhost` or from other machines.

Requests are made by sending JSON or CSV text in the body of a POST request to the `/invocations` endpoint. There are a few possibilities for sending the data:
- JSON-serialised pandas DataFrames in the split orientation (`data = pandas_df.to_json(orient='split')`).
- JSON-serialised pandas DataFrames in the record orientation (discouraged).
- CSV-serialised pandas DataFrames (`data = pandas_df.to_csv()`).
- Tensor input, formatted as described in TF Serving’s API docs; here, the provided inputs will be cast to NumPy arrays.

Each of the above can be observed below (please note the `Content-Type` header for the different versions):

In [None]:
# split-oriented DataFrame input
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
    "columns": ["a", "b", "c"],
    "data": [[1, 2, 3], [4, 5, 6]]
}'

# record-oriented DataFrame input (fine for vector rows, loses ordering for JSON records)
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 1,"b": 2,"c": 3},
    {"a": 4,"b": 5,"c": 6}
]'

# numpy/tensor input using TF serving's "instances" format
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
    "instances": [
        {"a": "s1", "b": 1, "c": [1, 2, 3]},
        {"a": "s2", "b": 2, "c": [4, 5, 6]},
        {"a": "s3", "b": 3, "c": [7, 8, 9]}
    ]
}'
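The same split-oriented request can also be prepared from Python. The following is a minimal sketch using only the standard library; the helper name `build_invocation_request` is hypothetical, and the address assumes a server listening on `127.0.0.1:5000` as in the `curl` calls. The payload shape follows the examples in this section; other MLflow releases may expect a different wrapper around the data, so treat it as an assumption.

```python
import json
import urllib.request


def build_invocation_request(base_url, payload, content_type="application/json"):
    """Prepare (but do not send) a POST request to the /invocations endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/invocations",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Split-oriented DataFrame payload, matching the first curl example.
split_payload = {"columns": ["a", "b", "c"], "data": [[1, 2, 3], [4, 5, 6]]}
request = build_invocation_request("http://127.0.0.1:5000", split_payload)

# Sending needs a running model server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

For a record-oriented payload, the same helper could be called with a list of dictionaries and `content_type="application/json; format=pandas-records"`.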

Moreover, we could encode more complex data before sending the request (e.g. images could be encoded using `base64` and automatically decoded by MLflow):

In [None]:
# record-oriented DataFrame input with binary column "b"
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 0, "b": "dGVzdCBiaW5hcnkgZGF0YSAw"},
    {"a": 1, "b": "dGVzdCBiaW5hcnkgZGF0YSAx"},
    {"a": 2, "b": "dGVzdCBiaW5hcnkgZGF0YSAy"}
]'

# record-oriented DataFrame input with datetime column "b"
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 0, "b": "2020-01-01T00:00:00Z"},
    {"a": 1, "b": "2020-02-01T12:34:56Z"},
    {"a": 2, "b": "2021-03-01T00:00:00Z"}
]'
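The base64 strings in the binary example can be produced on the client side before the request is sent. The sketch below, with a hypothetical helper `encode_binary_records`, encodes the bytes in column `b` and serialises the rows as record-oriented JSON. The sample byte strings are chosen so that they reproduce the values used in the `curl` call.

```python
import base64
import json


def encode_binary_records(rows, binary_column):
    """Return copies of `rows` with the binary column base64-encoded as text."""
    encoded = []
    for row in rows:
        row = dict(row)  # copy, so the caller's data is left untouched
        row[binary_column] = base64.b64encode(row[binary_column]).decode("ascii")
        encoded.append(row)
    return encoded


# Raw bytes standing in for e.g. image data.
rows = [{"a": i, "b": f"test binary data {i}".encode("utf-8")} for i in range(3)]
records = encode_binary_records(rows, "b")

# JSON body for a request with 'Content-Type: application/json; format=pandas-records'.
body = json.dumps(records)
print(body)
```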

## Conclusion
At this point, you should have a good understanding of:
- MLflow model deployment.
- the `mlflow models` commands.
- how to query deployed models.