# MLflow

The aeon custom model flavor enables logging of aeon models in MLflow format via the `aeon.utils.mlflow_aeon.save_model()` and `aeon.utils.mlflow_aeon.log_model()` methods. These methods also add the `pyfunc` flavor to the MLflow Models that they produce, allowing the model to be interpreted as generic Python functions for inference via `aeon.utils.mlflow_aeon.pyfunc.load_model()`. This loaded PyFunc model can only be scored with a DataFrame input. You can also use the `aeon.utils.mlflow_aeon.load_model()` method to load MLflow Models with the aeon model flavor in native aeon formats.

The `pyfunc` flavor of the model supports the aeon predict methods `predict`, `predict_interval`, `predict_proba`, `predict_quantiles`, and `predict_var`.

To generate forecasts with an aeon model loaded as a `pyfunc`, pass the exogenous regressors as a pandas DataFrame to the `pyfunc.predict()` method (an empty DataFrame must be passed if no exogenous regressor is used). The predict methods to call, and the parameter values passed to them, are defined by a dictionary saved as an attribute of the fitted aeon model instance (`pyfunc_predict_conf` in the examples below). If no prediction configuration is defined, `pyfunc.predict()` returns the output of the aeon `predict` method. Note that for the `pyfunc` flavor the forecasting horizon `fh` must be passed to the `fit` method.

Predict methods and parameter values for the `pyfunc` flavor can be defined in two ways:
- `Dict[str, dict]` if parameter values are passed to `pyfunc.predict()`, for example `{"predict_method": {"predict": {}, "predict_interval": {"coverage": [0.1, 0.9]}}}`
- `Dict[str, list]` to use the default parameters of each predict method, for example `{"predict_method": ["predict", "predict_interval"]}` (Note: when including the `predict_proba` method, the former approach must be followed because the `quantiles` parameter has to be provided by the user)
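
The two configuration forms can be reduced to a single mapping from method name to keyword arguments. The sketch below is a hypothetical helper (`normalize_predict_conf` is not part of aeon or MLflow) that only illustrates how the dict form and the list form relate, and how a missing configuration falls back to plain `predict`:

```python
from typing import Dict, List, Union

# Predict methods named in the text as supported by the pyfunc flavor.
SUPPORTED_METHODS = {
    "predict",
    "predict_interval",
    "predict_proba",
    "predict_quantiles",
    "predict_var",
}

PredictConf = Dict[str, Union[Dict[str, dict], List[str]]]


def normalize_predict_conf(conf: PredictConf) -> Dict[str, dict]:
    """Hypothetical helper: return {method_name: kwargs} for either config form."""
    # With no "predict_method" entry, fall back to the plain predict method.
    methods = conf.get("predict_method", ["predict"])
    if isinstance(methods, dict):
        # Dict form: parameter values are given per method.
        normalized = {name: dict(kwargs) for name, kwargs in methods.items()}
    else:
        # List form: every method is called with its default parameters.
        normalized = {name: {} for name in methods}
    unknown = set(normalized) - SUPPORTED_METHODS
    if unknown:
        raise ValueError(f"Unsupported predict methods: {sorted(unknown)}")
    return normalized


dict_form = {
    "predict_method": {"predict": {}, "predict_interval": {"coverage": [0.1, 0.9]}}
}
list_form = {"predict_method": ["predict", "predict_interval"]}

print(normalize_predict_conf(dict_form))
print(normalize_predict_conf(list_form))
print(normalize_predict_conf({}))
```

How aeon itself parses the configuration may differ; the helper is only meant to make the two shapes easy to compare.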

Signature logging for aeon from a non-pyfunc artifact will not function correctly for `predict_interval` or `predict_quantiles`. The output of the native aeon model flavor for these methods is not a recognized signature type because the returned DataFrame has MultiIndex columns. MLflow's `infer_schema` will function correctly if the `pyfunc` flavor of the model is used, though.
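
To make the MultiIndex issue concrete, the sketch below joins multi-level column labels into single strings. This is an assumed workaround, not something the source prescribes: `flatten_labels` is a hypothetical helper, and the label tuples only mimic the column layout that `predict_interval` returns.

```python
from typing import Hashable, List, Sequence, Tuple


def flatten_labels(
    columns: Sequence[Tuple[Hashable, ...]], sep: str = "__"
) -> List[str]:
    """Hypothetical helper: join each level of a multi-level label into one string."""
    return [sep.join(str(level) for level in label) for label in columns]


# Labels shaped like the predict_interval columns for coverage 0.8 and 0.9.
interval_columns = [
    ("Coverage", 0.8, "lower"),
    ("Coverage", 0.8, "upper"),
    ("Coverage", 0.9, "lower"),
    ("Coverage", 0.9, "upper"),
]

flat = flatten_labels(interval_columns)
print(flat)
```

With a pandas DataFrame `df`, the same idea would be `df.columns = flatten_labels(df.columns.to_list())`. Whether the flattened frame is then suitable for signature inference in a given MLflow version is an assumption that should be checked.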

## 1. Setup
### 1.1 Config

In [1]:
model_path = "model"

### 1.2 Imports

In [2]:
import mlflow
import pandas as pd

from aeon.datasets import load_longley
from aeon.forecasting.model_selection import temporal_train_test_split
from aeon.forecasting.naive import NaiveForecaster
from aeon.utils import mlflow_aeon

### 1.3 Load sample data

In [3]:
y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)

## 2. Example usage of native `aeon flavor` and `pyfunc flavor`

### 2.1 Create prediction config for pyfunc flavor

In [4]:
coverage = [0.8, 0.9]
quantiles = [0.1, 0.9]

pyfunc_predict_conf = {
    "predict_method": {
        "predict": {},
        "predict_interval": {"coverage": coverage},
        "predict_proba": {"quantiles": quantiles},
        "predict_quantiles": {},
        "predict_var": {},
    }
}

### 2.2 Train and save model

In [5]:
with mlflow.start_run():

    forecaster = NaiveForecaster()
    forecaster.fit(
        y_train,
        X=X_train,
        fh=[1, 2, 3],
    )
    forecaster.pyfunc_predict_conf = pyfunc_predict_conf

    mlflow_aeon.save_model(forecaster, model_path)



### 2.3 Load model

#### 2.3.1 Native aeon flavor

In [6]:
loaded_model = mlflow_aeon.load_model(model_path)

#### 2.3.2 Pyfunc flavor

In [7]:
loaded_pyfunc = mlflow_aeon.pyfunc.load_model(model_path)

### 2.4 Generate predictions

#### 2.4.1 Native aeon flavor

In [8]:
loaded_model.predict(X=X_test)

1959    66513.0
1960    66513.0
1961    66513.0
Freq: A-DEC, dtype: float64

In [9]:
loaded_model.predict_interval(X=X_test, coverage=coverage)

| | Coverage 0.8 lower | Coverage 0.8 upper | Coverage 0.9 lower | Coverage 0.9 upper |
|---|---|---|---|---|
| 1959 | 64719.913711 | 68306.086289 | 64211.598663 | 68814.401337 |
| 1960 | 63977.193051 | 69048.806949 | 63258.327017 | 69767.672983 |
| 1961 | 63407.283445 | 69618.716555 | 62526.855956 | 70499.144044 |


In [10]:
# Request the quantiles defined in the config explicitly; the default is [0.05, 0.95]
y_pred_dist = loaded_model.predict_quantiles(X=X_test, alpha=quantiles)
y_pred_dist_quantiles = pd.DataFrame(y_pred_dist)
y_pred_dist_quantiles.columns = [f"Quantiles_{q}" for q in quantiles]
y_pred_dist_quantiles

| | Quantiles_0.1 | Quantiles_0.9 |
|---|---|---|
| 1959 | 64719.913711 | 68306.086289 |
| 1960 | 63977.193051 | 69048.806949 |
| 1961 | 63407.283445 | 69618.716555 |


In [11]:
loaded_model.predict_quantiles(X=X_test)

| | Quantiles 0.05 | Quantiles 0.95 |
|---|---|---|
| 1959 | 64211.598663 | 68814.401337 |
| 1960 | 63258.327017 | 69767.672983 |
| 1961 | 62526.855956 | 70499.144044 |


In [12]:
loaded_model.predict_var(X=X_test)

| | 0 |
|---|---|
| 1959 | 1957628.0 |
| 1960 | 3915256.0 |
| 1961 | 5872885.0 |


#### 2.4.2 Pyfunc flavor

In [13]:
loaded_pyfunc.predict(X_test)

ValueError: operands could not be broadcast together with shapes (3,3) (2,) 

## 3. Model deployment example

### 3.1 Create experiment

In [None]:
artifact_path = "model"

mlflow.set_experiment("Test aeon")

with mlflow.start_run() as run:

    forecaster = NaiveForecaster()
    forecaster.fit(y_train, X=X_train, fh=[1, 2, 3])
    forecaster.pyfunc_predict_conf = pyfunc_predict_conf

    mlflow_aeon.log_model(estimator=forecaster, artifact_path=artifact_path)

run_id = run.info.run_id
print(f"MLflow run id: {run_id}")

### 3.2 Deploy pyfunc model to local REST API endpoint
- Open a terminal window and cd into the `examples` directory
- In the terminal run: `mlflow models serve -m runs:/<RUN_ID>/model --env-manager local --host <HOST>`
    - where you substitute `<RUN_ID>` by the `run_id` and `<HOST>` by the network address to listen on (e.g. `127.0.0.1`)
- More details here: https://www.mlflow.org/docs/latest/cli.html#mlflow-models-serve
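
If the server should be started from Python rather than typed by hand, the command above can be assembled as an argument list. This is a minimal sketch: `build_serve_command` is a hypothetical helper, `abc123` is a placeholder run id, and only the flags shown in the step above are used.

```python
import shlex
from typing import List


def build_serve_command(
    run_id: str, host: str = "127.0.0.1", artifact_path: str = "model"
) -> List[str]:
    """Hypothetical helper: assemble the `mlflow models serve` call shown above."""
    return [
        "mlflow",
        "models",
        "serve",
        "-m",
        f"runs:/{run_id}/{artifact_path}",
        "--env-manager",
        "local",
        "--host",
        host,
    ]


# "abc123" is a placeholder; use the run id printed in section 3.1.
command = build_serve_command("abc123")
print(shlex.join(command))
```

Passing `command` to `subprocess.Popen` would start the server in the background; that step is left out here because it needs a logged model and blocks a port.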

### 3.3 Request predictions from local REST API endpoint

- For more details see: https://www.mlflow.org/docs/latest/models.html#built-in-deployment-tools

#### 3.3.1 JSON input using `dataframe_split` field with pandas DataFrame in the `split` orientation

In [None]:
host = "127.0.0.1"
url = f"http://{host}:5000/invocations"

X_test = X_test.reset_index(drop=True)
json_data = {"dataframe_split": X_test.to_dict(orient="split")}
print(json_data)

# Uncomment the lines below to run the prediction request
# import requests
# response = requests.post(url, json=json_data)
# response.json()

#### 3.3.2 JSON input using `dataframe_records` field with pandas DataFrame in the `records` orientation

In [None]:
json_data = {"dataframe_records": X_test.to_dict(orient="records")}
print(json_data)

# Uncomment the lines below to run the prediction request
# response = requests.post(url, json=json_data)
# response.json()
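
The two JSON layouts differ only in how the same rows are arranged. The sketch below rebuilds both payload shapes in plain Python so the structure is visible without a running server; the column names and values are made up and merely stand in for `X_test` after `reset_index(drop=True)`.

```python
import json
from typing import Any, Dict, List

# Toy stand-in for X_test; names and values are illustrative only.
columns = ["feature_a", "feature_b"]
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]


def to_split(columns: List[str], rows: List[List[Any]]) -> Dict[str, Any]:
    """Mirror the layout of DataFrame.to_dict(orient="split")."""
    return {
        "index": list(range(len(rows))),
        "columns": list(columns),
        "data": [list(row) for row in rows],
    }


def to_records(columns: List[str], rows: List[List[Any]]) -> List[Dict[str, Any]]:
    """Mirror the layout of DataFrame.to_dict(orient="records")."""
    return [dict(zip(columns, row)) for row in rows]


split_payload = {"dataframe_split": to_split(columns, rows)}
records_payload = {"dataframe_records": to_records(columns, rows)}

print(json.dumps(split_payload, indent=2))
print(json.dumps(records_payload, indent=2))
```

The `split` layout states the column names once, while `records` repeats them for every row, so `split` tends to be more compact for wide inputs.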

#### 3.3.3 CSV input using valid `pd.DataFrame` csv representation

In [None]:
headers = {
    "Content-Type": "text/csv",
}
data = X_test.to_csv()
print(data)

# Uncomment the lines below to run the prediction request
# response = requests.post(url, headers=headers, data=data)
# response.json()