# 2. How to deploy from MLflow with Python

## 2.1. MLflow Models

An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools - for example, real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different “flavors” that can be understood by different downstream tools.

All of the flavors that a particular model supports are defined in its MLmodel file in YAML format. For example, mlflow.sklearn outputs models as follows:

```
# Directory written by mlflow.sklearn.save_model(tree, "model")
model/
├── MLmodel
├── model.pkl
├── conda.yaml
└── requirements.txt
```

So that the environment can be recreated, MLflow automatically writes conda.yaml and requirements.txt files whenever a model is saved or logged. Either file can then be used to reinstall the dependencies with conda or pip. The MLmodel file for this model describes two flavors:

```yaml
time_created: 2018-05-25T17:28:53.35

flavors:
  sklearn:
    sklearn_version: 0.19.1
    pickled_model: model.pkl
  python_function:
    loader_module: mlflow.sklearn
```

This model can then be used with any tool that supports either the sklearn or python_function model flavor. For example, the mlflow models serve command can serve a model with the python_function flavor:

```bash
mlflow models serve -m model
```
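
The python_function flavor can also be loaded in-process instead of being served over HTTP. The following is a minimal sketch, assuming that mlflow and pandas are installed and that the model directory from above exists locally. The helper name predict_with_pyfunc and the feature columns in the usage comment are illustrative assumptions, not part of MLflow.

```python
def predict_with_pyfunc(model_uri, rows, columns):
    """Load a model through its python_function flavor and predict on rows.

    model_uri may be a local path such as "model" or any URI MLflow understands.
    """
    # Imported lazily so the helper can be defined without MLflow installed.
    import mlflow.pyfunc
    import pandas as pd

    model = mlflow.pyfunc.load_model(model_uri)
    frame = pd.DataFrame(rows, columns=columns)
    return model.predict(frame)


# Hypothetical usage (column names are assumed to match the trained model):
# predict_with_pyfunc(
#     "model",
#     rows=[[39.1, 18.7]],
#     columns=["Culmen Length (mm)", "Culmen Depth (mm)"],
# )
```

Because the call goes through the generic python_function interface, the calling code does not need to know that the underlying model is a scikit-learn estimator.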

## 2.2. The MLflow Model Registry

The MLflow Model Registry component is a centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example from staging to production), and annotations.

- **Model**: An MLflow Model is created from an experiment or run that is logged with one of the model flavor’s mlflow.\<model_flavor\>.log_model() methods. Once logged, this model can then be registered with the Model Registry.

- **Registered Model**: An MLflow Model can be registered with the Model Registry. A registered model has a unique name and contains versions, associated transitional stages, model lineage, and other metadata.


## 2.3. Register a model

In [1]:
import mlflow

remote_server_uri = "http://0.0.0.0:5001"   # set to your server URI
mlflow.set_tracking_uri(remote_server_uri)

First, we need to create a registered model:

In [3]:
from mlflow.tracking import MlflowClient
from mlflow.exceptions import RestException

model_name = "penguin"

client = MlflowClient()
try:
    registered_model = client.create_registered_model(model_name)
    print(registered_model)
except RestException:
    print(f"Model '{model_name}' already exists in registry.")

Model 'penguin' already exists in registry.


Now we can register the model logged by an experiment run as a new version of that registered model. Pick a run ID from your tracking server and add it here.

In [5]:
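run_id = "0c2a356eecc141eb9ed000728ef2182c"  # replace with a run ID from your tracking server
result = mlflow.register_model(
    f"runs:/{run_id}/model",  # "model" is the artifact path the run logged the model under
    model_name,               # registered model name defined above
)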

Registered model 'penguin' already exists. Creating a new version of this model...
2025/05/22 19:25:19 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: penguin, version 4
Created version '4' of model 'penguin'.


In [6]:
print(result)

<ModelVersion: aliases=[], creation_timestamp=1747918519211, current_stage='None', description='', last_updated_timestamp=1747918519211, name='penguin', run_id='0c2a356eecc141eb9ed000728ef2182c', run_link='', source='mlflow-artifacts:/181050822057161957/0c2a356eecc141eb9ed000728ef2182c/artifacts/model', status='READY', status_message=None, tags={}, user_id='', version='4'>
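
This section relies on two URI schemes: runs:/ points at a model artifact inside a run, and models:/ points at a version (or stage) of a registered model. As a small sketch, the helper functions below (hypothetical names, not part of MLflow) build both forms from their parts:

```python
def run_model_uri(run_id, artifact_path="model"):
    """URI of a model artifact logged by a run, e.g. for mlflow.register_model."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name, version_or_stage):
    """URI of a registered model, addressed by version number or stage name."""
    return f"models:/{name}/{version_or_stage}"


print(run_model_uri("0c2a356eecc141eb9ed000728ef2182c"))
# runs:/0c2a356eecc141eb9ed000728ef2182c/model
print(registry_model_uri("penguin", 4))
# models:/penguin/4
print(registry_model_uri("penguin", "Production"))
# models:/penguin/Production
```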


## 2.4. Serve a Model from the registry

```bash
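# Point the CLI at the tracking server that hosts the Model Registry,
# then serve version 4 of the registered model "penguin" on port 4242.
# --env-manager local (the replacement for the older --no-conda flag) reuses the current Python environment.
MLFLOW_TRACKING_URI=http://localhost:5001 mlflow models serve --env-manager local -m "models:/penguin/4" -p 4242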
```
(This serves version 4 of the model.)

Serve by stage instead of by version number:

```bash
MLFLOW_TRACKING_URI=http://localhost:5001 mlflow models serve --env-manager local -m "models:/penguins_clf_test_part_4/Production" -p 4242
```


Query the model with cURL:
```bash
# record-oriented DataFrame input (MLflow 2.x request format: one JSON object per row)
curl http://127.0.0.1:4242/invocations -H 'Content-Type: application/json' -d '{
  "dataframe_records": [
    {"Culmen Length (mm)": 1, "Culmen Depth (mm)": 3},
    {"Culmen Length (mm)": 14, "Culmen Depth (mm)": 120}
  ]
}'
```

Or we can call the API directly from Python:

In [8]:
import json
import requests

url = "http://127.0.0.1:4242/invocations"

headers = {"Content-Type": "application/json"}

# "dataframe_split" mirrors pandas' split orientation: column names plus row values
data = {
    "dataframe_split": {
        "columns": ["Culmen Length (mm)", "Culmen Depth (mm)"],
        "data": [[1.8, 14.7], [3.0, 120.0]],
    }
}
response = requests.post(url, headers=headers, data=json.dumps(data))

print(response.status_code)

200


In [9]:
response.text

'{"predictions": ["Adelie", "Adelie"]}'
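
The request and response handling above can be wrapped in small helpers. This is a sketch under stated assumptions: the function names are hypothetical, the server is assumed to accept the dataframe_split format shown above, and the response is assumed to be a JSON object with a predictions key, as in the output of the previous cell.

```python
import json


def build_split_payload(columns, rows):
    """Build a dataframe_split request body, checking that every row fits the columns."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"row {row!r} does not have {len(columns)} values")
    return {"dataframe_split": {"columns": list(columns), "data": [list(r) for r in rows]}}


def parse_predictions(response_text):
    """Extract the list of predictions from the server's JSON response."""
    return json.loads(response_text)["predictions"]


def predict(url, columns, rows, timeout=10):
    """POST the rows to an /invocations endpoint and return the predictions."""
    import requests  # imported lazily; only needed for the actual HTTP call

    response = requests.post(url, json=build_split_payload(columns, rows), timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return parse_predictions(response.text)


payload = build_split_payload(["Culmen Length (mm)", "Culmen Depth (mm)"], [[1.8, 14.7]])
print(json.dumps(payload))
print(parse_predictions('{"predictions": ["Adelie", "Adelie"]}'))
# ['Adelie', 'Adelie']
```

Passing json= to requests.post serializes the body and sets the Content-Type header to application/json, so no explicit headers are needed in this variant.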

## 2.5. Other deployment targets

- Amazon SageMaker
- Azure ML
- Kubernetes
- ...

## 2.6. Transition a model's stage

Over the course of its lifecycle, a model evolves from development to staging to production. You can transition a version of a registered model to one of the stages: **Staging, Production or Archived.**

In [10]:
client = MlflowClient()

client.transition_model_version_stage(
    name=model_name,
    version=1,
    stage="Production"
)


<ModelVersion: aliases=[], creation_timestamp=1747915421748, current_stage='Production', description='', last_updated_timestamp=1747919359751, name='penguin', run_id='bc33c515123347fab91528a9a95dc764', run_link='', source='mlflow-artifacts:/181050822057161957/bc33c515123347fab91528a9a95dc764/artifacts/model', status='READY', status_message=None, tags={}, user_id='', version='1'>
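
After a transition it can be useful to see which versions currently sit in which stage. The sketch below groups the versions of a registered model by their current_stage field, as shown in the ModelVersion output above. The helper name versions_by_stage is hypothetical; it only assumes a client object with a search_model_versions method, such as MlflowClient, and it reads only the first page of results.

```python
from collections import defaultdict


def versions_by_stage(client, name):
    """Return {stage: [version numbers]} for one registered model."""
    grouped = defaultdict(list)
    # The filter string selects all versions of the given registered model.
    for model_version in client.search_model_versions(f"name='{name}'"):
        # The registry reports versions as strings, so convert for numeric sorting.
        grouped[model_version.current_stage].append(int(model_version.version))
    return {stage: sorted(versions) for stage, versions in grouped.items()}


# Hypothetical usage with the client created above:
# print(versions_by_stage(client, "penguin"))
```

Taking the client as a parameter keeps the helper independent of the tracking URI configuration and makes it easy to exercise with a stand-in object.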