# Overview
In this notebook we will look at the MLFlow Registry API. As noted in the [README](README.md), MLFlow's API breaks down into several components. In the [MLFlow Tracking API Notebook](MLFLlow%20Tracking%20API.ipynb) we looked at how to log experiment and run data to the MLFlow server.

The Registry API allows data scientists to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example from staging to production), and annotations.

According to the [documentation](https://www.mlflow.org/docs/latest/model-registry.html), the Model Registry provides a centralized place to store models and manage their lifecycle. The Model Registry introduces a few concepts that describe and facilitate the full lifecycle of an MLflow Model. The registry is designed as an additional feature which can be used in tandem with the Tracking API. As we will see, it allows us to implement MLOps and integrate with existing DevOps practices.

<center><img src="images/mlflow_model_registry_concepts.png"></center>

The MLFlow Model Registry also provides the ability to manage and monitor changes. Users can design events to automatically log key information when changes are made to the model registry. Users can also add levels of control to the deployment process by requiring changes to the Model Registry to be requested, reviewed, and approved before being submitted.

## Prerequisites
This notebook is based on the work from the [MLFlow Tracking API Notebook](MLFLlow%20Tracking%20API.ipynb). Specifically, we logged models from an experiment which we will be using here. If we did not log models, we will not be able to register them as part of this notebook.

## Agenda

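1. Launch MLFlow Server
2. Configure Connection For MLFlow Client
3. Register A Model
4. Update Model Version
5. List Or Query Registered Models
6. Load The Registered Model
7. Transition Model To Stage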

# 1. Launch MLFlow Server

In [1]:
mlflow_port = 5000
shell_command = "mlflow server --backend-store-uri sqlite:///mlflow.db"
shell_command += " --default-artifact-root ./artifacts"
shell_command += " --host 0.0.0.0"
shell_command += " --port {}".format(mlflow_port)

def mlflow_running():
    import socket
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        running = s.connect_ex(('localhost', mlflow_port)) == 0
        return running
    
if mlflow_running():
    print("It looks like MLFlow is already running. Check if python.exe is listening on port 5000")
else:
    print("Starting MLFlow with the following shell command:")
    print(shell_command)
    import subprocess
    # Pass an argument list so the command also launches on non-Windows platforms
    process = subprocess.Popen(shell_command.split())
    
print("MLFlow should be accessible at http://localhost:{}".format(mlflow_port))

It looks like MLFlow is already running. Check if python.exe is listening on port 5000
MLFlow should be accessible at http://localhost:5000
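
When the server is started by the cell above, the process is launched in the background and may need a moment before it accepts connections. The following is a minimal sketch of a polling helper that reuses the same socket check as `mlflow_running()`. The name `wait_for_port` and its default timeout values are assumptions made for illustration; they are not part of MLFlow.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(interval)
            # connect_ex returns 0 on success and an error code otherwise
            if s.connect_ex((host, port)) == 0:
                return True
        time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", mlflow_port)` after the launch cell would return `True` once the port is open, or `False` if nothing is listening when the timeout expires.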


# 2. Configure Connection For MLFlow Client
We need to tell our MLFlow client where the server is and how to connect.

In [2]:
import mlflow
mlflow.set_tracking_uri("http://127.0.0.1:5000")

We can verify the connection settings as follows:

In [3]:
mlflow.get_tracking_uri()

'http://127.0.0.1:5000'

# 3. Register a Model
As discussed in the [README](README.md), when we register a model, we are enabling that model to be managed using the lifecycle management functionality of the Registry API. Generally speaking, everything we do with MLFlow can be done through the UI or through the API.

## 3.1. Registering A Model Through The UI
In the UI, registering a model takes only a few mouse clicks. One simply needs to navigate to the experiment run containing the model one wants to register and then click the "register model" button.

<center><img src="images/register_model.png"></center>

We can create a newly registered model by providing a name or update an existing model by selecting the drop down.

<center><img src="images/register_model_name.png"></center>

Once a model is registered we can view it in the Models tab.

<center><img src="images/registered_models.png"></center>

On this page we can see the current version, the modification date, and the stage the model is in. If we click on the model we can navigate back to the run page and see the metadata associated with a model.

## 3.2. Register A Model Using The Python API

The first step is to register a model. To do this we will call the *mlflow.register_model()* function. This function will require information from the Experiment Run which logged the model.

As mentioned, we will continue our example from the [MLFlow Tracking API Notebook](MLFLlow%20Tracking%20API.ipynb). We had a run which outperformed the others, and we want to register the model from that run.

### 3.2.1. Lookup Past Experiment Run

In [4]:
from mlflow.tracking import MlflowClient
mlflow_client = MlflowClient()
# list_experiments() is the MLflow 1.x call; MLflow 2.x replaced it with search_experiments()
experiments = mlflow_client.list_experiments()
experiments

[<Experiment: artifact_location='./artifacts/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>,
 <Experiment: artifact_location='./artifacts/1', experiment_id='1', lifecycle_stage='active', name='AABA-ARIMA', tags={}>]

In [5]:
experiment_name = 'AABA-ARIMA'
mlflow.set_experiment(experiment_name)

In [9]:
# Get the experiment ID (look the experiment up directly by name)
print("Searching for experiment named '{}'".format(experiment_name))
experiment = mlflow_client.get_experiment_by_name(experiment_name)
print("Found id: {}".format(experiment.experiment_id))

# Get the runs for the experiment, using the ID we just looked up
print("Getting the run information")
experiment_run_df = mlflow.search_runs(experiment_ids=[experiment.experiment_id])
run_count = experiment_run_df.shape[0]
print("Found {} runs".format(run_count))

# Get the best run from the experiment
print("Searching for the best experiment run")
min_aic = experiment_run_df["metrics.aic"].min()
best_run = experiment_run_df[experiment_run_df["metrics.aic"] == min_aic].iloc[0]
print("Found the following run:")
print("")
print(best_run)

Searching for experiment named 'AABA-ARIMA'
Found id: 1
Getting the run information
Found 83 runs
Searching for the best experiment run
Found the following run:

run_id                                            28155a8261b84db9813767da598f2c9b
experiment_id                                                                    1
status                                                                    FINISHED
artifact_uri                     ./artifacts/1/28155a8261b84db9813767da598f2c9b...
start_time                                        2021-06-15 16:29:23.445000+00:00
end_time                                          2021-06-15 16:29:24.728000+00:00
metrics.aic                                                                409.494
params.q                                                                         0
params.p                                                                         0
params.d                                                                         1
tags.mlf
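
The lookup above downloads every run into a DataFrame and then filters locally. As an alternative, the server can be asked to sort the runs and return only the first one. The sketch below assumes the client's `search_runs()` accepts `order_by` and `max_results`, as documented for `MlflowClient`. The helper name `find_best_run` is hypothetical, and how runs that lack the metric are ordered has not been checked here.

```python
def find_best_run(client, experiment_id, metric="aic"):
    """Return the run with the lowest value of `metric`, or None if there are no runs."""
    runs = client.search_runs(
        experiment_ids=[experiment_id],
        # Ask the server to sort ascending so the smallest metric comes first
        order_by=["metrics.{} ASC".format(metric)],
        max_results=1,
    )
    return runs[0] if runs else None
```

With the objects from this notebook, `find_best_run(mlflow_client, experiment.experiment_id)` would return a `Run` object whose ID is available as `.info.run_id` and whose metrics are in `.data.metrics`.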

### 3.2.2. Retrieve Model Details

In [16]:
experiment_run_id = best_run["run_id"]
experiment_run_id

'28155a8261b84db9813767da598f2c9b'

In [18]:
model_name = best_run["tags.model_type"]
model_name

'SARIMAX'

In [20]:
best_run["tags.mlflow.log-model.history"]

'[{"run_id": "28155a8261b84db9813767da598f2c9b", "artifact_path": "SARIMAX-0_1_0", "utc_time_created": "2021-06-15 16:29:24.146840", "flavors": {"python_function": {"model_path": "model.pkl", "loader_module": "mlflow.sklearn", "python_version": "3.6.8", "env": "conda.yaml"}, "sklearn": {"pickled_model": "model.pkl", "sklearn_version": "0.24.2", "serialization_format": "cloudpickle"}}}]'
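
The tag value above is a JSON string, so the artifact path of the logged model can be extracted from it and combined with the run ID into a `runs:/<run_id>/<artifact_path>` URI, which is the form used when loading the model in section 6. This is a minimal sketch; the helper name `model_uri_from_history` is an assumption, and it simply takes the last entry in the history.

```python
import json


def model_uri_from_history(history_json, index=-1):
    """Build a runs:/ URI from the 'mlflow.log-model.history' tag value."""
    entries = json.loads(history_json)
    entry = entries[index]
    return "runs:/{}/{}".format(entry["run_id"], entry["artifact_path"])


# Trimmed copy of the tag value printed above
history = '[{"run_id": "28155a8261b84db9813767da598f2c9b", "artifact_path": "SARIMAX-0_1_0"}]'
print(model_uri_from_history(history))
# runs:/28155a8261b84db9813767da598f2c9b/SARIMAX-0_1_0
```

Registering with a URI of this form, rather than the bare `runs:/<run_id>`, should make the registered version's `source` point at the model directory instead of the run's artifact root.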

### 3.2.3. Register The Model

In [19]:
model_uri = "runs:/{}".format(experiment_run_id)
seconds_to_wait = 500
model_version = mlflow.register_model(model_uri, model_name, await_registration_for=seconds_to_wait)
model_version

Successfully registered model 'SARIMAX'.
2021/06/15 13:11:55 INFO mlflow.tracking._model_registry.client: Waiting up to 500 seconds for model version to finish creation.                     Model name: SARIMAX, version 1
Created version '1' of model 'SARIMAX'.


<ModelVersion: creation_timestamp=1623780715601, current_stage='None', description='', last_updated_timestamp=1623780715601, name='SARIMAX', run_id='28155a8261b84db9813767da598f2c9b', run_link='', source='./artifacts/1/28155a8261b84db9813767da598f2c9b/artifacts/sklearn-model', status='READY', status_message='', tags={}, user_id='', version='1'>
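
The `creation_timestamp` and `last_updated_timestamp` fields in the output above appear to be milliseconds since the Unix epoch. A small sketch for turning them into readable datetimes follows; the helper name `from_mlflow_timestamp` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def from_mlflow_timestamp(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=ms)


created = from_mlflow_timestamp(1623780715601)
print(created.isoformat())  # 2021-06-15T18:11:55.601000+00:00
```

The result lines up with the 13:11:55 shown in the log line above if the machine that ran the notebook was five hours behind UTC.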

# 4. Update Model Version

We see that if we re-register the same model, a new version is created. The versioning appears to be based on the model name and not on the run ID or the code associated with the model.

To create a new version of a model through the UI or the API, simply register a run again under the same model name.

In [21]:
model_uri = "runs:/{}".format(experiment_run_id)
seconds_to_wait = 500
model_version = mlflow.register_model(model_uri, model_name, await_registration_for=seconds_to_wait)

Registered model 'SARIMAX' already exists. Creating a new version of this model...
2021/06/15 13:18:50 INFO mlflow.tracking._model_registry.client: Waiting up to 500 seconds for model version to finish creation.                     Model name: SARIMAX, version 2
Created version '2' of model 'SARIMAX'.


# 5. List Or Query Registered Models
There are a few ways to interrogate the tracking API. See the [docs](https://mlflow.org/docs/latest/python_api/mlflow.tracking.html) for more details.

The first option for programmatically browsing our registered models is to use the *list_registered_models()* function. As we will see, it only returns the latest version of a given model in each stage, but it does tell us a lot about that version, including the stage and run associated with it.

In [42]:
registered_models = mlflow_client.list_registered_models()
registered_models[0]

<RegisteredModel: creation_timestamp=1623780715046, description='', last_updated_timestamp=1623781130322, latest_versions=[<ModelVersion: creation_timestamp=1623781130322, current_stage='None', description='', last_updated_timestamp=1623781130322, name='SARIMAX', run_id='28155a8261b84db9813767da598f2c9b', run_link='', source='./artifacts/1/28155a8261b84db9813767da598f2c9b/artifacts', status='READY', status_message='', tags={}, user_id='', version='2'>], name='SARIMAX', tags={}>
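
To make the `latest_versions` field easier to read, the entries can be summarized as a mapping from stage to version number. The sketch below uses stand-in objects instead of a live server. The helper name `latest_by_stage` is an assumption; only the attribute names `latest_versions`, `current_stage`, and `version` are taken from the output above.

```python
from types import SimpleNamespace


def latest_by_stage(registered_model):
    """Map each stage to the version number listed in latest_versions."""
    return {mv.current_stage: mv.version for mv in registered_model.latest_versions}


# Stand-in for the RegisteredModel printed above (only the fields used here)
sarimax = SimpleNamespace(
    name="SARIMAX",
    latest_versions=[SimpleNamespace(current_stage="None", version="2")],
)
print(latest_by_stage(sarimax))  # {'None': '2'}
```

With a live server, `latest_by_stage(registered_models[0])` would produce the same kind of summary for the first registered model.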

The next option is to use the *search_model_versions()* function. This function allows us to search and filter model versions according to three parameters: source_path, run_id, and name. The related *search_registered_models()* function offers a similar name filter at the level of registered models.

This function can be used to return all versions of a model.

In [50]:
model_name = "SARIMAX"
search_filter = "name='{}'".format(model_name)
model_versions = mlflow_client.search_model_versions(search_filter)
for model_version in model_versions:
    print(model_version)
    print("")

<ModelVersion: creation_timestamp=1623780715601, current_stage='None', description='', last_updated_timestamp=1623780715601, name='SARIMAX', run_id='28155a8261b84db9813767da598f2c9b', run_link='', source='./artifacts/1/28155a8261b84db9813767da598f2c9b/artifacts/sklearn-model', status='READY', status_message='', tags={}, user_id='', version='1'>

<ModelVersion: creation_timestamp=1623781130322, current_stage='None', description='', last_updated_timestamp=1623781130322, name='SARIMAX', run_id='28155a8261b84db9813767da598f2c9b', run_link='', source='./artifacts/1/28155a8261b84db9813767da598f2c9b/artifacts', status='READY', status_message='', tags={}, user_id='', version='2'>
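
Note that `version` is reported as a string in the output above. When picking the newest version from a search result, comparing the values as strings would rank `'2'` above `'10'`, so it is safer to convert to integers first. A minimal sketch, with stand-in objects in place of real `ModelVersion` instances and a hypothetical helper name:

```python
from types import SimpleNamespace


def latest_version(model_versions):
    """Return the entry with the highest version number, comparing numerically."""
    return max(model_versions, key=lambda mv: int(mv.version))


# Stand-in objects for illustration; real ModelVersion objects expose the same attribute
versions = [SimpleNamespace(version=str(n)) for n in (1, 2, 10)]
print(latest_version(versions).version)  # 10
```

Applied to the result above, `latest_version(model_versions)` would return the version 2 entry.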



# 6. Load The Registered Model
Once the registered model's run ID has been identified, we can load the model in the same way as in section 5.4 of the [MLFlow Tracking API Notebook](MLFLlow%20Tracking%20API.ipynb).

In [93]:
model_name = "SARIMAX"
model_version_number = '2'

print("Searching for model {} version {}".format(model_name, model_version_number))

search_filter = "name='{}'".format(model_name)
model_versions = mlflow_client.search_model_versions(search_filter)
model_run_id = None
for model_version in model_versions:
    if model_version.version == model_version_number:
        model_run_id = model_version.run_id
        break
if model_run_id is None:
    raise Exception("Unable to find a run ID for the corresponding model version!")
model_artifact_path = model_version.source

print("Looking up run history for run {}".format(model_run_id))
model_run = mlflow_client.get_run(run_id=model_run_id)
# Use the public `data` property rather than the private `_data` attribute
model_history = model_run.data.tags['mlflow.log-model.history']

print("Deserializing the run history")
import json
model_history = json.loads(model_history)

print("Importing model loader function")
loader_module_string = model_history[0]["flavors"]["python_function"]["loader_module"]
import importlib
loader_module = importlib.import_module(loader_module_string)

print("Loading model")
artifact_path = model_history[0]["artifact_path"]
model_uri = "runs:/{}/{}".format(model_run_id, artifact_path)
loaded_model = loader_module.load_model(model_uri)

loaded_model

Searching for model SARIMAX version 2
Looking up run history for run 28155a8261b84db9813767da598f2c9b
Deserializing the run history
Importing model loader function
Loading model


<statsmodels.tsa.statespace.sarimax.SARIMAXResultsWrapper at 0x1a20b2e8>
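
MLFlow also accepts `models:/<name>/<version>` and `models:/<name>/<stage>` URIs, which address a registered model directly without going through the run history. The sketch below only builds such a URI; the helper name `registered_model_uri` is an assumption.

```python
def registered_model_uri(name, version=None, stage=None):
    """Build a models:/ URI addressing a registered model by version or by stage."""
    if (version is None) == (stage is None):
        raise ValueError("Provide exactly one of version or stage")
    return "models:/{}/{}".format(name, version if version is not None else stage)


print(registered_model_uri("SARIMAX", version=2))        # models:/SARIMAX/2
print(registered_model_uri("SARIMAX", stage="Staging"))  # models:/SARIMAX/Staging
```

A URI like this could be passed to `mlflow.pyfunc.load_model()`, but that only works when the version's `source` points at a directory containing the logged model. In this notebook, version 2 was registered from the bare `runs:/<run_id>` URI, so the lookup through the run history shown above is likely still needed here.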

# 7. Transition Model To Stage
Currently, MLFlow supports only four stages: None, Staging, Production, and Archived. This may change in the future, but there is currently no support for additional stages on GitHub.

## 7.1. List Stages For a Model
We can get a list of the stages for a given model as follows:

In [95]:
model_name = "SARIMAX"
model_version_number = '2'
possible_model_stages = mlflow_client.get_model_version_stages(model_name, model_version_number)
possible_model_stages

['None', 'Staging', 'Production', 'Archived']

## 7.2. Get Model's Current Stage

In [97]:
model_version = mlflow_client.get_model_version(model_name, model_version_number)
model_version.current_stage

'None'

## 7.3. Transition Model
We can update the stage as follows:

In [100]:
new_stage = "Staging"
model_version = mlflow_client.transition_model_version_stage(model_name, model_version_number, new_stage)
model_version.current_stage

'Staging'

In [101]:
model_version = mlflow_client.get_model_version(model_name, model_version_number)
model_version.current_stage

'Staging'
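
When promoting a model, it can help to validate the stage name before calling the server and to decide explicitly whether versions already in the target stage should be archived. The sketch below wraps `transition_model_version_stage()` and passes its `archive_existing_versions` argument. The wrapper name `promote` and the client-side check are assumptions added for illustration.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def promote(client, name, version, stage, archive_existing=False):
    """Validate the stage name client-side, then request the transition."""
    if stage not in VALID_STAGES:
        raise ValueError("Unknown stage {!r}; expected one of {}".format(stage, VALID_STAGES))
    return client.transition_model_version_stage(
        name=name,
        version=version,
        stage=stage,
        # When True, versions currently in the target stage are moved to Archived
        archive_existing_versions=archive_existing,
    )
```

For example, `promote(mlflow_client, model_name, model_version_number, "Production", archive_existing=True)` would move version 2 to Production and archive any version that was already there.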