# Model Management

Nice! We now have a system to keep track of our experiments and, importantly, models that are ready for the next step: deployment. Before we can deploy a model, we need a way to manage models that provides:

* Model Lineage: where the model comes from
* Model Versioning
* Model Transition
* Annotations: model metadata

To do so, we'll use the *Model Registry*, the *MLflow* component for model management.

Now that we have a model ready for deployment, the team needs a transparent and efficient way to know which model is ready to move to the next step. That is why we register the model. Registration works as a seamless handover from data scientists to ML engineers.

The idea is to label the model and update its tags and aliases whenever the model changes. There are three basic stages for tagging:

* Staging: the last status before deploying to production, for example pre-deployment testing or A/B testing.
* Production: the model serving real users.
* Archived: a model that is no longer in use.

Beyond these basic stages for identifying model status, *MLflow* provides tags and aliases for a more flexible labelling process. You can modify tags and aliases through the UI or the *MLflow* Python APIs. Note that the model version is incremented automatically each time a model is registered under the same name.
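As a minimal sketch of labelling through the Python API, the helper below sets a tag on one model version and points an alias at it. The function name `label_version` is hypothetical; the tag key `stage` and the alias `proposed` are assumptions chosen to mirror the registry output shown at the end of this section. It expects an `MlflowClient` like the one created in the Setup section.

```python
def label_version(client, name, version, stage="staging", alias="proposed"):
    """Tag one model version with a deployment stage and point an alias at it.

    `client` is expected to be an mlflow.MlflowClient. The tag key "stage"
    and the default alias "proposed" are illustrative choices, not MLflow defaults.
    """
    # Tags are free-form key/value pairs attached to a single model version.
    client.set_model_version_tag(
        name=name, version=str(version), key="stage", value=stage
    )
    # An alias maps to exactly one version of a registered model, so assigning
    # it here moves it away from any version that held it before.
    client.set_registered_model_alias(name=name, alias=alias, version=str(version))
    # Resolve the alias back to a model version to confirm where it points.
    return client.get_model_version_by_alias(name=name, alias=alias)


# Usage (assumes version 1 of this registered model already exists):
# label_version(client, "nyc-green-taxi-trip-duration", 1)
```

Because an alias resolves to a single version, downstream code can load "whatever `proposed` points at" without hard-coding a version number.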

In sum, after creating a model in each run, you need to do the following:

1. Model Registration: add the run's model to the *MLflow* Model Registry. This provides a unique name under which the models for a task are grouped.
2. Model Tagging and Aliasing: add metadata about the deployment status of the model.
3. Model Versioning: done automatically when the model is registered.
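The three steps above can be sketched as one helper. This is an illustration under stated assumptions, not the notebook's actual registration cell: `register_run` is a hypothetical name, the artifact path `mlflow-models` is taken from the `source` field in the registry output at the end of this section, and the `stage` tag and `proposed` alias mirror that same output.

```python
def register_run(client, run_id, artifact_uri, name,
                 artifact_path="mlflow-models", stage="staging", alias="proposed"):
    """Register a run's logged model, then label the new version.

    `client` is expected to be an mlflow.MlflowClient. `artifact_path` must
    match the path the model was logged under (assumed here).
    """
    # Step 1: make sure the registered model (the shared name) exists.
    existing = client.search_registered_models(filter_string=f"name = '{name}'")
    if not existing:
        client.create_registered_model(name=name)

    # Step 3 happens implicitly: the registry assigns the next version number.
    model_version = client.create_model_version(
        name=name, source=f"{artifact_uri}/{artifact_path}", run_id=run_id
    )

    # Step 2: attach deployment-status metadata to the new version.
    client.set_model_version_tag(
        name=name, version=model_version.version, key="stage", value=stage
    )
    client.set_registered_model_alias(
        name=name, alias=alias, version=model_version.version
    )
    return model_version.version


# Usage, once `client` and `runs` from the cells below are available:
# best = runs[0]
# register_run(client, best.info.run_id, best.info.artifact_uri,
#              "nyc-green-taxi-trip-duration")
```

Calling the helper again for another run under the same name should return a higher version number, which is the automatic versioning described in step 3.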

## Setup

In [1]:
from mlflow import MlflowClient
from mlflow.entities import ViewType

In [2]:
client = MlflowClient(tracking_uri="sqlite:///mlflow.db")

## Model Registration

In [3]:
# List experiments
# client.create_experiment(name="new-test-experiment")
experiments = client.search_experiments()
experiment_id = experiments[0].experiment_id

In [4]:
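# Search the top 5 active runs tagged for model selection, lowest RMSE first
runs = client.search_runs(
    experiment_ids=[experiment_id],  # the API expects a list of experiment IDs
    filter_string="tags.model_type = 'model_selection'",
    run_view_type=ViewType.ACTIVE_ONLY,
    max_results=5,
    order_by=["metrics.rmse ASC"],
)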
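Before registering anything, it helps to look at what the search returned. The sketch below assumes `runs` is the list from the cell above and that each run logged an `rmse` metric; `summarize_runs` is a hypothetical helper name.

```python
def summarize_runs(runs, metric="rmse"):
    """Return (run_id, metric value) pairs in the order the runs were given.

    A run that did not log `metric` yields None for its value.
    """
    return [(run.info.run_id, run.data.metrics.get(metric)) for run in runs]


# Usage with the `runs` list from the search above:
# for run_id, rmse in summarize_runs(runs):
#     print(f"run id: {run_id}, rmse: {rmse}")
```

Since the search orders by `metrics.rmse ASC`, the first pair should belong to the best candidate for registration.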

In [14]:
client.get_registered_model(name="nyc-green-taxi-trip-duration")

<RegisteredModel: aliases={'proposed': 1}, creation_timestamp=1717397289931, description='Trip Duration Predictor for NYC Green Taxi', last_updated_timestamp=1717397771880, latest_versions=[<ModelVersion: aliases=[], creation_timestamp=1717397290060, current_stage='None', description='Trip Duration Predictor for NYC Green Taxi', last_updated_timestamp=1717397321342, name='nyc-green-taxi-trip-duration', run_id='00bb8f0187644496ba54c27944833f41', run_link='', source='/workspaces/mlops-zoomcamp/02-experiment-tracking/mlruns/1/00bb8f0187644496ba54c27944833f41/artifacts/mlflow-models', status='READY', status_message=None, tags={'stage': 'staging'}, user_id=None, version=1>], name='nyc-green-taxi-trip-duration', tags={}>
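The raw representation above is dense, so a small helper can pull out the fields that matter for model management. This is a sketch: `describe_registered_model` is a hypothetical name, and it only reads attributes that appear in the output above (`name`, `aliases`, `latest_versions`, and each version's `version`, `status`, and `tags`).

```python
def describe_registered_model(model):
    """Condense a RegisteredModel into a plain dict of its labels and versions."""
    return {
        "name": model.name,
        # Mapping of alias -> version number, e.g. {"proposed": 1}.
        "aliases": dict(model.aliases),
        # One entry per version listed under `latest_versions`.
        "versions": [
            {"version": mv.version, "status": mv.status, "tags": dict(mv.tags)}
            for mv in model.latest_versions
        ],
    }


# Usage:
# describe_registered_model(
#     client.get_registered_model(name="nyc-green-taxi-trip-duration")
# )
```

For the model shown above, the summary should report the `proposed` alias on version 1 and the `stage: staging` tag on that version.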