# MLflow's Model Registry

In [1]:
import mlflow

In [2]:
from mlflow import MlflowClient

MLFLOW_TRACKING_URI = 'http://ec2-3-16-69-43.us-east-2.compute.amazonaws.com:5000'

## Interacting with the MLflow tracking server

With the ```MlflowClient``` object, we can interact with:
* an MLflow Tracking Server that creates and manages experiments and runs
* an MLflow Registry Server that creates and manages registered models and model versions

To instantiate the object, pass a tracking URI and/or a registry URI.

In [3]:
client = MlflowClient(tracking_uri=MLFLOW_TRACKING_URI)

In [4]:
all_experiments = client.search_experiments()
print(all_experiments)

[<Experiment: artifact_location='s3://mlflow-artifacts-experiments/1', creation_time=1747163755802, experiment_id='1', last_update_time=1747163755802, lifecycle_stage='active', name='nyc_taxi_experiment', tags={}>, <Experiment: artifact_location='s3://mlflow-artifacts-experiments/0', creation_time=1747161871674, experiment_id='0', last_update_time=1747161871674, lifecycle_stage='active', name='Default', tags={}>]


In [6]:
client.create_experiment(name='my-cool-experiment')

'2'

Let's find the best runs for the experiment with ID 1: up to five active runs with an RMSE below 7, sorted from lowest RMSE upward.

In [7]:
from mlflow.entities import ViewType

runs = client.search_runs(
    experiment_ids=['1'],
    filter_string='metrics.rmse < 7',
    run_view_type=ViewType.ACTIVE_ONLY,
    max_results=5,
    order_by=['metrics.rmse ASC']
)

In [15]:
for run in runs:
    print(f"run id: {run.info.run_id}, rmse: {run.data.metrics['rmse']:.4f}")

run id: 1e59f9ac971349a5844e38f52dd03399, rmse: 6.3063
run id: 93ad5b95ad994dc3a2d66b163c32d0d9, rmse: 6.3064
run id: ae2964d21a614e2a8017afc9b5b5c09f, rmse: 6.3066
run id: 8f38c14868e841a08cc7dfe1aa5da5ef, rmse: 6.3066
run id: 38bbf0c7de1c4ff1999be429d2462ca4, rmse: 6.3066


## Interacting with Model Registry

Use the MlflowClient instance to:
1. Register a new version of the model ```nyc-taxi-regressor```
2. Retrieve the latest versions of the model ```nyc-taxi-regressor``` and check that a new version ```3``` was created

In [16]:
import mlflow

mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

In [20]:
run_id = 'ae2964d21a614e2a8017afc9b5b5c09f'
model_uri = f'runs:/{run_id}/model'
mlflow.register_model(model_uri=model_uri, name='nyc-taxi-regressor')

Registered model 'nyc-taxi-regressor' already exists. Creating a new version of this model...
2025/05/19 21:26:09 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: nyc-taxi-regressor, version 3
Created version '3' of model 'nyc-taxi-regressor'.


<ModelVersion: aliases=[], creation_timestamp=1747689969748, current_stage='None', description='', last_updated_timestamp=1747689969748, name='nyc-taxi-regressor', run_id='ae2964d21a614e2a8017afc9b5b5c09f', run_link='', source='s3://mlflow-artifacts-experiments/1/ae2964d21a614e2a8017afc9b5b5c09f/artifacts/model', status='READY', status_message=None, tags={}, user_id='', version='3'>

In [23]:
model_name = 'nyc-taxi-regressor'
latest_versions = client.get_latest_versions(name=model_name)

for version in latest_versions:
    print(f'version:{version.version}')

version:3




Since model stages are being deprecated, I assigned a unique alias to each model version instead: @staging1, @production2, and @staging3.
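
Aliases can also be assigned from code instead of the UI. This is a minimal sketch, assuming an MLflow version with registry alias support (`set_registered_model_alias` and the `models:/<name>@<alias>` URI form). The helper `tag_versions` is hypothetical.

```python
def tag_versions(client, model_name, aliases):
    """Point each alias at a model version and return the matching model URIs.

    aliases maps alias name -> version number, e.g. {"staging1": 1}.
    """
    uris = {}
    for alias, version in aliases.items():
        client.set_registered_model_alias(model_name, alias, str(version))
        # An alias can stand in for a version number when loading a model.
        uris[alias] = f"models:/{model_name}@{alias}"
    return uris
```

For the aliases above, the call would be `tag_versions(client, 'nyc-taxi-regressor', {'staging1': 1, 'production2': 2, 'staging3': 3})`.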

## Compare different versions and select the new "Production" model

Compare the models' performance on an unseen dataset (green taxi trip data from March 2021). This scenario simulates a deployment engineer interacting with the model registry to decide whether to replace the model version that is currently in production.

Steps:
1. Load the test dataset (NYC Green Taxi data from March 2021).
2. Download the ```DictVectorizer``` that was fitted on the training data and saved to MLflow as an artifact, then load it with pickle.
3. Preprocess the test set using the ```DictVectorizer``` so that it can be fed to the regressor models.
4. Make predictions on the test set using the model versions currently labeled "Staging" and "Production", and compare their performance.
5. Based on the results, update the "Production" model version accordingly.

**Note: The model registry doesn't actually deploy the model into Production. It just labels the model version with "Production". Complement the registry with some CI/CD code that does the actual deployment.**
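
As one illustration of that CI/CD glue, a deployment job could ask the registry which version an alias currently points to and redeploy only when it changes. This is a rough sketch, assuming aliases are in use and `get_model_version_by_alias` is available on the client. The helper `needs_redeploy` and the alias name `production` are hypothetical, and the deployment step itself is left out.

```python
def needs_redeploy(client, model_name, deployed_version, alias="production"):
    """Return (should_deploy, registry_version) for the given alias.

    deployed_version is whatever version the serving environment reports.
    """
    target = client.get_model_version_by_alias(model_name, alias)
    registry_version = str(target.version)
    # Compare as strings because the registry reports versions as strings.
    return registry_version != str(deployed_version), registry_version
```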

In [38]:
from sklearn.metrics import root_mean_squared_error
import pandas as pd


def read_dataframe(filename):
    df = pd.read_parquet(filename)

    df.lpep_dropoff_datetime = pd.to_datetime(df.lpep_dropoff_datetime)
    df.lpep_pickup_datetime = pd.to_datetime(df.lpep_pickup_datetime)

    df['duration'] = df.lpep_dropoff_datetime - df.lpep_pickup_datetime
    df.duration = df.duration.apply(lambda td: td.total_seconds() / 60)

    # Copy the filtered frame so the column assignments below do not hit a view.
    df = df[(df.duration >= 1) & (df.duration <= 60)].copy()

    categorical = ['PULocationID', 'DOLocationID']
    df[categorical] = df[categorical].astype(str)
    
    return df


def preprocess(df, dv):
    df['PU_DO'] = df['PULocationID'] + '_' + df['DOLocationID']
    categorical = ['PU_DO']
    numerical = ['trip_distance']
    train_dicts = df[categorical + numerical].to_dict(orient='records')
    return dv.transform(train_dicts)


def test_model(name, version, X_test, y_test):
    model = mlflow.pyfunc.load_model(f"models:/{name}/{version}")
    y_pred = model.predict(X_test)
    return {"rmse": root_mean_squared_error(y_test, y_pred)}

In [26]:
df = read_dataframe('data/green_tripdata_2021-03.parquet')

In [27]:
client.download_artifacts(run_id=run_id, path='preprocessor', dst_path='.')

Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

'/home/ubuntu/notebooks/preprocessor'

In [28]:
import pickle

with open('preprocessor/preprocessor.b', 'rb') as f_in:
    dv = pickle.load(f_in)

In [29]:
X_test = preprocess(df, dv)

In [30]:
target = 'duration'
y_test = df[target].values

In [39]:
%time test_model(name=model_name, version=2, X_test=X_test, y_test=y_test) #@production2

Downloading artifacts:   0%|          | 0/5 [00:00<?, ?it/s]

CPU times: user 166 ms, sys: 15.8 ms, total: 182 ms
Wall time: 263 ms


{'rmse': 6.659623830022514}

In [40]:
%time test_model(name=model_name, version=1, X_test=X_test, y_test=y_test) #@staging1

Downloading artifacts:   0%|          | 0/5 [00:00<?, ?it/s]

Note: You have installed the 'manylinux2014' variant of XGBoost. Certain features such as GPU algorithms or federated learning are not available. To use these features, please upgrade to a recent Linux distro with glibc 2.28+, and install the 'manylinux_2_28' variant.


CPU times: user 17.1 s, sys: 50.7 ms, total: 17.1 s
Wall time: 4.72 s


{'rmse': 6.265125782624643}

In [41]:
%time test_model(name=model_name, version=3, X_test=X_test, y_test=y_test) #@staging3

Downloading artifacts:   0%|          | 0/1 [00:00<?, ?it/s]

MlflowException: The following failures occurred while downloading one or more artifacts from s3://mlflow-artifacts-experiments/1/ae2964d21a614e2a8017afc9b5b5c09f/artifacts/model:
##### File  #####
An error occurred (404) when calling the HeadObject operation: Not Found
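
A missing artifact like this aborts the cell, which is awkward when several versions are being compared. Below is a minimal sketch of a comparison loop that records failures instead of stopping. `compare_versions` is a hypothetical helper, and `evaluate` stands for any callable that returns a dict with an `rmse` key, such as a wrapper around `test_model`.

```python
def compare_versions(evaluate, versions):
    """Run evaluate(version) for each version, collecting RMSEs or errors."""
    results, failures = {}, {}
    for version in versions:
        try:
            results[version] = evaluate(version)["rmse"]
        except Exception as exc:  # deliberately broad: this is only a sketch
            failures[version] = f"{type(exc).__name__}: {exc}"
    # Lowest RMSE wins; None if every version failed.
    best = min(results, key=results.get) if results else None
    return best, results, failures
```

With the objects from this notebook, it could be called as `compare_versions(lambda v: test_model(model_name, v, X_test, y_test), [1, 2, 3])`.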

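The @staging1 model has the better RMSE (6.2651 versus 6.6596), even though it takes longer to load and run than the @production2 model (4.72 s versus 263 ms wall time). Version 3 (@staging3) could not be evaluated because its model artifacts were not found in S3. I will change the alias @staging1 to @production1 and @production2 to @staging2.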
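
The alias change can be scripted as well. This is a minimal sketch, assuming `set_registered_model_alias` and `delete_registered_model_alias` are available on the client. `swap_aliases` is a hypothetical helper, and the alias names follow the ones used in this notebook.

```python
def swap_aliases(client, model_name, changes):
    """Replace old aliases with new ones.

    changes maps old alias -> (new alias, version), e.g.
    {"staging1": ("production1", 1)}.
    """
    for old_alias, (new_alias, version) in changes.items():
        # Attach the new alias first so the version is never left unlabeled.
        client.set_registered_model_alias(model_name, new_alias, str(version))
        client.delete_registered_model_alias(model_name, old_alias)
```

For the change described here, the call would be `swap_aliases(client, model_name, {'staging1': ('production1', 1), 'production2': ('staging2', 2)})`.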