# Get started with Model Registry

This notebook demonstrates how to use the MLflow Model Registry to:
* Train two regression models and compare their error metrics.
* Log a model to MLflow as a run artifact.
* Register model versions and add descriptions to them.
* Search registered models and transition a version between stages.
* Load a registered model for scoring, then archive and delete versions.

In [1]:
%load_ext autoreload
%autoreload 2

import joblib
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient
import pandas as pd
from pathlib import Path
from sklearn import ensemble, model_selection
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error

## Load Data

More information about the dataset can be found in the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset

Acknowledgement: Fanaee-T, Hadi, and Gama, Joao, 'Event labeling combining ensemble detectors and background knowledge', Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg

In [2]:
# Download original dataset with: python src/load_data.py 

raw_data = pd.read_csv("../data/raw_data.csv")
raw_data.head()

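| Unnamed: 0 | instant | dteday | season | yr | mnth | hr | holiday | weekday | workingday | weathersit | temp | atemp | hum | windspeed | casual | registered | cnt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2011-01-01 | 1 | 0 | 1 | 0 | 0 | 6 | 0 | 1 | 0.24 | 0.2879 | 0.81 | 0.0 | 3 | 13 | 16 |
| 1 | 2 | 2011-01-01 | 1 | 0 | 1 | 1 | 0 | 6 | 0 | 1 | 0.22 | 0.2727 | 0.8 | 0.0 | 8 | 32 | 40 |
| 2 | 3 | 2011-01-01 | 1 | 0 | 1 | 2 | 0 | 6 | 0 | 1 | 0.22 | 0.2727 | 0.8 | 0.0 | 5 | 27 | 32 |
| 3 | 4 | 2011-01-01 | 1 | 0 | 1 | 3 | 0 | 6 | 0 | 1 | 0.24 | 0.2879 | 0.75 | 0.0 | 3 | 10 | 13 |
| 4 | 5 | 2011-01-01 | 1 | 0 | 1 | 4 | 0 | 6 | 0 | 1 | 0.24 | 0.2879 | 0.75 | 0.0 | 0 | 1 | 1 |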


## Define column mapping

In [3]:
target = 'cnt'
prediction = 'prediction'
numerical_features = ['temp', 'atemp', 'hum', 'windspeed', 'mnth', 'hr', 'weekday']
categorical_features = ['season', 'holiday', 'workingday']

In [4]:
sample_data = raw_data.set_index('dteday').loc['2011-01-01 00:00:00':'2011-01-28 23:00:00'].reset_index()

print(sample_data.shape)

(594, 17)


In [5]:
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    sample_data[numerical_features + categorical_features],
    sample_data[target],
    test_size=0.3
)

print(X_train.shape)
print(X_test.shape)

(415, 10)
(179, 10)


## Train a Linear Regression Model

In [6]:
model_lr = LinearRegression()
model_lr.fit(X_train, y_train) 

model_lr_path = Path('../models/model_lr.joblib')
joblib.dump(model_lr, model_lr_path)

['../models/model_lr.joblib']

In [7]:
preds_lr = model_lr.predict(X_test)

# Mean squared error (MSE) and mean absolute error (MAE) on the held-out test set
mse = mean_squared_error(y_test, preds_lr)
mae = mean_absolute_error(y_test, preds_lr)

print(mse, mae)

2194.465319528649 34.00606717197546


## Train a RandomForestRegressor Model

In [8]:
model_rf = ensemble.RandomForestRegressor(random_state = 0, n_estimators = 50)
model_rf.fit(X_train, y_train) 

model_path = Path('../models/model_rf.joblib')
joblib.dump(model_rf, model_path)

['../models/model_rf.joblib']

In [9]:

preds_rf = model_rf.predict(X_test)

# Mean squared error (MSE) and mean absolute error (MAE) on the held-out test set
mse = mean_squared_error(y_test, preds_rf)
mae = mean_absolute_error(y_test, preds_rf)

print(mse, mae)

306.5309072625698 11.949608938547483
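
The first number in each printed pair above is the mean squared error, which is expressed in squared units of `cnt` and is therefore hard to compare with the MAE. A minimal sketch, assuming you want an error in the same units as the target: take the square root to obtain the root mean squared error (RMSE). The MSE values are copied from the outputs above (they will differ on a rerun, because the split is not seeded), and the helper name `rmse_from_mse` is ours, not part of scikit-learn.

```python
import math


def rmse_from_mse(mse: float) -> float:
    """Convert a mean squared error into a root mean squared error."""
    if mse < 0:
        raise ValueError("MSE cannot be negative")
    return math.sqrt(mse)


# MSE values copied from the two printed outputs above
mse_by_model = {
    "LinearRegression": 2194.465319528649,
    "RandomForestRegressor": 306.5309072625698,
}

rmse_by_model = {name: rmse_from_mse(mse) for name, mse in mse_by_model.items()}

for name, rmse in rmse_by_model.items():
    print(f"{name}: RMSE = {rmse:.2f}")
```

On these numbers the RMSE is roughly 46.8 rentals per hour for the linear model and 17.5 for the random forest, which matches the ranking suggested by the MAE values.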


# Model Registry

## Set up MLflow

In [10]:
# Point MLflow at the local tracking server
MLFLOW_TRACKING_URI = "http://localhost:5001"
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

# Create an MLflow client for registry operations
client = MlflowClient()
print(f"Client tracking uri: {client.tracking_uri}")

# Set experiment name
mlflow.set_experiment('4-Model-Registry')


2025/06/17 09:36:39 INFO mlflow.tracking.fluent: Experiment with name '4-Model-Registry' does not exist. Creating a new experiment.


Client tracking uri: http://localhost:5001


<Experiment: artifact_location='mlflow-artifacts:/995036023860049948', creation_time=1750127799921, experiment_id='995036023860049948', last_update_time=1750127799921, lifecycle_stage='active', name='4-Model-Registry', tags={}>
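
The cell above hard-codes the tracking server address. A minimal sketch, assuming you would rather pick the address up from the environment: `MLFLOW_TRACKING_URI` is the environment variable that MLflow itself reads, while the helper `resolve_tracking_uri` and the fallback constant are hypothetical names introduced here for illustration.

```python
import os

# Same address as in the cell above; used only when the variable is unset or blank
DEFAULT_TRACKING_URI = "http://localhost:5001"


def resolve_tracking_uri(default: str = DEFAULT_TRACKING_URI) -> str:
    """Return the tracking URI from the environment, or the default if unset or blank."""
    uri = os.environ.get("MLFLOW_TRACKING_URI", "").strip()
    return uri or default


tracking_uri = resolve_tracking_uri()
print(f"Using tracking URI: {tracking_uri}")
```

The result can then be passed to `mlflow.set_tracking_uri(tracking_uri)` in place of the constant.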

## Registering a Model

- Docs on [mlflow.sklearn.log_model](https://www.mlflow.org/docs/latest/python_api/mlflow.sklearn.html?highlight=save_model#mlflow.sklearn.log_model)

### Log the `model_lr` model

In [11]:
with mlflow.start_run() as run: 
    
    # Log the sklearn model as a run artifact (not registered: no registered_model_name is passed)
    mlflow.sklearn.log_model(
        sk_model=model_lr,
        artifact_path="LinearRegression"
    )

2025/06/17 09:36:40 INFO mlflow.system_metrics.system_metrics_monitor: Skip logging GPU metrics. Set logger level to DEBUG for more details.
2025/06/17 09:36:40 INFO mlflow.system_metrics.system_metrics_monitor: Started monitoring system metrics.
2025/06/17 09:36:41 INFO mlflow.system_metrics.system_metrics_monitor: Stopping system metrics monitoring...
2025/06/17 09:36:41 INFO mlflow.system_metrics.system_metrics_monitor: Successfully terminated system metrics monitoring!


🏃 View run nebulous-moth-252 at: http://localhost:5001/#/experiments/995036023860049948/runs/985c64836f3a4c369dd8109c2af11c31
🧪 View experiment at: http://localhost:5001/#/experiments/995036023860049948


### Log and Register the `model_rf` model (3 times)

- Use `registered_model_name` to register a model automatically.
- If a registered model with the name doesn’t exist, the method registers a new model and creates `Version 1`.
- If a registered model with the name exists, the method creates a new model version.

INSTRUCTION:
- Run the cell below 3 times to register 3 versions of the "RandomForest" model.
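
The version-numbering rule described above can be illustrated without a tracking server. The sketch below is a toy in-memory stand-in, not MLflow code: `ToyRegistry` is a hypothetical class that mimics only the "first registration creates version 1, later ones create the next version" behavior and nothing else (no artifacts, no deletion).

```python
from collections import defaultdict


class ToyRegistry:
    """In-memory stand-in that mimics only the version-numbering rule."""

    def __init__(self) -> None:
        # Maps a registered model name to the list of its version numbers
        self._versions = defaultdict(list)

    def register(self, name: str) -> int:
        """Create the next version for `name` and return its number."""
        next_version = len(self._versions[name]) + 1
        self._versions[name].append(next_version)
        return next_version


registry = ToyRegistry()

# Registering the same name three times, like running the cell below three times
created = [registry.register("RandomForest") for _ in range(3)]
print(created)
```

Registering a different name in the same toy registry would start again at version 1, because versions are counted per registered model name.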

In [15]:
from mlflow.models import infer_signature

with mlflow.start_run() as run: 

    # Show newly created run metadata info
    print("Experiment id: {}".format(run.info.experiment_id))
    print("Run id: {}".format(run.info.run_id))
    print("Run name: {}".format(run.info.run_name))
    print('MLFlow tracking uri:', mlflow.get_tracking_uri())
    print('MLFlow artifact uri:', mlflow.get_artifact_uri())
    run_id = run.info.run_id

    # Infer the model signature from the model inputs and the predictions
    signature = infer_signature(X_test, preds_rf)

    # Log the sklearn model and register it; each run of this cell creates a new version
    mlflow.sklearn.log_model(
        sk_model=model_rf,
        artifact_path="RandomForest",
        signature=signature,
        registered_model_name="RandomForest",
    )

2025/06/17 09:37:23 INFO mlflow.system_metrics.system_metrics_monitor: Skip logging GPU metrics. Set logger level to DEBUG for more details.
2025/06/17 09:37:23 INFO mlflow.system_metrics.system_metrics_monitor: Started monitoring system metrics.


Experiment id: 995036023860049948
Run id: c054d6de996649ca8e401e819cec4df9
Run name: capricious-bear-771
MLFlow tracking uri: http://localhost:5001
MLFlow artifact uri: mlflow-artifacts:/995036023860049948/c054d6de996649ca8e401e819cec4df9/artifacts


Registered model 'RandomForest' already exists. Creating a new version of this model...
2025/06/17 09:37:24 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: RandomForest, version 3
Created version '3' of model 'RandomForest'.
2025/06/17 09:37:24 INFO mlflow.system_metrics.system_metrics_monitor: Stopping system metrics monitoring...
2025/06/17 09:37:24 INFO mlflow.system_metrics.system_metrics_monitor: Successfully terminated system metrics monitoring!


🏃 View run capricious-bear-771 at: http://localhost:5001/#/experiments/995036023860049948/runs/c054d6de996649ca8e401e819cec4df9
🧪 View experiment at: http://localhost:5001/#/experiments/995036023860049948


In [16]:
# Add or update the description of a model version

client.update_model_version(
    name="RandomForest",
    version=3,
    description="This a model version 3 description added with update_model_version() method",
)

# Note: if this call raises an error, make sure you ran the previous cell 3 times so that version 3 exists.

<ModelVersion: aliases=[], creation_timestamp=1750127844737, current_stage='None', deployment_job_state=<ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>, description='This a model version 3 description added with update_model_version() method', last_updated_timestamp=1750127844946, metrics=None, model_id=None, name='RandomForest', params=None, run_id='c054d6de996649ca8e401e819cec4df9', run_link='', source='models:/m-351207e758a349c59419747e10ad6ba4', status='READY', status_message=None, tags={}, user_id='', version='3'>
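
The call above hard-codes `version=3`, which fails if the registration cell was run fewer than three times. A minimal sketch, assuming you want to look the number up instead: the helper below takes any object that behaves like `MlflowClient` (it only needs `search_model_versions`, which this notebook also uses further down) and returns the highest version number. The name `latest_version` is hypothetical.

```python
def latest_version(client, name: str) -> int:
    """Return the highest version number registered under `name`.

    `client` is expected to behave like `MlflowClient`: it must provide
    `search_model_versions(filter_string)` returning objects with a
    `version` attribute.
    """
    # MLflow reports versions as strings ('3'), so convert before comparing
    versions = [int(mv.version) for mv in client.search_model_versions(f"name='{name}'")]
    if not versions:
        raise LookupError(f"No versions found for registered model {name!r}")
    return max(versions)
```

With the notebook's `client`, the description update could then use `version=latest_version(client, "RandomForest")` rather than a fixed number.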

## Discover models and their stages

In [17]:
from pprint import pprint

# List all registered models

for rm in client.search_registered_models():
    pprint(dict(rm), indent=4)

{   'aliases': {},
    'creation_timestamp': 1750127531192,
    'deployment_job_id': '',
    'deployment_job_state': 'DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED',
    'description': '',
    'last_updated_timestamp': 1750127531195,
    'latest_versions': [   <ModelVersion: aliases=[], creation_timestamp=1750127531195, current_stage='None', deployment_job_state=<ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>, description='', last_updated_timestamp=1750127531195, metrics=None, model_id=None, name='1-get-started', params=None, run_id='a313ad423c1a4050a048f96cdca765dd', run_link='', source='models:/m-9d4e6b2494b44cb48e33fdd1bed7c1a6', status='READY', status_message=None, tags={}, user_id='', version='1'>],
    'name': '1-get-started',
    'tags': {}}
{   'aliases': {},
    'creation_timestamp': 1750127802960,
    'deployment_job_id': '',
    'deployment_jo

In [18]:
# Search for a specific model name and list its version details 

for mv in client.search_model_versions("name='RandomForest'"):
    pprint(dict(mv), indent=4)

{   'aliases': [],
    'creation_timestamp': 1750127844737,
    'current_stage': 'None',
    'deployment_job_state': <ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>,
    'description': 'This a model version 3 description added with '
                   'update_model_version() method',
    'last_updated_timestamp': 1750127844946,
    'metrics': None,
    'model_id': None,
    'name': 'RandomForest',
    'params': None,
    'run_id': 'c054d6de996649ca8e401e819cec4df9',
    'run_link': '',
    'source': 'models:/m-351207e758a349c59419747e10ad6ba4',
    'status': 'READY',
    'status_message': None,
    'tags': {},
    'user_id': '',
    'version': '3'}
{   'aliases': [],
    'creation_timestamp': 1750127842649,
    'current_stage': 'None',
    'deployment_job_state': <ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_J

## Transitioning a model stage

In [19]:
# Over the course of the model’s lifecycle, a model evolves—from development to staging to production. 
# You can transition a registered model to one of the stages: Staging, Production or Archived.

client.transition_model_version_stage(
    name="RandomForest", version=3, stage="Production"
)


<ModelVersion: aliases=[], creation_timestamp=1750127844737, current_stage='Production', deployment_job_state=<ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>, description='This a model version 3 description added with update_model_version() method', last_updated_timestamp=1750127847722, metrics=None, model_id=None, name='RandomForest', params=None, run_id='c054d6de996649ca8e401e819cec4df9', run_link='', source='models:/m-351207e758a349c59419747e10ad6ba4', status='READY', status_message=None, tags={}, user_id='', version='3'>
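
Recent MLflow releases deprecate model stages in favor of model version aliases. A minimal sketch of the alias-based equivalent, assuming a client that behaves like `MlflowClient` and exposes `set_registered_model_alias` and `get_model_version_by_alias`: the helper name `promote_with_alias` and the default alias `"champion"` are our own choices for illustration.

```python
def promote_with_alias(client, name: str, version: int, alias: str = "champion") -> str:
    """Point `alias` at the given model version and return the version it resolves to.

    `client` is expected to behave like `MlflowClient`.
    """
    # Reassigning an existing alias moves it to the new version
    client.set_registered_model_alias(name=name, alias=alias, version=str(version))

    # Read the alias back to confirm which version it now points at
    resolved = client.get_model_version_by_alias(name=name, alias=alias)
    return str(resolved.version)
```

After `promote_with_alias(client, "RandomForest", 3)`, the model can be addressed as `models:/RandomForest@champion` instead of by stage.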

## Download and use models from the registry

In [20]:
model_version_uri = "models:/example-model@Champion"
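
The cell above shows the alias form of a registry URI, while the next cell uses the version form. A minimal sketch of both forms, assuming the `models:/<name>/<version>` and `models:/<name>@<alias>` patterns; the helper `model_uri` is a hypothetical convenience function, not an MLflow API.

```python
from typing import Optional


def model_uri(name: str, version: Optional[int] = None, alias: Optional[str] = None) -> str:
    """Build a `models:/` URI that addresses a model by version number or by alias."""
    # Exactly one of the two selectors must be given
    if (version is None) == (alias is None):
        raise ValueError("Pass exactly one of `version` or `alias`")
    if version is not None:
        return f"models:/{name}/{version}"
    return f"models:/{name}@{alias}"


print(model_uri("RandomForest", version=3))
print(model_uri("example-model", alias="Champion"))
```

Either string can be passed to `mlflow.sklearn.load_model(...)`, provided the named model, version, or alias actually exists in the registry.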

In [21]:
# Load the model from the model registry and score
model_uri = f"models:/RandomForest/3"
loaded_model = mlflow.sklearn.load_model(model_uri)
loaded_model

Downloading artifacts: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 597.09it/s]


| Parameter | Value |
|---|---|
| n_estimators | 50 |
| criterion | 'squared_error' |
| max_depth | |
| min_samples_split | 2 |
| min_samples_leaf | 1 |
| min_weight_fraction_leaf | 0.0 |
| max_features | 1.0 |
| max_leaf_nodes | |
| min_impurity_decrease | 0.0 |
| bootstrap | True |

In [22]:
loaded_model.predict(X_test)

array([ 65.88,   6.04, 166.8 , 101.22,  39.24,   4.84,  29.44,  46.9 ,
        40.52,  29.76,  80.5 ,   4.82,  70.68,   4.  ,  34.88,  78.56,
        89.92,   2.16, 180.92,  63.62,  51.88,  83.54,   9.1 ,  59.38,
        86.84,  19.78, 141.82,   8.84,  17.7 ,  57.58,  81.72, 119.94,
        64.82,  55.76, 128.76,  88.56,  68.14,  59.26,   5.96,  21.84,
        50.32,  61.  ,  59.  ,   1.84,   5.08,   5.48,  26.64,  76.66,
         9.18,  25.88,  22.58,   4.52,  34.74,  40.48,  61.48,  78.88,
        40.68,  37.08, 182.34, 143.32,  62.42,   2.94,  78.52,  42.66,
         2.04,  13.58,  17.64,  10.5 ,  42.78, 149.6 ,   6.04,  60.4 ,
        69.98,  60.32,   3.1 , 143.08,  33.78,  74.58,   2.8 ,  20.82,
        80.68,  57.98, 157.6 , 105.14,  24.36,  23.2 ,  34.52, 105.48,
        84.02, 148.52,  41.48,  17.3 , 136.04,  84.42,  62.92,  96.7 ,
        62.88, 149.06,  19.82, 190.74,  81.3 ,  67.56,  78.66,   4.2 ,
        43.36,  53.98,   3.64,  65.16,  80.8 ,  68.04,  68.6 ,  17.6 ,
      

## Deregistering, Deleting and Archiving models 

In [23]:
# Archive version 2 of the model by moving it to the Archived stage

client = MlflowClient()
client.transition_model_version_stage(
    name="RandomForest", version=2, stage="Archived"
)



<ModelVersion: aliases=[], creation_timestamp=1750127842649, current_stage='Archived', deployment_job_state=<ModelVersionDeploymentJobState: current_task_name='', job_id='', job_state='DEPLOYMENT_JOB_CONNECTION_STATE_UNSPECIFIED', run_id='', run_state='DEPLOYMENT_JOB_RUN_STATE_UNSPECIFIED'>, description='', last_updated_timestamp=1750127849693, metrics=None, model_id=None, name='RandomForest', params=None, run_id='022cba340541489ebe791423bf23c462', run_link='', source='models:/m-d0694f50c2a34a50a1abe501a25deabb', status='READY', status_message=None, tags={}, user_id='', version='2'>

In [24]:
# Note: Deleting registered models or model versions is irrevocable, so use it judiciously.

# Delete version 1 of the model
client.delete_model_version(
    name="RandomForest", version=1
)
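
Because deletion cannot be undone, it can help to check a version's stage first. A minimal sketch, assuming a client that behaves like `MlflowClient` (`get_model_version` and `delete_model_version`) and assuming that versions in `Production` or `Staging` should never be deleted; the helper name and the set of protected stages are hypothetical choices, not MLflow defaults.

```python
# Assumption: stages from which we refuse to delete a version
PROTECTED_STAGES = {"Production", "Staging"}


def delete_version_safely(client, name: str, version: int) -> bool:
    """Delete a model version unless it is in a protected stage.

    Returns True if the version was deleted, False if it was skipped.
    """
    mv = client.get_model_version(name=name, version=str(version))
    if mv.current_stage in PROTECTED_STAGES:
        print(f"Skipping {name} v{version}: currently in stage {mv.current_stage!r}")
        return False
    client.delete_model_version(name=name, version=str(version))
    return True
```

With the registry state built in this notebook, `delete_version_safely(client, "RandomForest", 3)` would skip the deletion, since version 3 was moved to `Production` earlier.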