The vetiver framework offers functions to fluently compute, store, and plot model metrics. These functions are particularly suited to monitoring your model using multiple performance metrics over time.

When a model is deployed, new data arrives over time. Even if the model does not use dates as a feature for prediction, its performance can still change with time, so the date of each observation is a useful variable to monitor against.

## Build a model


In [1]:
#| output: false

from vetiver import VetiverModel
from pins import board_folder

model_board = board_folder(".", allow_pickle_read=True)
v = VetiverModel.from_pin(model_board, "cars")

## Compute metrics

Let's say we collect new data on fuel efficiency in cars and we want to monitor the performance of our model over time. We can compute multiple metrics at once over a certain time aggregation.


In [2]:
import vetiver

import pandas as pd
from sklearn import metrics
from datetime import timedelta

cars = pd.read_csv("https://vetiver.rstudio.com/get-started/new-cars.csv")

# score the first 14 rows with the deployed model
original_cars = cars.iloc[:14, :].copy()
original_cars["preds"] = v.model.predict(
    original_cars.drop(columns=["date_obs", "mpg"])
)

# metrics to compute for each time period
metric_set = [
    metrics.mean_absolute_error,
    metrics.mean_squared_error,
    metrics.r2_score,
]

# aggregate metrics over one-week periods
td = timedelta(weeks=1)

original_metrics = vetiver.compute_metrics(
    data = original_cars, 
    date_var = "date_obs", 
    period = td, 
    metric_set = metric_set, 
    truth = "mpg", 
    estimate = "preds"
)

original_metrics

|   | index | n | metric | estimate |
|---|---|---|---|---|
| 0 | 2022-03-24 | 7 | mean_absolute_error | 1.784605 |
| 1 | 2022-03-24 | 7 | mean_squared_error | 4.158348 |
| 2 | 2022-03-24 | 7 | r2_score | 0.679499 |
| 3 | 2022-03-31 | 7 | mean_absolute_error | 1.45855 |
| 4 | 2022-03-31 | 7 | mean_squared_error | 3.370279 |
| 5 | 2022-03-31 | 7 | r2_score | 0.892011 |
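
To make the time aggregation more concrete, here is a rough pandas-only sketch of how metrics could be computed per period. It is not vetiver's implementation: `compute_metrics_sketch` and the two small metric functions are hypothetical stand-ins written for illustration, the toy values are made up, and the windowing rule (fixed-length windows counted from the earliest date) is an assumption. Any callable that takes `(y_true, y_pred)`, such as the scikit-learn metrics used above, should fit the same pattern.

```python
from datetime import timedelta

import pandas as pd


def mean_absolute_error(y_true, y_pred):
    """Plain-pandas stand-in for a metric taking (y_true, y_pred)."""
    return float((y_true - y_pred).abs().mean())


def mean_squared_error(y_true, y_pred):
    """Plain-pandas stand-in for a metric taking (y_true, y_pred)."""
    return float(((y_true - y_pred) ** 2).mean())


def compute_metrics_sketch(data, date_var, period, metric_set, truth, estimate):
    """Hypothetical helper: score each time window with each metric."""
    df = data.copy()
    df[date_var] = pd.to_datetime(df[date_var])
    step = pd.Timedelta(period)
    origin = df[date_var].min()
    # Assumption: windows have length `period` and start at the earliest date.
    # window_id is 0 for the first window, 1 for the next, and so on.
    window_id = (df[date_var] - origin) // step
    rows = []
    for k, window in df.groupby(window_id):
        for metric in metric_set:
            rows.append(
                {
                    "index": origin + int(k) * step,  # start of the window
                    "n": len(window),  # observations in the window
                    "metric": metric.__name__,
                    "estimate": metric(window[truth], window[estimate]),
                }
            )
    return pd.DataFrame(rows, columns=["index", "n", "metric", "estimate"])


# Toy data (made-up values) spanning two one-week windows
toy = pd.DataFrame(
    {
        "date_obs": ["2022-03-24", "2022-03-25", "2022-03-31", "2022-04-01"],
        "mpg": [21.0, 22.0, 30.0, 18.0],
        "preds": [20.0, 24.0, 27.0, 19.0],
    }
)

toy_metrics = compute_metrics_sketch(
    toy,
    date_var="date_obs",
    period=timedelta(weeks=1),
    metric_set=[mean_absolute_error, mean_squared_error],
    truth="mpg",
    estimate="preds",
)
print(toy_metrics)
```

The result is in the same long format as the table above: one row per period and metric, which makes it straightforward to append new periods later.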


## Pin metrics

The first time you pin monitoring metrics, you can write them to a board with `pin_write()`, as you would any other pin.


In [3]:
#| output: false
model_board.pin_write(original_metrics, "tree_metrics", type = "csv")

Writing pin:
Name: 'tree_metrics'
Version: 20220809T163836Z-dbf62


Meta(title='tree_metrics: a pinned 6 x 4 DataFrame', description=None, created='20220809T163836Z', pin_hash='dbf62f6a203cff6e', file='tree_metrics.csv', file_size=309, type='csv', api_version=1, version=Version(created=datetime.datetime(2022, 8, 9, 16, 38, 36, 706771), hash='dbf62f6a203cff6e'), name='tree_metrics', user={})

However, as you gather new data and add new metrics to the pin, some dates may overlap with those already stored, depending on your monitoring strategy. The `overwrite` argument of `vetiver.pin_metrics()` controls how overlapping dates are handled.


In [4]:
#| output: false
# dates overlap with existing metrics:
new_cars = cars.iloc[7:, :].copy()
new_cars["preds"] = v.model.predict(
    new_cars.drop(columns=["date_obs", "mpg"])
)

new_metrics = vetiver.compute_metrics(
    data = new_cars, 
    date_var = "date_obs", 
    period = td, 
    metric_set = metric_set, 
    truth = "mpg", 
    estimate = "preds"
)
                    
vetiver.pin_metrics(
    model_board, 
    new_metrics, 
    "tree_metrics", 
    overwrite = True
)

Writing pin:
Name: 'tree_metrics'
Version: 20220809T163838Z-225f4


|   | index | n | metric | estimate |
|---|---|---|---|---|
| 0 | 2022-03-24 | 7 | mean_absolute_error | 1.784605 |
| 1 | 2022-03-24 | 7 | mean_squared_error | 4.158348 |
| 2 | 2022-03-24 | 7 | r2_score | 0.679499 |
| 3 | 2022-03-31 | 7 | mean_absolute_error | 1.45855 |
| 4 | 2022-03-31 | 7 | mean_squared_error | 3.370279 |
| 5 | 2022-03-31 | 7 | r2_score | 0.892011 |
| 6 | 2022-04-07 | 7 | mean_absolute_error | 2.62832 |
| 7 | 2022-04-07 | 7 | mean_squared_error | 10.673068 |
| 8 | 2022-04-07 | 7 | r2_score | 0.621554 |
| 9 | 2022-04-14 | 7 | mean_absolute_error | 1.847171 |
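
As a way to think about what handling overlapping dates involves, the sketch below combines stored and new metrics with plain pandas. It is an illustration only, not how `vetiver.pin_metrics()` is implemented: `combine_metrics_sketch` is a hypothetical helper, the toy values are made up, and it assumes that overlapping dates are rejected unless `overwrite=True`, in which case the new rows replace the stored rows for those dates.

```python
import pandas as pd


def combine_metrics_sketch(old, new, overwrite=False):
    """Hypothetical helper: merge new metrics into previously stored metrics."""
    # Stored rows whose date also appears in the new metrics
    overlapping = old["index"].isin(new["index"])
    if overlapping.any() and not overwrite:
        # Assumption: without overwrite, overlapping dates are an error
        raise ValueError(
            "new metrics overlap dates already stored; use overwrite=True"
        )
    # Drop the stored rows for overlapping dates, then append the new rows
    combined = pd.concat([old[~overlapping], new], ignore_index=True)
    return combined.sort_values(["index", "metric"]).reset_index(drop=True)


# Toy metrics (made-up values): 2022-03-31 appears in both frames
old = pd.DataFrame(
    {
        "index": ["2022-03-24", "2022-03-31"],
        "n": [7, 7],
        "metric": ["mean_absolute_error", "mean_absolute_error"],
        "estimate": [1.0, 2.0],
    }
)
new = pd.DataFrame(
    {
        "index": ["2022-03-31", "2022-04-07"],
        "n": [7, 7],
        "metric": ["mean_absolute_error", "mean_absolute_error"],
        "estimate": [2.5, 3.0],
    }
)

merged = combine_metrics_sketch(old, new, overwrite=True)
print(merged)
```

Replacing whole dates rather than individual rows keeps every period internally consistent, since all metrics for a given date then come from the same batch of data.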


## Plot metrics

You can plot the computed metrics to visualize your model's performance over time.


In [5]:
#| eval: false
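# read back a specific version of the pinned metrics
monitoring_metrics = model_board.pin_read("tree_metrics", version="20220809T163838Z-225f4")
p = vetiver.plot_metrics(df_metrics = monitoring_metrics)
# let each y-axis scale independently instead of sharing one range
p.update_yaxes(matches=None)
p.write_image("../images/monitor.png")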

![plot of monitoring data](../images/monitor.png)