# 🖥️ Monitoring Regression Model Performance Metrics

In this tutorial, we'll show how to log performance metrics for your ML model with whylogs, and how to send them to your dashboard on the WhyLabs platform.
We'll follow a regression use case, where we're trying to predict the number of bike rides within the given hour, using data from [Kaggle's Bike Sharing Demand Dataset](https://www.kaggle.com/c/bike-sharing-demand).

We will:
- Download Bike data
- Train a regression model with scikit-learn
- Log Input/Output features with whylogs
- Log Performance Metrics (Targets and Predictions) with whylogs
- Show Performance summary at WhyLabs

# 🚴 The Data Story

In this example, we want to predict the total count of bikes rented during each hour for a given city bikeshare system, using information available prior to the rental period.

In real life, the ground truth may become available only after a certain lag, and it is possible that only a subsample of the predictions will ever be annotated. That is why WhyLabs lets you decouple the logged features from the performance metrics: you can send the metrics at a different time, and for a different number of records, than the features.

To simulate this scenario, we will log the input and output features for every prediction made and, subsequently, we'll also log the predictions, along with the ground truth, for only a subsample of the bike rides. Finally, we'll see how regression metrics are calculated and displayed from the logged information.

### Data Fields

* __datetime__ - hourly date + timestamp  
* __season__
    * 1 = spring
    * 2 = summer
    * 3 = fall
    * 4 = winter 
* __holiday__ - whether the day is considered a holiday
* __workingday__ - whether the day is neither a weekend nor holiday
* __weather__
    * 1: Clear, Few clouds, Partly cloudy
    * 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    * 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    * 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
* __temp__ - temperature in Celsius
* __atemp__ - "feels like" temperature in Celsius
* __humidity__ - relative humidity
* __windspeed__ - wind speed
* __casual__ - number of non-registered user rentals initiated
* __registered__ - number of registered user rentals initiated
* __(target) count__ - number of total rentals

For more information about the data, please check https://www.kaggle.com/c/bike-sharing-demand/data



# Installing Required Packages

In [1]:
%%sh
pip install --upgrade pip -q
# The whylogs.app API used in this tutorial belongs to the whylogs 0.x series
pip install "whylogs<1.0" -U -q
pip install scikit-learn -U -q

# Fetching the Data

Since we need the labels in order to calculate the metrics, we will train the model using only a fraction of the training set, and use the remainder of it as production data whose features and model performance we want to monitor.

In [2]:
import pandas as pd

df = pd.read_csv("https://whylabs-public.s3.us-west-2.amazonaws.com/whylogs_examples/bike_sharing_2012.csv")

# This number (8151) is just to split right at the beginning of a given day (2012-07-01)
df_val = df.iloc[8151:,:]
df = df.iloc[:8151,:]
df_val.head()

Unnamed: 0,datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
8151,2012-07-01 00:00:00,3,0,0,1,31.16,36.365,66,0.0,27,122,149
8152,2012-07-01 01:00:00,3,0,0,1,30.34,34.85,70,8.9981,12,81,93
8153,2012-07-01 02:00:00,3,0,0,1,29.52,34.85,74,6.0032,21,69,90
8154,2012-07-01 03:00:00,3,0,0,1,29.52,35.605,84,8.9981,6,27,33
8155,2012-07-01 04:00:00,3,0,0,1,28.7,33.335,79,12.998,0,4,4
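
The hard-coded split index can also be derived from the timestamps themselves. The following is a minimal sketch on synthetic data, assuming the rows are sorted by `datetime`; `toy`, `cutoff`, and `split_idx` are illustrative names and not part of the tutorial's dataset.

```python
import pandas as pd

# Synthetic hourly timestamps standing in for the sorted `datetime` column.
start = pd.Timestamp("2012-06-30 20:00:00")
toy = pd.DataFrame({"datetime": [start + pd.Timedelta(hours=i) for i in range(8)]})

# Position of the first row at or after midnight of the chosen day
# (searchsorted only gives a meaningful answer on sorted timestamps).
cutoff = pd.Timestamp("2012-07-01")
split_idx = int(toy["datetime"].searchsorted(cutoff))

toy_train = toy.iloc[:split_idx, :]
toy_val = toy.iloc[split_idx:, :]
print(split_idx, toy_val["datetime"].iloc[0])
```

Computing the index this way keeps the split on a day boundary even if the file gains or loses rows.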


Let's have a look at our training data. Our target feature is `count`.

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8151 entries, 0 to 8150
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   datetime    8151 non-null   object 
 1   season      8151 non-null   int64  
 2   holiday     8151 non-null   int64  
 3   workingday  8151 non-null   int64  
 4   weather     8151 non-null   int64  
 5   temp        8151 non-null   float64
 6   atemp       8151 non-null   float64
 7   humidity    8151 non-null   int64  
 8   windspeed   8151 non-null   float64
 9   casual      8151 non-null   int64  
 10  registered  8151 non-null   int64  
 11  count       8151 non-null   int64  
dtypes: float64(3), int64(8), object(1)
memory usage: 764.3+ KB


# 📖 Feature Preprocessing & Training the Model

We will train a Random Forest regression model with scikit-learn. First, we'll apply some basic preprocessing and feature transformations so the data better suits the model.

We won't go into the details of model training in this example. The training code is heavily based on this [Kaggle Notebook](https://www.kaggle.com/rajmehra03/bike-sharing-demand-rmsle-0-3194/notebook), which also includes EDA if you want more background.

Let's define a function that transforms the features for both the training and validation sets. In addition to the input and target features, we'll need the timestamps to log the data in daily batches, so the function returns all three.

In [4]:
def transform_features(df):
    season=pd.get_dummies(df['season'],prefix='season')
    df=pd.concat([df,season],axis=1)
    weather=pd.get_dummies(df['weather'],prefix='weather')
    df=pd.concat([df,weather],axis=1)
    df.drop(['season','weather'],inplace=True,axis=1)
    df["hour"] = [t.hour for t in pd.DatetimeIndex(df.datetime)]
    df["day"] = [t.dayofweek for t in pd.DatetimeIndex(df.datetime)]
    df["month"] = [t.month for t in pd.DatetimeIndex(df.datetime)]
    df['year'] = [t.year for t in pd.DatetimeIndex(df.datetime)]
    df['year'] = df['year'].map({2011:0, 2012:1})
    df.drop(['casual','registered'],axis=1,inplace=True)
    
    # If a given season or weather code (1-4) never appears in df, add an all-zero column for it
    for i in range(1,5):
        season_i = "season_{}".format(i)
        weather_i = "weather_{}".format(i)
        if season_i not in df.columns:
            df[season_i]=0
        if weather_i not in df.columns:
            df[weather_i]=0

    return df['datetime'],df.drop(['count','datetime'],axis=1),df['count']
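
An alternative to adding the missing columns by hand is to declare the full set of categories before encoding. This is a minimal sketch of that idea, not the approach used in the function above; `one_hot_fixed` and `toy_season` are hypothetical names introduced for illustration.

```python
import pandas as pd

def one_hot_fixed(values, categories, prefix):
    """One-hot encode `values`, always emitting one column per entry in `categories`."""
    cat = pd.Series(pd.Categorical(values, categories=categories))
    return pd.get_dummies(cat, prefix=prefix)

# Season 4 never occurs in this toy input, yet its column is still created (all zeros).
toy_season = one_hot_fixed([1, 2, 2, 3], categories=[1, 2, 3, 4], prefix="season")
print(list(toy_season.columns))
```

Because the categories are fixed up front, the training and validation sets end up with the same columns regardless of which values each one happens to contain.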

In [5]:
from sklearn.ensemble import RandomForestRegressor
import numpy as np
from sklearn.metrics import mean_squared_log_error

_,x_train,y_train = transform_features(df)
dates,x_val,y_val = transform_features(df_val)


clf = RandomForestRegressor()
clf.fit(x_train,y_train)

x_val = x_val[x_train.columns]
val_pred = clf.predict(x_val)
# scikit-learn metrics take the arguments in (y_true, y_pred) order
rmsle = np.sqrt(mean_squared_log_error(y_val,val_pred))
print(rmsle)


0.33426422991287585
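
The score printed above is the root mean squared logarithmic error (RMSLE). To make the definition concrete, here is a minimal standard-library sketch that applies `log(1 + x)` to both values before taking the squared difference; the sample numbers are made up for illustration.

```python
import math

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))

# Made-up targets and predictions, only to exercise the function.
print(rmsle([149, 93, 90], [140, 100, 90]))
```

Because the error is measured on a log scale, being off by 10 rides matters more at an hour with 20 rentals than at an hour with 500.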


# ✔️ Setup WhyLabs/Credentials

In order to monitor our model's performance, let's first set up a WhyLabs account.
We will need two pieces of information:

- API token
- Organization ID

Go to https://whylabs.ai/free and create a free account. You can work through the quick start examples there if you wish, but if you only want to follow this demonstration, you can skip them.

After that, you'll be prompted to create an API token. Once you create it, copy it and store it locally. The second important piece of information is your org ID, so take note of it as well. WhyLabs shows you example code for creating a session and sending data to your dashboard; you can run it to check that data is getting through. Otherwise, once you have your API token and org ID, go to https://hub.whylabsapp.com/models to see your model's dashboard. The [WhyLabs API Documentation](https://docs.whylabs.ai/docs/whylabs-api/), which we followed for this step, has more information about token creation and basic usage examples.


Now, when you run the code below, you'll be prompted for the Org ID and API token you just created.

In [1]:
import datetime
import getpass
import os
from datetime import timedelta

import pandas as pd
from whylogs.app import Session
from whylogs.app.writers import WhyLabsWriter



# set your org-id here
print("Enter your WhyLabs Org ID")
os.environ["WHYLABS_DEFAULT_ORG_ID"] = input()

# set your API key here
print("Enter your WhyLabs API key")
os.environ["WHYLABS_API_KEY"] = getpass.getpass()
print("Using API Key ID: ", os.environ["WHYLABS_API_KEY"][0:10])

# Adding the WhyLabs Writer to utilize WhyLabs platform
writer = WhyLabsWriter()

session = Session(writers=[writer])

Enter your WhyLabs Org ID
Enter your WhyLabs API key
Using API Key ID:  xxtIbfnVKB
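
If you rerun the notebook often, or run it non-interactively, you may prefer to prompt only when a credential is not already set. The helper below is a hypothetical sketch (`ensure_env` is not part of whylogs); it relies only on the two environment variable names used in the cell above.

```python
import getpass
import os

def ensure_env(name, prompt, secret=False):
    """Return os.environ[name], prompting for it only if it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        # getpass hides the typed value, which suits API keys.
        value = getpass.getpass(prompt) if secret else input(prompt)
        os.environ[name] = value
    return value

# Example usage (commented out so that nothing prompts when the cell is merely loaded):
# ensure_env("WHYLABS_DEFAULT_ORG_ID", "Enter your WhyLabs Org ID: ")
# ensure_env("WHYLABS_API_KEY", "Enter your WhyLabs API key: ", secret=True)
```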


# 📊 Profiling Input/Output Data

We will first profile input/output data. Let's join the input features with the output feature, and then group the data according to the date.

We will create profiles on a daily basis for a period of 7 days. The original dates run from Feb-11-2012 to Feb-17-2012. However, let's shift them closer to the current date so the data is logged within the last 7 days.

In [7]:
# features_df contains the input features as well as the output feature (the model's predictions).
# Copy x_val so that the extra columns are not added to the validation inputs themselves.
features_df = x_val.copy()
# Adding 'output' in the output feature's name enables WhyLabs to recognize it as an output feature.
features_df['output_count'] = val_pred
features_df['date'] = dates
features_df['date'] = pd.to_datetime(features_df['date'])

Remember to enter a `datasetId` that matches your model's ID (if you just created your account, it will be `model-1`).

In [12]:
# Run whylogs on historical data and upload to WhyLabs.
# Stick to data from the prior seven days for now.
import datetime

print("Enter your Dataset ID")
datasetID = input()


now = datetime.datetime.now()
for day in range(0, 7):
    timestamp = now - timedelta(days=6) + timedelta(days=day)
    # The first original day is Feb 11; it is logged as if it happened 6 days ago, so we offset the day of month by 11
    original_date = 11 + day
    cond = (features_df['date'].dt.day==original_date) & (features_df['date'].dt.month==2)
    daily_features = features_df.loc[cond]
    # We don't need date anymore
    daily_features = daily_features.drop(['date'],axis=1)

    with session.logger(
        # Note: 'datasetId' in whylogs maps to 'model-id' that is provided when you set up a model in WhyLabs
        tags={"datasetId": datasetID}, dataset_timestamp=timestamp
    ) as ylog:
        print("logging input/output features for day {}....".format(day))
        ylog.log_dataframe(daily_features)

logging input/output features for day 0....
logging input/output features for day 1....
logging input/output features for day 2....
logging input/output features for day 3....
logging input/output features for day 4....
logging input/output features for day 5....
logging input/output features for day 6....
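
The date arithmetic in the loop above can be checked in isolation. This minimal sketch reproduces only the mapping from the original day of month to the shifted timestamp; `shifted_days` and `fixed_now` are illustrative names, and the reference time is arbitrary.

```python
import datetime

def shifted_days(now, first_day=11, n_days=7):
    """Pair each original day of month with a timestamp in the n_days ending at `now`."""
    pairs = []
    for day in range(n_days):
        timestamp = now - datetime.timedelta(days=n_days - 1) + datetime.timedelta(days=day)
        pairs.append((first_day + day, timestamp))
    return pairs

fixed_now = datetime.datetime(2022, 3, 10, 12, 0)  # arbitrary reference time for illustration
for original_day, ts in shifted_days(fixed_now):
    print(original_day, ts.date())
```

The first original day maps to six days before the reference time and the last one maps to the reference time itself, which is why all seven profiles land inside the most recent week.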


# 📊 Profiling Model Metrics

Labels might become available only after a delay, and in some cases only a subsample of the predictions will eventually be annotated. In this example, we'll log the metrics separately and only for a subsample of the predictions. The goal is to show that you can log features and metrics separately, and for different numbers of records.

We'll basically repeat the process, but using `log_metrics` instead of `log_dataframe`. Let's assemble the labels, the predictions and the dates into a single dataframe, subsample it and then send it over to WhyLabs.

In [13]:
df_metrics = pd.DataFrame()
df_metrics['label'] = y_val
df_metrics['prediction'] = val_pred.astype(int)
df_metrics['date'] = pd.to_datetime(dates)

In [14]:
cond = (df_metrics['date'].dt.day>=11) & (df_metrics['date'].dt.month==2) & (df_metrics['date'].dt.day<18)
df_metrics = df_metrics.loc[cond]


In [15]:
metrics_subset = df_metrics.sample(frac=0.8)
metrics_subset = metrics_subset.sort_values(by=["date"])
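
Note that `sample` draws a different subset on every run unless it is seeded. The sketch below, on made-up data, assumes you want a reproducible 80% subsample; `toy_metrics` and `toy_subset` are illustrative names.

```python
import pandas as pd

toy_metrics = pd.DataFrame({
    "label": range(10),
    "prediction": range(10),
    "date": [pd.Timestamp("2012-02-11") + pd.Timedelta(hours=i) for i in range(10)],
})

# random_state fixes the draw so the same 80% of rows is selected on every run.
toy_subset = toy_metrics.sample(frac=0.8, random_state=0).sort_values(by=["date"])
print(len(toy_subset), "of", len(toy_metrics), "rows kept")
```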

In [16]:
# Run whylogs on historical data and upload to WhyLabs.
# Stick to data from the prior seven days for now.
now = datetime.datetime.now()

print("Enter your Dataset ID")
datasetID = input()

for day in range(0, 7):
    timestamp = now - timedelta(days=6) + timedelta(days=day)
    original_date = 11 + day
    # Use the 80% subsample, so that only part of the predictions is logged with labels
    cond = (metrics_subset['date'].dt.day==original_date) & (metrics_subset['date'].dt.month==2)
    daily_metrics = metrics_subset.loc[cond]
    # We don't need date anymore
    daily_metrics = daily_metrics.drop(['date'],axis=1)

    with session.logger(
        # Note: 'datasetId' in whylogs maps to 'model-id' that is provided when you set up a model in WhyLabs
        tags={"datasetId": datasetID}, dataset_timestamp=timestamp
    ) as ylog:
        ylog.log_metrics(targets=daily_metrics["label"].tolist(), 
                    predictions=daily_metrics["prediction"].tolist(), 
                    target_field="count",
                    prediction_field="output_count")

In [17]:
# Close the session once we're done.
session.close()

On your model's dashboard, you should see the model metrics for the last seven days. For regression, the displayed metrics are:

- Total output and input count
- Mean Squared Error
- Mean Absolute Error
- Root Mean Squared Error
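
For reference, the three error metrics listed above follow the usual textbook definitions. The sketch below computes them with the standard library on made-up numbers; it illustrates the definitions only and is not a description of how WhyLabs computes them internally.

```python
import math

def regression_metrics(targets, predictions):
    """Return MSE, MAE and RMSE for two equal-length sequences of numbers."""
    errors = [p - t for t, p in zip(targets, predictions)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}

# Made-up targets and predictions, only to exercise the function.
print(regression_metrics([3, 5, 2], [2, 5, 4]))
```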

![Regression metrics on the WhyLabs dashboard](images/regression_metrics.png)
