<font size=5>Machine Learning Model Validation Workshop, 2022</font>

<font size=4 color=blue>Session 2: Model Diagnostics and Validation</font>

By Aijun Zhang, July 6, 2022

This demo (based on BikeSharing data) covers: 

- Accuracy, WeakSpot and Overfit

- Robustness and Resilience Testing

- Reliability Testing

Today we mainly demonstrate PiML through its low-code interface. Forthcoming PiML tutorials will show how to automate these runs by calling the high-code APIs.

# Initialize PiML Experiment

1. Run `!pip install piml` to install the latest version of PiML.
2. In Google Colab, restart the runtime so that the newly installed version is used.
3. Initialize a new experiment with `piml.Experiment()`.

In [1]:
!pip install piml

In [None]:
from piml import Experiment
exp = Experiment()

# Load and Prepare Data

In [2]:
# Choose BikeSharing
exp.data_loader()

In [3]:
# Exclude these features one-by-one: "season", "workingday", "atemp" (highly correlated with others)
exp.data_summary()

Data Shape: (17379, 13)

In [4]:
exp.data_prepare()

In [5]:
exp.eda()

# Train Interpretable Models

In [6]:
# Choose EBM, GAMI-Net, and ReLU-DNN (dense vs. L1 regularization = 0.0005)
exp.model_train()

In [7]:
# Choose EBM
exp.model_interpret()

In [8]:
# Choose GAMI-Net
exp.model_interpret()

In [9]:
# Choose GAMI-Net or EBM to compare post-hoc explanation results
#   local: sample_id=0, check rank order/magnitude
#  global: FI (weekday), effect plots (hr, hr x weekday)
exp.model_explain()

# Diagnose/Compare

In [10]:
exp.model_diagnose()

In [11]:
import xgboost as xgb
exp.model_train(model=xgb.XGBRegressor(max_depth=7, n_estimators=500), name='XGBoost')

Register XGBoost Done

In [12]:
exp.model_compare()

# Appendix - High-code Automation

## Load and Prepare Data

In [None]:
# Choose BikeSharing  
exp.data_loader(data='BikeSharing')

In [None]:
# Exclude these features one-by-one: "season", "workingday", "atemp" (highly correlated with others)
exp.data_summary(feature_type={}, feature_exclude=["season", "workingday", "atemp"])
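
The exclusion above is motivated by high correlation among features. As a minimal sketch of how such a check might be done outside PiML, the block below computes pairwise correlations with pandas. The data is synthetic stand-in data (only the column names echo the BikeSharing features), and the 0.9 threshold is a hypothetical choice.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (NOT the real BikeSharing values): "atemp" is built
# as a noisy copy of "temp", so the pair is strongly correlated by construction.
rng = np.random.default_rng(0)
temp = rng.uniform(0.0, 1.0, size=1000)
df = pd.DataFrame({
    "temp": temp,
    "atemp": temp + rng.normal(0.0, 0.02, size=1000),
    "hum": rng.uniform(0.0, 1.0, size=1000),
})

# Absolute Pearson correlation between every pair of features.
corr = df.corr().abs()

# Collect each unordered feature pair whose correlation exceeds the
# (hypothetical) threshold; the diagonal is skipped.
threshold = 0.9
high_pairs = [
    (a, b, float(corr.loc[a, b]))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if corr.loc[a, b] > threshold
]
print(high_pairs)
```

One feature from each flagged pair is then a candidate for exclusion.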

In [None]:
# exp.data_prepare() #Low-code
exp.data_prepare(target='cnt', task_type='regression', test_ratio=0.2, random_state=0)

In [None]:
exp.eda(show='univariate', uni_feature='temp')

## Post-hoc Puzzle: XGBoost Model

In [None]:
import xgboost as xgb

model = xgb.XGBRegressor(max_depth=7, n_estimators=500)
exp.model_train(model=model, name='XGBoost')

In [None]:
# Check model performance
exp.model_diagnose(model="XGBoost", show='accuracy_table')

In [None]:
# Tree-based Variable Importance
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (6, 5)

feature_names = exp.get_feature_names()
model.get_booster().feature_names = feature_names
xgb.plot_importance(model, title="XGBoost Variable Importance", show_values=False)
plt.show()

In [None]:
# Permutation Feature Importance
exp.model_explain(model='XGBoost', show='pfi', figsize=(6,5))
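
For readers who want to see the idea behind permutation feature importance outside the PiML interface, here is a minimal sketch using scikit-learn's `permutation_importance`. The data and the random-forest model are synthetic stand-ins chosen for illustration; this is not a claim about how PiML computes its PFI plot.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: features 0 and 1 drive the target, feature 2 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle one column at a time on held-out data and record the score drop.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Feature indices ordered from most to least important.
ranking = np.argsort(result.importances_mean)[::-1]
print(ranking, result.importances_mean)
```

A feature whose shuffling barely changes the score contributes little to the model's predictions on that data.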

In [None]:
# SHAP Feature Importance
exp.model_explain(model='XGBoost', show='shap_fi', sample_size=500)

In [None]:
# PDP 
exp.model_explain(model='XGBoost', show='pdp', uni_feature='temp', figsize=(5,5))
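
A one-dimensional partial dependence curve can be computed by hand: force the chosen feature to each grid value for every row, predict, and average. The sketch below does this on synthetic data; `partial_dependence_1d` is a hypothetical helper written for illustration, not a PiML or scikit-learn function.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def partial_dependence_1d(model, X, feature_idx, grid):
    """Average prediction when one feature is forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # override the feature for every row
        averages.append(model.predict(X_mod).mean())
    return np.array(averages)


# Synthetic data: the target increases linearly in feature 0 only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 2))
y = 4.0 * X[:, 0] + rng.normal(0.0, 0.05, size=500)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

grid = np.linspace(0.05, 0.95, 10)
pdp = partial_dependence_1d(model, X, feature_idx=0, grid=grid)
print(np.round(pdp, 2))
```

Keeping the per-row predictions instead of averaging them would give the individual curves that an ICE plot draws.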

In [None]:
# ICE
exp.model_explain(model='XGBoost', show='ice', uni_feature='temp', figsize=(5,5))

In [None]:
# ALE
exp.model_explain(model='XGBoost', show='ale', uni_feature='temp', figsize=(5,5))

In [None]:
# Local - LIME
exp.model_explain(model='XGBoost', show='lime', sample_id=1)

In [None]:
# Local - SHAP (TreeSHAP)
exp.model_explain(model='XGBoost', show='shap_waterfall', sample_id=1)

## Post-hoc Puzzle: DNN Model

In [None]:
from sklearn.neural_network import MLPRegressor

clf = MLPRegressor(hidden_layer_sizes=[100]*4, activation="relu", random_state=0)
pipeline = exp.make_pipeline(model=clf, name='MLP')
pipeline.fit()
exp.register(pipeline=pipeline)

In [None]:
# Check model performance
exp.model_compare(models=['XGBoost', 'MLP'], show="accuracy_table", metric="MSE", figsize=(6,5))

In [None]:
# Permutation Feature Importance
exp.model_explain(model='MLP', show='pfi', figsize=(6,5))

In [None]:
# SHAP Feature Importance
exp.model_explain(model='MLP', show='shap_fi', sample_size=10)

In [None]:
# PDP 
exp.model_explain(model='MLP', show='pdp', uni_feature='temp', figsize=(5,5))

In [None]:
# ICE
exp.model_explain(model='MLP', show='ice', uni_feature='temp', figsize=(5,5))

In [None]:
# ALE 
exp.model_explain(model='MLP', show='ale', uni_feature='temp', figsize=(5,5))

In [None]:
# Local - LIME
exp.model_explain(model='MLP', show='lime', sample_id=1)

In [None]:
# Local - SHAP (KernelSHAP)
exp.model_explain(model='MLP', show='shap_waterfall', sample_id=1)

## Comparison and Benchmarking

In [None]:
exp.model_compare(models=['XGBoost', 'MLP'], show='accuracy_table')

In [None]:
exp.model_compare(models=['XGBoost', 'MLP'], show='robustness_perf', alpha=0.2)
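
A robustness test of this kind generally perturbs the test inputs and tracks how performance degrades. The block below is a rough sketch under stated assumptions: the data is synthetic, `perturbed_mse` and `noise_scale` are hypothetical names, and the code does not claim to reproduce how PiML interprets `alpha`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic regression data with a simple train/test split.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(0.0, 0.1, size=1000)
X_train, y_train = X[:800], y[:800]
X_test, y_test = X[800:], y[800:]

model = LinearRegression().fit(X_train, y_train)


def perturbed_mse(model, X, y, noise_scale, rng):
    """MSE after adding Gaussian noise of size noise_scale * per-feature std."""
    noise = rng.normal(0.0, 1.0, size=X.shape) * X.std(axis=0) * noise_scale
    return mean_squared_error(y, model.predict(X + noise))


# Re-seed the generator for each scale so only the noise size changes.
mse_by_scale = {
    scale: perturbed_mse(model, X_test, y_test, scale, np.random.default_rng(1))
    for scale in [0.0, 0.1, 0.2, 0.4]
}
for scale, mse in mse_by_scale.items():
    print(f"noise_scale={scale:.1f}  MSE={mse:.4f}")
```

A model whose error grows slowly as the noise scale increases is the more robust one under this perturbation scheme.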

In [None]:
# exp.model_compare(models=['XGBoost', 'MLP'],  show='reliability', alpha = 0.1)
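
Reliability of regression predictions is commonly assessed with split conformal prediction, where `alpha` is the target miscoverage rate. The sketch below illustrates that general technique on synthetic data. Whether the `reliability` option above follows exactly this procedure is an assumption, not something this notebook confirms.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 2))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.5, size=3000)

# Three disjoint parts: fit the model, calibrate the interval width, evaluate.
X_fit, y_fit = X[:1000], y[:1000]
X_cal, y_cal = X[1000:2000], y[1000:2000]
X_new, y_new = X[2000:], y[2000:]

model = LinearRegression().fit(X_fit, y_fit)

alpha = 0.1  # target miscoverage, i.e. intervals should cover about 90%

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.sort(np.abs(y_cal - model.predict(X_cal)))
n = len(scores)

# Finite-sample-corrected rank of the calibration quantile.
k = min(n, int(np.ceil((n + 1) * (1 - alpha))))
q_hat = scores[k - 1]

# Prediction interval: point prediction plus/minus the calibrated half-width.
pred_new = model.predict(X_new)
lower, upper = pred_new - q_hat, pred_new + q_hat
coverage = float(np.mean((y_new >= lower) & (y_new <= upper)))
print(f"half-width={q_hat:.3f}  empirical coverage={coverage:.3f}")
```

Empirical coverage well below `1 - alpha` on new data, or very wide intervals in some regions, would point to less reliable predictions.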

In [None]:
exp.model_compare(models=['XGBoost', 'MLP'], show='resilience_perf', immu_feature=None, alpha=0.2)
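
One simple way to think about a resilience check is to measure performance on the hardest part of the test set only. The sketch below computes the MSE over the fraction `alpha` of samples with the largest errors. `worst_sample_mse` is a hypothetical helper on synthetic predictions, and reading `alpha` as a worst-sample ratio is an assumption about the call above.

```python
import numpy as np


def worst_sample_mse(y_true, y_pred, alpha):
    """MSE restricted to the alpha fraction of samples with the largest errors."""
    sq_err = (np.asarray(y_true) - np.asarray(y_pred)) ** 2
    k = max(1, int(np.ceil(alpha * len(sq_err))))
    worst = np.sort(sq_err)[-k:]  # the k largest squared errors
    return float(worst.mean())


# Synthetic targets and predictions standing in for a fitted model's output.
rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
y_pred = y_true + rng.normal(0.0, 0.3, size=1000)

overall = worst_sample_mse(y_true, y_pred, alpha=1.0)  # all samples
worst_20 = worst_sample_mse(y_true, y_pred, alpha=0.2)  # worst 20% only
print(f"overall MSE={overall:.4f}  worst-20% MSE={worst_20:.4f}")
```

Comparing how far the worst-sample error sits above the overall error for each model gives a rough sense of which one degrades more gracefully.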