# Machine Learning Model Validation

June 21-23, 2023

This demo (based on BikeSharing data, a regression task) covers:

- Explainable Boosting Machines (EBM)

- XGBoost Depth 2 (XGB2)

- Generalized Additive Model with Interaction Network (GAMI-Net)

- Fast Interpretable Greedy-Tree Sums (FIGS)

## Install PiML Toolbox

- Run `!pip install piml` to install the latest version of PiML.
- In Google Colab, restart the runtime after installation so that the newly installed version is used.

In [None]:
!pip install piml



## Load and Prepare Data

Initialize a new experiment with `piml.Experiment()`.

In [None]:
from piml import Experiment
exp = Experiment()

Choose BikeSharing from the dropdown.

In [None]:
exp.data_loader()


Exclude these features one by one: "yr", "mnth", and "temp" (each is highly correlated with other features).

In [None]:
exp.data_summary()

Data Shape: (17379, 13)
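The reason given above for excluding features is high correlation with other features. As a rough illustration of how such a check can be done by hand, the sketch below computes the Pearson correlation coefficient for a few toy columns. The column names and values are hypothetical and are not taken from BikeSharing; the data summary panel remains the place to inspect the real correlations.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy columns (hypothetical): col_b tracks col_a closely, col_c does not.
col_a = [0.20, 0.35, 0.50, 0.65, 0.80]
col_b = [0.22, 0.36, 0.48, 0.66, 0.79]
col_c = [0.60, 0.30, 0.80, 0.40, 0.55]

r_close = pearson(col_a, col_b)  # near 1: the two columns are largely redundant
r_loose = pearson(col_a, col_c)  # near 0: no linear relationship
print(f"a vs b: {r_close:.3f}, a vs c: {r_loose:.3f}")
```

A coefficient close to 1 or -1 suggests that one of the two columns adds little new information, which is the usual motivation for dropping it before fitting an additive model.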

Prepare the dataset with the default settings.

In [None]:
exp.data_prepare()


## Train Interpretable Models


- Choose EBM, XGB2, GAMI-Net, and FIGS; then click on the "RUN" button.
- Once training has finished, select each of these 4 models in the second dropdown and click the "Register" button, one model at a time.

In [None]:
exp.model_train()


## Global Model Interpretation for Functional ANOVA Models

Choose [EBM](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/ebm.html), [XGB2](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/xgb2.html), or [GAMI-Net](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/gaminet.html).

- Switch to the "Global-Interpretability" tab.

- Try the following options to view the different aspects of the model:

    - **Feature**: choose the name of the main effect (feature).

    - **Interaction effect**: choose the name of the interaction effect.

    - **Shown in original scale**: enable the check box to display the features in their original scale instead of min-max scaled to the range 0 to 1.

- The displayed results include:

    - **Feature Importance**: displays the top-10 features' importance.

    - **Effect Importance**: displays the importance of the top-10 effects. The effects include main effects (one per feature) and pairwise interactions (one per selected two-feature interaction).

    - **1D / 2D Effect Plot**: displays the estimated effect against the corresponding features.
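To make the distinction between main effects, pairwise interactions, and their importance more concrete, here is a minimal sketch of an additive model of the form intercept + main effects + pairwise interactions. Everything in it is an assumption for illustration: the effect functions, the feature names `x1` and `x2`, and the choice of the variance of each effect over the data as its importance. PiML's own importance definition may differ by model.

```python
from statistics import pvariance

# Hypothetical fitted effects on features scaled to [0, 1]; not PiML output.
INTERCEPT = 10.0
EFFECTS = {
    "x1": lambda row: 2.0 * (row["x1"] - 0.5),                          # main effect
    "x2": lambda row: 1.0 * (row["x2"] - 0.5),                          # main effect
    "x1 & x2": lambda row: 4.0 * (row["x1"] - 0.5) * (row["x2"] - 0.5), # interaction
}

def predict(row):
    """Prediction is the intercept plus the sum of all effect values."""
    return INTERCEPT + sum(effect(row) for effect in EFFECTS.values())

def effect_importance(rows):
    """Assumed definition: variance of each effect over the rows, normalized to sum to 1."""
    variances = {name: pvariance([effect(r) for r in rows])
                 for name, effect in EFFECTS.items()}
    total = sum(variances.values())
    return {name: v / total for name, v in variances.items()}

# Toy data: a 5 x 5 grid over the two features.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
rows = [{"x1": a, "x2": b} for a in grid for b in grid]

importance = effect_importance(rows)
ranked = sorted(importance, key=importance.get, reverse=True)
for name in ranked:
    print(f"{name}: {importance[name]:.3f}")
```

Plotting each entry of `EFFECTS` against its one or two input features would give the toy equivalent of the 1D and 2D effect plots.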

In [None]:
exp.model_interpret()


## Local Model Interpretation for Functional ANOVA Models

Choose [EBM](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/ebm.html), [XGB2](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/xgb2.html), or [GAMI-Net](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/gaminet.html).

- Switch to the "Local-Interpretability" tab.

- Try the following options to view the different aspects of the model:

    - **Sample Index**: choose the sample index (in the training set) to be explained.

    - **Shown in original scale**: enable the check box to display the features in their original scale instead of min-max scaled to the range 0 to 1.

- The displayed results include:

    - **Local Feature Importance**: displays the top-10 features' contribution.

    - **Local Effect Importance**: displays the contributions of the top-10 effects. The effects include main effects (one per feature) and pairwise interactions (one per selected two-feature interaction).
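For a single sample, the two views listed above can be sketched as follows: each effect contributes its value at that sample, and the prediction is the intercept plus the sum of those contributions. The model below is hypothetical, and splitting each pairwise term equally between its two features to obtain feature-level contributions is an assumed convention, not necessarily the one PiML uses.

```python
# Hypothetical additive model: intercept + two main effects + one pairwise interaction.
INTERCEPT = 10.0
EFFECTS = {
    ("x1",): lambda s: 2.0 * (s["x1"] - 0.5),
    ("x2",): lambda s: 1.0 * (s["x2"] - 0.5),
    ("x1", "x2"): lambda s: 4.0 * (s["x1"] - 0.5) * (s["x2"] - 0.5),
}

def local_effect_contributions(sample):
    """Value of every effect evaluated at this one sample."""
    return {features: effect(sample) for features, effect in EFFECTS.items()}

def local_feature_contributions(sample):
    """Split each effect equally among its features (assumed convention)."""
    totals = {}
    for features, value in local_effect_contributions(sample).items():
        for name in features:
            totals[name] = totals.get(name, 0.0) + value / len(features)
    return totals

sample = {"x1": 1.0, "x2": 0.0}
by_effect = local_effect_contributions(sample)
by_feature = local_feature_contributions(sample)
prediction = INTERCEPT + sum(by_effect.values())

print(by_effect)
print(by_feature)
print(prediction)
```

Both views add up to the same quantity, the prediction minus the intercept, which is what makes this kind of local explanation exact for additive models.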

In [None]:
exp.model_interpret()


## Global Interpretation for FIGS

Choose [FIGS](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/figs.html).

- Switch to the "Global-Interpretability" tab.

- Try the following options to view the different aspects of the model:

    - **Tree ID #**: switch the individual trees to display.

    - **Tree Depth**: specify the maximum depth to display.

    - **Node #**: specify the node id as the starting node to display.

    - **Shown in original scale**: enable the check box to display the features in their original scale instead of min-max scaled to the range 0 to 1.

- The displayed results include:

    - **Tree Diagram**: a subset of the tree diagram. You may adjust the parameters to view different subsets of the trees.

    - **Feature Importance Heatmap**: see the details [here](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/figs.html#feature-importance-heatmap).
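The "Shown in original scale" option amounts to undoing a min-max transformation. As a minimal sketch, assuming a feature was scaled with its observed minimum and maximum, a value or split threshold expressed on the 0 to 1 scale can be mapped back to original units as shown below. The range and threshold are hypothetical.

```python
def minmax_scale(value, lo, hi):
    """Map a value from the original range [lo, hi] to [0, 1]."""
    return (value - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    """Map a value from [0, 1] back to the original range [lo, hi]."""
    return lo + scaled * (hi - lo)

# Hypothetical feature range and a split threshold learned on the scaled feature.
lo, hi = 0.0, 23.0
threshold_scaled = 0.5
threshold_original = minmax_inverse(threshold_scaled, lo, hi)

print(f"split at {threshold_scaled} (scaled) = {threshold_original} (original units)")
```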

In [None]:
exp.model_interpret()


## Local Interpretation for FIGS

Choose [FIGS](https://selfexplainml.github.io/PiML-Toolbox/_build/html/guides/models/figs.html).

- Switch to the "Local-Interpretability" tab.

- Try the following options to view the different aspects of the model:

    - **Sample Index**: specify the training sample index to interpret.

    - **Tree ID #**: switch the individual trees to display.

    - **Shown in original scale**: enable the check box to display the features in their original scale instead of min-max scaled to the range 0 to 1.

- The displayed results include:

    - The tree diagram with highlighted path for the sample being interpreted.
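As the name suggests, a FIGS model predicts with a sum of trees, and the highlighted path is the sequence of splits a sample follows in one of them. The sketch below illustrates both ideas on two small hand-written trees. The tree structures, thresholds, and leaf values are hypothetical and are not taken from a fitted model.

```python
# A node is either a leaf {"value": v} or a split
# {"feature": name, "threshold": t, "left": node, "right": node}.
TREES = [
    {"feature": "x1", "threshold": 0.5,
     "left": {"value": 1.0},
     "right": {"feature": "x2", "threshold": 0.25,
               "left": {"value": 2.0}, "right": {"value": 4.0}}},
    {"feature": "x2", "threshold": 0.75,
     "left": {"value": -0.5}, "right": {"value": 0.5}},
]

def trace(tree, sample):
    """Return (path, leaf_value), where path lists the splits taken by the sample."""
    path, node = [], tree
    while "value" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        op = "<=" if go_left else ">"
        path.append(f'{node["feature"]} {op} {node["threshold"]}')
        node = node["left"] if go_left else node["right"]
    return path, node["value"]

def predict(sample):
    """Sum the leaf values reached in every tree."""
    return sum(trace(tree, sample)[1] for tree in TREES)

sample = {"x1": 0.8, "x2": 0.5}
paths = [trace(tree, sample)[0] for tree in TREES]
prediction = predict(sample)

for tree_id, path in enumerate(paths):
    print(f"Tree {tree_id}: " + " -> ".join(path))
print("prediction:", prediction)
```

Switching the tree ID in the panel corresponds to picking a different entry of `paths` here.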

In [None]:
exp.model_interpret()
