# PiML Toolbox for Model Development and Validation: Low-code Demo

April 2022 by WF-CMoR-AToM-HK Dev Team

PiML (Python Interpretable Machine Learning) is a new Python toolbox for interpretable machine learning (IML) model development and validation. Through low-code automation and high-code programming, PiML supports machine learning models in the following two categories:

- **Inherently interpretable models**: 
  1. EBM: Explainable Boosting Machine (Nori, et al. 2019; Lou, et al. 2013)
  2. GAMI-Net: Generalized Additive Model with Structured Interaction Network (Yang, Zhang and Sudjianto, 2021)
  3. ReLU-DNN: Deep ReLU networks using Aletheia unwrapper (Sudjianto, et al. 2020)

- **Arbitrary black-box models**, e.g.
  1. LightGBM or XGBoost of varying depth
  2. RandomForest of varying depth
  3. DNNs with softmax/tanh activations

This example notebook demonstrates how to use PiML in its low-code mode to develop, interpret, and test the models listed above. The toolbox includes the following built-in datasets for demo purposes:

- **CoCircles** classification data: simulated by `sklearn.datasets.make_circles(n_samples=10000, noise=0.1)`
- **Friedman** regression data: simulated by `sklearn.datasets.make_friedman1(n_samples=10000, n_features=10, noise=0.1)`
- **BikeSharing** regression data:
- **CaliforniaHousing** regression data:
- **TaiwanCredit** classification data:
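
The two simulated datasets can be reproduced outside PiML with scikit-learn. The sketch below is a minimal illustration using the sample sizes and noise levels listed above; the `random_state` values are assumptions added for reproducibility, and PiML's built-in copies may have been generated with different seeds or extra preprocessing.

```python
from sklearn.datasets import make_circles, make_friedman1

# CoCircles: two concentric circles with binary labels (0 = outer, 1 = inner).
# random_state is an assumption, not taken from PiML.
X_circ, y_circ = make_circles(n_samples=10000, noise=0.1, random_state=0)

# Friedman: 10 features, of which only the first 5 enter the response.
X_fri, y_fri = make_friedman1(n_samples=10000, n_features=10, noise=0.1, random_state=0)

print(X_circ.shape, y_circ.shape)
print(X_fri.shape, y_fri.shape)
```
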

Other details to add ... 

# Stage 0: Install the PiML package on Google Colab

1. Visit [https://github.com/SelfExplainML/PiML-Toolbox/releases/](https://github.com/SelfExplainML/PiML-Toolbox/releases/) and copy the address of the latest PiML wheel file.
2. Run the following script to download and install PiML v1.0.0.
3. In Colab, you may need to restart the runtime before the newly installed PiML version can be used.

In [None]:
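# Install the wget helper, then download the PiML v1.0.0 wheel (built for CPython 3.7 on Linux x86_64)
!pip install wget
import wget
url = "https://github.com/SelfExplainML/PiML-Toolbox/releases/download/V1.0.0/PiML-1.0.0-cp37-cp37m-linux_x86_64.whl"
wget.download(url, 'PiML-1.0.0-cp37-cp37m-linux_x86_64.whl')
# Install PiML from the downloaded wheel
!pip install PiML-1.0.0-cp37-cp37m-linux_x86_64.whl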
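
The wheel filename carries compatibility tags: `cp37-cp37m-linux_x86_64` targets CPython 3.7 on 64-bit Linux, so `pip` will reject it on other interpreters. The following sketch is not part of PiML; it is one possible way to check the running interpreter against the filename before installing. The helper name `wheel_matches_runtime` is hypothetical, and the check only handles simple single tags like the one used here.

```python
import platform
import sys


def wheel_matches_runtime(wheel_name: str) -> bool:
    """Rough check of a wheel's Python and platform tags against this interpreter."""
    # Wheel filenames end with {python tag}-{abi tag}-{platform tag}.whl
    stem = wheel_name[: -len(".whl")]
    python_tag, _abi_tag, platform_tag = stem.split("-")[-3:]
    runtime_python = f"cp{sys.version_info.major}{sys.version_info.minor}"
    runtime_platform = f"{platform.system().lower()}_{platform.machine().lower()}"
    return python_tag == runtime_python and platform_tag == runtime_platform


wheel = "PiML-1.0.0-cp37-cp37m-linux_x86_64.whl"
print(wheel_matches_runtime(wheel))
```

If the check prints `False`, look on the releases page for a wheel whose tags match your runtime.
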

# Stage 1: Initialize an experiment, load and process data <a name="expdata"></a>

In [None]:
from piml import Experiment
exp = Experiment(platform="colab")

In [None]:
exp.data_loader()

In [None]:
exp.data_summary()

In [None]:
exp.data_prepare()

In [None]:
exp.eda()

# Stage 2. Train and tune interpretable models <a name="modeltrain"></a>



In [None]:
exp.model_train()

In [None]:
# HPO Function to be improved. 
exp.model_tune()

# Stage 3. Interpret and explain <a name="modelinterpret"></a>

In [None]:
exp.model_interpret()

In [None]:
exp.model_explain()

# Stage 4. Diagnose and compare

In [None]:
exp.model_diagnose()

In [None]:
exp.model_compare()

# Stage 5. Register an arbitrary model ... 

In [None]:
# train_x, train_y, test_x, test_y, Xnames, yname = exp.get_processed_data() 

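from lightgbm import LGBMRegressor
# Wrap the LightGBM regressor in a PiML pipeline
pipeline = exp.make_pipeline(LGBMRegressor())
pipeline.fit()  # train_x, train_y
# Register the fitted pipeline with the experiment under the name 'LGBM'
exp.register(pipeline=pipeline, name='LGBM')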
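
For readers who want to see what fitting a black-box estimator amounts to without the PiML wrapper, here is a minimal stand-alone sketch. It assumes, as the commented-out `exp.get_processed_data()` line suggests, that the experiment holds separate train and test arrays. Here those arrays are simulated with `make_friedman1` and split with scikit-learn, and a `RandomForestRegressor` stands in for LightGBM so the example needs no extra dependency. None of these choices come from PiML itself.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Simulated stand-in for the experiment's processed data (assumption)
X, y = make_friedman1(n_samples=2000, n_features=10, noise=0.1, random_state=0)
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit a black-box regressor of limited depth on the training split
model = RandomForestRegressor(n_estimators=50, max_depth=5, random_state=0)
model.fit(train_x, train_y)

# Compare train and test error, the kind of gap a diagnostics step would examine
train_mse = mean_squared_error(train_y, model.predict(train_x))
test_mse = mean_squared_error(test_y, model.predict(test_x))
print(f"train MSE: {train_mse:.3f}, test MSE: {test_mse:.3f}")
```
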

In [None]:
exp.model_explain()

In [None]:
exp.model_diagnose()

In [None]:
exp.model_compare()