## Chapter 3

### Inherently Interpretable Models - Glassbox Models
#### Here we look at some inherently interpretable models

##### In this example of an Explainable Boosting Classifier model, we look at the interpretability features built into the model

**Supervised Learning - Explainable Boosting Classifier**


This is a classification problem: we use an Explainable Boosting Classifier (EBM) to predict whether a client will default on next month's payment. Explainability is built into this model, hence the name.<br>
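To make "explainability is built in" concrete: an EBM is an additive model, so a prediction is an intercept plus one score per feature, passed through a logistic link. The sketch below illustrates that idea with lookup tables in plain Python. The bin edges, scores, and intercept are invented for illustration and are not learned from the credit-card data; the real model learns its shape functions by boosting and can also include pairwise interaction terms.

```python
import math
from bisect import bisect_right

# Hypothetical shape functions: bin edges plus one log-odds score per bin.
# All numbers are made up for illustration, not taken from a trained EBM.
SHAPE_FUNCTIONS = {
    "LIMIT_BAL": {"edges": [50_000, 200_000], "scores": [0.40, 0.00, -0.30]},
    "PAY_0": {"edges": [1, 2], "scores": [-0.25, 0.50, 1.20]},
    "AGE": {"edges": [30, 50], "scores": [0.05, -0.05, 0.10]},
}
INTERCEPT = -1.30  # hypothetical baseline log-odds


def term_contributions(row):
    """Look up the score of the bin that each feature value falls into."""
    contributions = {}
    for name, fn in SHAPE_FUNCTIONS.items():
        bin_index = bisect_right(fn["edges"], row[name])
        contributions[name] = fn["scores"][bin_index]
    return contributions


def predict_proba(row):
    """Add the intercept and per-feature scores, then apply the logistic link."""
    logit = INTERCEPT + sum(term_contributions(row).values())
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical client, invented for illustration
client = {"LIMIT_BAL": 20_000, "PAY_0": 2, "AGE": 45}
print(term_contributions(client))
print(round(predict_proba(client), 3))
```

Because each feature contributes through its own function, the contribution of a single feature can be read off directly, which is what the explanation views later in this chapter visualise.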


We are using the credit-card dataset from Hugging Face. More details on the dataset can be found here: <br>
 https://huggingface.co/datasets/imodels/credit-card




There are 25 variables:

| Variable                        | Description                                                                                       |
|---------------------------------|---------------------------------------------------------------------------------------------------|
| ID                              | ID of each client                                                                                 |
| LIMIT_BAL                       | Amount of given credit in NT dollars (includes individual and family/supplementary credit)        |
| SEX                             | Gender (1=male, 2=female)                                                                         |
| EDUCATION                       | Education level (1=graduate school, 2=university, 3=high school, 4=others, 5=unknown, 6=unknown)  |
| MARRIAGE                        | Marital status (1=married, 2=single, 3=others)                                                    |
| AGE                             | Age in years                                                                                      |
| PAY_0                           | Repayment status in September, 2005 (-1=pay duly, 1=payment delay for one month, 2=payment delay for two months, … 8=payment delay for eight months, 9=payment delay for nine months and above) |
| PAY_2                           | Repayment status in August, 2005 (scale same as above)                                            |
| PAY_3                           | Repayment status in July, 2005 (scale same as above)                                              |
| PAY_4                           | Repayment status in June, 2005 (scale same as above)                                              |
| PAY_5                           | Repayment status in May, 2005 (scale same as above)                                               |
| PAY_6                           | Repayment status in April, 2005 (scale same as above)                                             |
| BILL_AMT1                       | Amount of bill statement in September, 2005 (NT dollar)                                           |
| BILL_AMT2                       | Amount of bill statement in August, 2005 (NT dollar)                                              |
| BILL_AMT3                       | Amount of bill statement in July, 2005 (NT dollar)                                                |
| BILL_AMT4                       | Amount of bill statement in June, 2005 (NT dollar)                                                |
| BILL_AMT5                       | Amount of bill statement in May, 2005 (NT dollar)                                                 |
| BILL_AMT6                       | Amount of bill statement in April, 2005 (NT dollar)                                               |
| PAY_AMT1                        | Amount of previous payment in September, 2005 (NT dollar)                                         |
| PAY_AMT2                        | Amount of previous payment in August, 2005 (NT dollar)                                            |
| PAY_AMT3                        | Amount of previous payment in July, 2005 (NT dollar)                                              |
| PAY_AMT4                        | Amount of previous payment in June, 2005 (NT dollar)                                              |
| PAY_AMT5                        | Amount of previous payment in May, 2005 (NT dollar)                                               |
| PAY_AMT6                        | Amount of previous payment in April, 2005 (NT dollar)                                             |
| default.payment.next.month      | Default payment (1=yes, 0=no)                                                                     |

The variable `default.payment.next.month` is the target (label).
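The coded categorical columns are easier to read once the integers are mapped back to their labels. The sketch below does this with a small codebook copied from the table above. It assumes the raw integer codes listed there; the copy loaded from Hugging Face may name or encode these columns differently, and the `decode_record` helper and the example record are hypothetical.

```python
# Code-to-label mappings copied from the variable table above.
CODEBOOK = {
    "SEX": {1: "male", 2: "female"},
    "EDUCATION": {
        1: "graduate school",
        2: "university",
        3: "high school",
        4: "others",
        5: "unknown",
        6: "unknown",
    },
    "MARRIAGE": {1: "married", 2: "single", 3: "others"},
    "default.payment.next.month": {1: "yes", 0: "no"},
}


def decode_record(record):
    """Return a copy of the record with coded values replaced by labels.

    Codes that do not appear in the table are left unchanged.
    """
    decoded = dict(record)
    for column, mapping in CODEBOOK.items():
        if column in decoded:
            decoded[column] = mapping.get(decoded[column], decoded[column])
    return decoded


# Hypothetical client record, invented for illustration
example = {
    "SEX": 2,
    "EDUCATION": 1,
    "MARRIAGE": 0,  # not documented in the table, so it stays as-is
    "AGE": 34,
    "default.payment.next.month": 0,
}
print(decode_record(example))
```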

**Install the required libraries**

In [1]:

!pip install interpret



In [2]:
!pip install datasets



#### EBM Classifier from the InterpretML library

**Load the Hugging Face credit-card dataset, split it into training and test sets, train the EBM classifier, and finally use the model to predict on the test data**

In [7]:
# Import necessary libraries
from datasets import load_dataset
from sklearn.model_selection import train_test_split
from interpret.glassbox import ExplainableBoostingClassifier
import pandas as pd

# Load the credit-card dataset
# If loading fails, use "https://huggingface.co/datasets"
dataset = load_dataset('imodels/credit-card')
df = pd.DataFrame(dataset['train'])

# Split the dataset into features (X) and target (y)
X = df.drop(columns='default.payment.next.month')
y = df['default.payment.next.month'].values

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an EBM classifier
ebm = ExplainableBoostingClassifier(random_state=42)

# Train the classifier
ebm.fit(X_train, y_train)

# Predict class labels for the test set
y_pred = ebm.predict(X_test)





**The AUC and accuracy metrics for the predictions are computed and displayed**

In [8]:
from sklearn.metrics import roc_auc_score, accuracy_score

auc = roc_auc_score(y_test, ebm.predict_proba(X_test)[:, 1])
accuracy = accuracy_score(y_test, y_pred)
print("AUC: {:.3f}".format(auc))
print("Accuracy: {:.3f}".format(accuracy))

AUC: 0.779
Accuracy: 0.815
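To help read these two numbers, the sketch below computes AUC by hand as the fraction of (defaulter, non-defaulter) pairs that the model ranks correctly, and the accuracy of a baseline that always predicts the majority class. The labels and scores are invented for illustration and the helper names are hypothetical; in practice `roc_auc_score` does the AUC computation for us.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    ones = sum(y_true)
    return max(ones, len(y_true) - ones) / len(y_true)


# Hypothetical labels and predicted default probabilities
toy_labels = [0, 0, 0, 1, 0, 1, 0, 0]
toy_scores = [0.10, 0.30, 0.20, 0.80, 0.60, 0.40, 0.05, 0.15]

print(pairwise_auc(toy_labels, toy_scores))
print(majority_baseline_accuracy(toy_labels))
```

When defaults are rare, the majority baseline can already reach a high accuracy, so it is worth comparing the model's accuracy against `majority_baseline_accuracy(y_test)` and relying on AUC for ranking quality.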


**Using the EBM's built-in `explain_global` method to get global explanations**

In [9]:
from interpret import show

show(ebm.explain_global())
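The summary view of the global explanation ranks terms by overall importance. One common definition, and to our understanding the default in InterpretML, is the mean absolute contribution of each term across the training rows. The sketch below shows that calculation on made-up contributions; the `mean_abs_importance` helper and the numbers are hypothetical and do not come from the trained model.

```python
def mean_abs_importance(contributions):
    """Average the absolute contribution of each term over all rows."""
    n_rows = len(contributions)
    totals = {}
    for row in contributions:
        for term, value in row.items():
            totals[term] = totals.get(term, 0.0) + abs(value)
    return {term: total / n_rows for term, total in totals.items()}


# Hypothetical per-row term contributions (log-odds), one dict per client
toy_contributions = [
    {"PAY_0": 1.2, "LIMIT_BAL": 0.4, "AGE": -0.05},
    {"PAY_0": -0.3, "LIMIT_BAL": -0.2, "AGE": 0.05},
    {"PAY_0": -0.3, "LIMIT_BAL": 0.0, "AGE": 0.05},
]

importances = mean_abs_importance(toy_contributions)
# Sort terms from most to least important
ranking = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for term, importance in ranking:
    print(f"{term}: {importance:.3f}")
```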

**Using the EBM's built-in `explain_local` method to get local explanations**

In [10]:
# Explain the first five test rows; the second argument (0) selects which row is displayed first
show(ebm.explain_local(X_test[:5], y_test[:5]), 0)

The graphs are interactive and can be used to derive the required insights.
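A local explanation for one client can be read as a ranked list: terms with a positive contribution push the prediction towards default (class 1), and terms with a negative contribution push it away. The sketch below ranks contributions by magnitude and labels their direction. The contributions and the `rank_local_contributions` helper are hypothetical and only illustrate how the bars in the local plot can be interpreted.

```python
def rank_local_contributions(contributions):
    """Sort terms by absolute contribution and label the direction of each."""
    ordered = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    ranked = []
    for term, value in ordered:
        if value > 0:
            direction = "towards default"
        elif value < 0:
            direction = "away from default"
        else:
            direction = "neutral"
        ranked.append((term, value, direction))
    return ranked


# Hypothetical log-odds contributions for a single client
toy_local = {"PAY_0": 1.2, "LIMIT_BAL": 0.4, "AGE": -0.05, "BILL_AMT1": -0.6}

for term, value, direction in rank_local_contributions(toy_local):
    print(f"{term:10s} {value:+.2f}  {direction}")
```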