<a href="https://colab.research.google.com/github/luferIPCA/MIA-MLA-24-25/blob/main/11_XAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#!Begin

# Master's in Applied Artificial Intelligence
## Machine Learning Algorithms Course

Notebooks for the MLA course

by [*lufer*](mailto:lufer@ipca.pt)

(ver 2.0)

---



# ML Modelling - Part IV - Explainable Artificial Intelligence Models

**Contents**:

1. **White-Box versus Black-Box Models**
2. **Case Study**



This notebook promotes Explainable Artificial Intelligence!

# Explainable Artificial Intelligence models

"...help extract insight and clarity regarding how these algorithms are performing and why one prediction is made over another..."


* White Box Models

  - possible to explain

* Black Box models
  - hard to explain
  - ex: Deep Learning models
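
As a minimal illustration of the distinction above, the sketch below fits a linear model, a typical white box, on synthetic data. The data and the names `X`, `y` and `weights` are illustrative assumptions and are not part of the course dataset. The point is only that the learned coefficients can be read directly as the explanation, which is not possible with a deep network.

```python
import numpy as np

# Synthetic data: y depends on two features through known weights (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + 1.0

# Fit ordinary least squares; the extra column of ones models the intercept
X_design = np.column_stack([X, np.ones(len(X))])
weights, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# The coefficients can be read directly: this is what makes the model a white box
for name, w in zip(["feature_0", "feature_1", "intercept"], weights):
    print(f"{name}: {w:+.2f}")
```

Each printed weight states how much the prediction changes when that feature increases by one unit.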

# Environment preparation


**Install necessary Libraries**

In [None]:
#Install libraries for Explainable AI
!pip install lime
!pip install interpret
#!pip install eli5
!pip install shap

**Import necessary libraries**

In [None]:
#Importing Libraries

#general
import pandas as pd
import numpy as np

#for AI model
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
import IPython  #visualization

#for XAI

#eli5 - model debugging - requires an older scikit-learn release
#import eli5
#from eli5 import show_prediction

#shap - SHapley Additive exPlanations (SHAP)
import shap

#lime - Local Interpretable Model-Agnostic Explanations
import lime.lime_tabular

#interpret
from interpret import set_visualize_provider
from interpret.provider import InlineProvider
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

In [None]:
import datetime
print(f"Last updated: {datetime.datetime.now()}")

**Mounting Drive**

In [None]:

from google.colab import drive

# it will ask for your Google Drive credentials
drive.mount('/content/gDrive/', force_remount=True)

## Get data

In [None]:
filePath="/content/gDrive/MyDrive/Colab Notebooks/MIA - ML - 2024-2025/Datasets/"
df = pd.read_csv(filePath+"Credit.csv")

In [None]:
#check dataset size
df.shape

In [None]:
df.head()

Get Features and Target

In [None]:
features = df.drop(columns=["class"])
target = df["class"]
# Reassign feature names explicitly (optional, but ensures consistency)
#features = pd.DataFrame(features, columns=features.columns)

#or
#features = df.iloc[:, :-1].values  #except last column
#target = df.iloc[:,-1].values      #last column
#NOTE: in this case, features and target come as NumPy arrays

In [None]:
#check features
features.dtypes

In [None]:
target

In [None]:
features.shape[1]

Convert all categorical values to numerical

In [None]:
#convert categorical features to numerical
labelencoder = LabelEncoder()
features = features.apply(lambda col: labelencoder.fit_transform(col) if col.dtype == 'object' else col)
features
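
A single `LabelEncoder` refitted on every column only remembers the mapping of the last column it saw, which makes it hard to translate encoded values in an explanation back to the original categories. The sketch below shows one possible alternative, keeping one encoder per column. The toy frame and the names `toy`, `encoders` and `encoded` are assumptions for illustration only.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny illustrative frame; the column names and values are made up
toy = pd.DataFrame({
    "housing": ["own", "rent", "own", "free"],
    "age": [35, 22, 47, 51],
})

encoders = {}          # one fitted LabelEncoder per categorical column
encoded = toy.copy()
for col in toy.columns:
    if not pd.api.types.is_numeric_dtype(toy[col]):
        encoders[col] = LabelEncoder()
        encoded[col] = encoders[col].fit_transform(toy[col])

# LabelEncoder assigns codes in sorted order of the labels
print(encoded["housing"].tolist())
# The stored encoder maps the codes back to the original labels
print(encoders["housing"].inverse_transform(encoded["housing"]).tolist())
```

With the encoders stored in a dictionary, a value shown by an explainer (for example `housing = 2`) can be decoded with `encoders["housing"].inverse_transform([2])`.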

In [None]:
#convert categorical target to numerical
#Use a different LabelEncoder for the target
target_encoder = LabelEncoder()
#target=target_encoder.fit_transform(target)
target = pd.Series(target_encoder.fit_transform(target), name="class")  # Restore column name

In [None]:
target

In [None]:
#split Train+Test: 70+30
Xtrain, Xtest, ytrain, ytest = train_test_split(features, target, test_size=0.3)
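
The split above is random, so every run produces different train and test sets and therefore slightly different explanations. The sketch below shows, on made-up labels, how the `random_state` and `stratify` parameters of `train_test_split` could be used to make the split repeatable and to keep the class ratio in both parts. The arrays `X_demo` and `y_demo` are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative imbalanced labels: 70 samples of class 0 and 30 of class 1
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.array([0] * 70 + [1] * 30)

# stratify keeps the class ratio in both parts; random_state makes the split repeatable
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, stratify=y_demo, random_state=42
)

print(len(y_tr), len(y_te))        # sizes of the two parts
print(y_tr.mean(), y_te.mean())    # share of class 1 in each part
```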

In [None]:
Xtrain

In [None]:
#create the model
#Xtrain = pd.DataFrame(Xtrain, columns=feature_names)  # Use actual feature names
m = RandomForestClassifier(n_estimators=1000)
m.fit(Xtrain, ytrain)

## Exploring XAI Tools

### Lime

LIME (Local Interpretable Model-agnostic Explanations)

Works mainly for local instances!
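
To make the word "local" concrete, the sketch below reproduces the core idea in a simplified form: perturb one instance, query the black box, weight the perturbed points by proximity, and fit a linear surrogate. It is not the `lime` library's implementation (which also discretizes features and selects a subset of them). The function `black_box`, the Gaussian perturbation and the kernel width are assumptions chosen for illustration.

```python
import numpy as np

def black_box(X):
    """Hypothetical non-linear model: returns a probability-like score."""
    return 1.0 / (1.0 + np.exp(-(X[:, 0] ** 2 - 2.0 * X[:, 1])))

def local_linear_explanation(predict, instance, n_samples=2000, scale=0.3,
                             kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise
    Z = instance + rng.normal(scale=scale, size=(n_samples, instance.size))
    # 2. Query the black box on the perturbed points
    preds = predict(Z)
    # 3. Weight each point by its proximity to the instance
    dist = np.linalg.norm(Z - instance, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Weighted least squares on centered inputs (plus an intercept column)
    A = np.column_stack([Z - instance, np.ones(n_samples)])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], preds * sw, rcond=None)
    return coef[:-1]  # one local weight per feature

x0 = np.array([1.0, 0.5])
weights = local_linear_explanation(black_box, x0)
print(dict(zip(["feature_0", "feature_1"], weights.round(3))))
```

The weights describe the model only near `x0`; a different instance would generally get different weights, which is why LIME explanations should not be read as global feature importance.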

In [None]:
Xtest[0:1]

In [None]:
ytest[:1]

In [None]:
#check all column names
list(df)

In [None]:
#check train columns names
Xtrain.columns
#same as
#list(df)[0:20]

In [None]:
#lime requires NumPy arrays
#Ensure Xtrain is converted to a NumPy array
Xtrain_np = Xtrain.values  # Convert DataFrame to NumPy array

#class_names must be a list with one name per class, in the column order of predict_proba
expl = lime.lime_tabular.LimeTabularExplainer(Xtrain_np, feature_names=list(Xtrain.columns), class_names=[str(c) for c in target_encoder.classes_])
#prever = lambda x: m.predict_proba(x).astype(float)  #"x" loses the column names
prever = lambda x: m.predict_proba(pd.DataFrame(x, columns=Xtrain.columns)).astype(float)

#Explain the first instance - remember that lime is local!
#Ensure the Xtest instance is converted to NumPy array
Xtest_np = Xtest[0:1].values
exp = expl.explain_instance(Xtest_np[0], prever, num_features=5)
exp.show_in_notebook(show_all=True)


### Dalex


**Dalex: moDel Agnostic Language for Exploration and eXplanation**

- Helps to analyze and interpret black-box models by providing tools to understand how features influence predictions.
- Is model-agnostic too
- Supports global and local explanations

See [Dalex Explanatory Model Analysis](https://medium.com/@ModelOriented/dalex-v-1-0-and-the-explanatory-model-analysis-419585a4ba91)



---


**The explanation analysis can be made on Xtrain or Xtest:**

**Analysing Xtest**

*exp = dx.Explainer(model, Xtest, ytest.astype(float))*

- Better for evaluation: shows how the model behaves on unseen data.
- Checks for overfitting: explaining test-set predictions shows whether the model generalizes well.
- Common practice: avoids biased conclusions.


**Analysing Xtrain**

*exp = dx.Explainer(model, Xtrain, ytrain.astype(float))*

- Useful for debugging model behavior: How the model learned patterns from training data.
- For feature engineering analysis: can reveal how much the model relies on new features.
- Might overfit explanations: explanations may reflect patterns the model memorized rather than general trends.
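
The difference between the two choices can be sketched without dalex. The example below computes a simplified permutation importance (the drop in accuracy after shuffling one column) on both the training and the test set, for a 1-nearest-neighbour classifier that memorizes its training data. Everything here is an illustrative assumption: the synthetic data, the helper names, and the use of accuracy as the score.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Two features: 'signal' determines the label, 'noise' is irrelevant."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] > 0).astype(int)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(200)

def predict_1nn(X):
    """1-nearest-neighbour classifier: memorizes the training set."""
    d = ((X[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d.argmin(axis=1)]

def accuracy(X, y):
    return float((predict_1nn(X) == y).mean())

def permutation_importance(X, y, n_repeats=10):
    """Mean drop in accuracy after shuffling each column in turn."""
    base = accuracy(X, y)
    drops = []
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores.append(accuracy(Xp, y))
        drops.append(base - np.mean(scores))
    return base, np.array(drops)

for name, (X, y) in {"train": (X_train, y_train), "test": (X_test, y_test)}.items():
    base, drops = permutation_importance(X, y)
    print(f"{name}: accuracy={base:.2f}, importance(signal, noise)={drops.round(3)}")
```

On the training set the memorizing model scores perfectly, so shuffling even the irrelevant `noise` column costs accuracy and the column looks important. Comparing that value with the one printed for the test set shows how much of an explanation reflects memorization rather than a general trend.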

---



In [None]:
!pip install dalex

In [None]:
#create the dalex explainer
import dalex as dx

explainer = dx.Explainer(m, Xtrain, ytrain)

In [None]:
#Predicting with the model
# Predict on the test data
predictions = explainer.predict(Xtest)

# check predictions
print(predictions[:10])  # Show first 10 predictions

Global Explanation

*exp.model_parts()*

In [None]:
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)

    # Compute feature importance (global explanation) while the filter is active
    global_explanation = explainer.model_parts()

# Plot the global explanation
global_explanation.plot()

Note:

To avoid the warning "RandomForestClassifier was fitted without feature names"
prepare the model differently:

```
# import pandas as pd
feature_names = features.columns  # names from the original DataFrame
Xtrain = pd.DataFrame(Xtrain, columns=feature_names)  # Use actual feature names
m.fit(Xtrain, ytrain)
```

Instance Explanation

*exp.predict_parts()*

In [None]:
# Select an instance to explain (e.g., the first row in Xtrain)
instance = Xtrain.iloc[0]
#or in the Xtest part
#instance = Xtest.iloc[0]

# Compute local explanation using Shapley values
localExplanation = explainer.predict_parts(instance, type='shap')

#try with different explanation types:
#type='break_down_interactions'
#type='break_down'

# Plot the local explanation
localExplanation.plot()
#try
#localExplanation.result

In [None]:
#predict_profile()
#Analyze how predictions change when varying a single feature while keeping others constant

#explainer = dx.Explainer(m, Xtrain, ytrain)
# Build the explainer on the test set, converting the target variable to float
explainer = dx.Explainer(m, Xtest, ytest.astype(float))

# Generate a profile for "credit_history"
profile = explainer.predict_profile(Xtest, variables="credit_history")

# Plot the profile
profile.plot()
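
The idea of varying a single feature while keeping the others constant can be sketched in a few lines. The code below is a simplified stand-in written for illustration, not dalex's implementation. The `model` function, the instance and the grid of values are assumptions.

```python
import numpy as np

def model(X):
    """Hypothetical scoring function standing in for a fitted model."""
    return 0.5 * X[:, 0] + X[:, 1] ** 2

def ceteris_paribus(predict, instance, feature_index, grid):
    """Predictions for copies of `instance` where only one feature changes."""
    X = np.tile(instance, (len(grid), 1))   # repeat the instance once per grid value
    X[:, feature_index] = grid              # overwrite only the chosen feature
    return predict(X)

instance = np.array([2.0, 1.0])
grid = np.array([-1.0, 0.0, 1.0, 2.0])
profile = ceteris_paribus(model, instance, feature_index=1, grid=grid)
for value, pred in zip(grid, profile):
    print(f"feature_1={value:+.1f} -> prediction={pred:.2f}")
```

Plotting `profile` against `grid` gives one curve per instance. That curve is the kind of picture a ceteris-paribus profile plot draws.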

More:

*exp.predict_diagnostic()*

Checking model stability; understanding prediction distributions; identifying potential biases

*exp1.predict_diagnostic(exp2)*

Compare diagnostics between two models



### Shap

SHAP (SHapley Additive exPlanations)

- Supports global and local explanations


See [How to interpret and explain your machine learning models using SHAP values](https://m.mage.ai/how-to-interpret-and-explain-your-machine-learning-models-using-shap-values-471c2635b78e)
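
To show what a Shapley value is, the sketch below computes exact values by brute force for a tiny hand-written model. This is not how the `shap` library works internally (it relies on much faster, model-specific or approximate algorithms). The `model` function, the instance `x` and the all-zero `baseline` are assumptions for illustration.

```python
from itertools import combinations
from math import factorial

def model(x):
    """Hypothetical model with an interaction between features 0 and 1."""
    return 3.0 * x[0] + 2.0 * x[1] + x[0] * x[1] - x[2]

def shapley_values(f, x, baseline):
    """Exact Shapley values; features outside a coalition take the baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Weight of a coalition of this size in the Shapley average
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (value(set(subset) | {i}) - value(set(subset)))
        phi.append(total)
    return phi

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
print(phi)
# Additivity: the contributions sum to f(x) - f(baseline)
print(sum(phi), model(x) - model(baseline))
```

The interaction term `x[0] * x[1]` is split equally between the two features involved, and the contributions always add up to the difference between the prediction and the baseline prediction. The cost grows exponentially with the number of features, so this brute-force form is only practical for toy cases.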

In [None]:
Xtrain.head()

In [None]:
Xtrain.columns

In [None]:
#create the explainer

Xtrain2 = Xtrain.astype(float)  #preserve the original Xtrain, converting all values to float. SHAP requires it!
explainer = shap.Explainer(m, Xtrain2)

shap_values = explainer.shap_values(Xtest)    #per-class SHAP values: a list of 2D arrays or one 3D array, depending on the shap version


In [None]:
#shap_values


In [None]:
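#Get explanations for each class
#column names
columns = Xtrain.columns
shap.initjs()  #load the JavaScript needed by force_plot before plotting

shap.summary_plot(shap_values, Xtest, feature_names=columns, plot_type='bar')
#shap.summary_plot(shap_values, Xtest)

#SHAP values for class "1": older shap versions return a list (one array per class), newer ones a 3D array
shap_class1 = shap_values[1] if isinstance(shap_values, list) else shap_values[:, :, 1]
shap.force_plot(explainer.expected_value[1], shap_class1)  #explain class "1"
#check for class "0"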


Note:

- The bar chart shows the most relevant features.
- In the dot (beeswarm) version of the summary plot, drawn for a single class, red means a higher value of a feature and blue means a lower value.
- The distribution of the red and blue dots gives a general sense of the direction of each feature's impact.

### Interpret

In [None]:
set_visualize_provider(InlineProvider())
ebm = ExplainableBoostingClassifier(feature_names=Xtrain.columns)
ebm.fit(Xtrain, ytrain)
globalExplanation = ebm.explain_global()
show(globalExplanation)

# References

*  [Explainable AI - Understanding and Trusting Machine Learning Models](https://www.datacamp.com/tutorial/explainable-ai-understanding-and-trusting-machine-learning-models)  
*  [Why is explainability important?](https://xai-tutorials.readthedocs.io/en/latest/_xai/importance.html)
*  [SHapley Additive exPlanations (SHAP)](https://xai-tutorials.readthedocs.io/en/latest/_model_agnostic_xai/shap.html)
*  [Local Interpretable Model-Agnostic Explanations (LIME)](https://xai-tutorials.readthedocs.io/en/latest/_model_agnostic_xai/lime.html)
*  [Eli5 (Explain it like I am 5) Model Explainability in Python](https://medium.com/chat-gpt-now-writes-all-my-articles/eli5-explain-it-like-i-am-5-model-explainability-in-python-d4922f021037)
*  [ELI5’s documentation!](https://eli5.readthedocs.io/en/latest/overview.html)
*  [Techniques for Interpreting and Explaining ML Models](https://www.markovml.com/blog/model-interpreting)

Applied

*  [Explanatory Model Analysis (Section 3.2.2)](https://ema.drwhy.ai/)