# **Lab 3a - Explainable and Trustworthy AI**


---



**Teaching Assistant**: Eleonora Poeta (eleonora.poeta@polito.it)

**Lab 3a:** Local post-hoc explainable models on structured data - LIME

# **LIME**


---

LIME is a **local surrogate model**. It tests **what happens to the predictions** when you **give variations of your data** into the machine learning model.

The main steps are:

* LIME generates **a new dataset** consisting of **perturbed samples** and the corresponding **predictions** of the black box model.

* On the new dataset → LIME  trains an **interpretable model** (weighted by the proximity of the sampled instances to the instance of interest).

* The learned model should be a **good approximation** of the **machine learning model** predictions **locally**, but it does not have to be a good global approximation.



---
## **Exercise 1:**

The [**Titanic**](https://www.openml.org/search?type=data&sort=runs&id=40945&status=active) dataset describes the survival status of individual passengers on the Titanic. In this exercise you have to:

* **Preprocess** the Titanic dataset. Please, follow these main steps:
> * **Load** the dataset
  * **Split** the dataset into training and test set using the **80/20** ratio. **Shuffle** the dataset and **stratify** it using the target variable.
  * Fill **null** values. `age` column with the mean, `fare` with the median and `embarked` with the most frequent values.
  * **Remove** columns that are *not informative for the final task*, or that *contain information about target variable*.
  * **Encoding**: in this exercise, the encoding of the dataset ***will be different from previous exercises of the past labs.***
    * Follow the **step-by-step procedure** that is written in the Exercise.


* Fit the **RandomForestClassifier()** with `n_estimators=500`
  * Calculate the predictions with `.predict()`
  * Calculate the `accuracy_score()`




## **Solution:**

####Imports

In [436]:
# Import the required libraries for this exercise
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, MinMaxScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier

#### Data Preprocessing - Until Encoding part

Load the dataset

In [437]:
# Load input features and target variable
dataset = pd.read_csv('./titanic.csv')

Split the dataset - 80/20 train/test ratio.


In [438]:
# Split the dataset. 80% for training data and 20% for test data. Shuffle the dataset and perform stratification by label
df_train, df_test = train_test_split(dataset, train_size = 0.8, test_size = 0.2, shuffle = True, random_state = 42, stratify = dataset["survived"])

Fill Null Values - `age`column

In [439]:
df_train['age'].fillna(df_train['age'].mean(), inplace = True)
df_test['age'].fillna(df_test['age'].mean(), inplace = True)

Fill Null Values - `fare`column

In [440]:
df_train['fare'].fillna(df_train['fare'].median(), inplace = True)
df_test['fare'].fillna(df_test['fare'].median(), inplace = True)

Fill Null Values - `embarked`column

In [441]:
df_train['embarked'].fillna(df_train['embarked'].mode()[0], inplace = True)
df_test['embarked'].fillna(df_test['embarked'].mode()[0], inplace = True)

Drop useless columns - `name`, `ticket`

In [442]:
df_train.drop(columns = ['name', 'ticket'], inplace = True)
df_test.drop(columns = ['name', 'ticket'], inplace = True)

Drop columns that contains info of the target classe (survived) - `cabin` , `body` , `boat` , `home.dest`.

In [443]:
df_train.drop(columns = ['cabin', 'body', 'boat', 'home.dest'], inplace = True)
df_test.drop(columns = ['cabin', 'body', 'boat', 'home.dest'], inplace = True)

Extract target variable and input features for the training and test data

In [444]:
Y_train = df_train['survived']
X_train = df_train.drop('survived', axis = 1)
Y_test = df_test['survived']
X_test = df_test.drop('survived', axis = 1)

In [445]:
X_train

Unnamed: 0,pclass,sex,age,sibsp,parch,fare,embarked
999,3,female,29.604316,0,0,7.7500,Q
392,2,female,24.000000,1,0,27.7208,C
628,3,female,11.000000,4,2,31.2750,S
1165,3,male,25.000000,0,0,7.2250,C
604,3,female,16.000000,0,0,7.6500,S
...,...,...,...,...,...,...,...
1290,3,female,47.000000,1,0,7.0000,S
1103,3,male,2.000000,4,1,39.6875,S
755,3,male,17.000000,2,0,8.0500,S
530,2,male,19.000000,0,0,10.5000,S


### Encoding

Our **LIME explainer** (and most classifiers) takes in **numerical data**, **even if the features are categorical**.


* We thus **transform** **all of the string attributes into integers**, using sklearn's **LabelEncoder**.
* We *use a dictionary to save the correspondence between the integer values and the original strings* so we can present this later in the explanations.

1. **Identify** the **categorical columns** in the dataset and save them into a list.
  * They are the same for training and test data.
  * In this case, both `category` and `object` dtype represent categorical columns.

In [446]:
from os import X_OK
# Display .info() for training and test datasets
X_train.info()
X_test.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1047 entries, 999 to 668
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   pclass    1047 non-null   int64  
 1   sex       1047 non-null   object 
 2   age       1047 non-null   float64
 3   sibsp     1047 non-null   int64  
 4   parch     1047 non-null   int64  
 5   fare      1047 non-null   float64
 6   embarked  1047 non-null   object 
dtypes: float64(2), int64(3), object(2)
memory usage: 65.4+ KB
<class 'pandas.core.frame.DataFrame'>
Index: 262 entries, 1028 to 203
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   pclass    262 non-null    int64  
 1   sex       262 non-null    object 
 2   age       262 non-null    float64
 3   sibsp     262 non-null    int64  
 4   parch     262 non-null    int64  
 5   fare      262 non-null    float64
 6   embarked  262 non-null    object 
dtypes: float64(2), int64(3), o

In [447]:
# Identify categorical columns in train dataset --- they are the same for test dataset!!
# You have to indicate the index of the categorical columns

categorical_cols = ['sex', 'embarked']



2. Create a dictionary of categorical_names. `categorical_names = {}`

3. Create a dictionary of the LabelEncoders. `le_dict = {}`

4. For each categorical feature, you have to:
  * Instanciate the **LabelEncoder()** from sklearn.  `le = LabelEncoder()`
  *   Fit the **LabelEncoder()** over the categorical feature of interest.
   *  Transform the the categorical feature of interest.
  *   Keep trace of the transformation done as follows: `categorical_names[feature] = le.classes_`
  * Save the label encoder in the dictionary above as follows: `le_dict[feature] = le`
> Do this procedure **only for the train set**. Then, **for the test set**, you will **apply** only `.transform()`
.Rember to use the right label encoder for the right categorical feature that you just saved in the `le_dict`.







In [448]:
categorical_names = {}
le_dict = {}
for feature in categorical_cols:
  le = LabelEncoder()
  le.fit(X_train[feature])
  le.transform(X_train[feature])
  categorical_names[feature] = le.classes_
  le_dict[feature] = le

In [449]:
categorical_names

{'sex': array(['female', 'male'], dtype=object),
 'embarked': array(['C', 'Q', 'S'], dtype=object)}

In [450]:
categorical_names_test = {}
for feature in categorical_cols:
  le_dict[feature].transform(X_test[feature])
  categorical_names_test[feature] = categorical_names[feature]

categorical_names_test

{'sex': array(['female', 'male'], dtype=object),
 'embarked': array(['C', 'Q', 'S'], dtype=object)}

Now, **use a One-hot encoder**, so that our **classifier does not take the categorical features as continuous features.**


---




> ***We will use this encoder only for the classifier***, *not for the explainer* - and the reason is that the **explainer must make sure that a categorical feature only has one value.**



In [451]:
# Identify the numerical columns - you must save  in a list the index of the column!

numerical_columns = ['pclass', 'sibsp', 'parch', 'fare']

1.   Instanciate the **OneHotEncoder()** to encode the categorical variables.
2. Apply the **MinMaxScaler()** to the numerical features.
3. Use the **ColumnTransformer()**.



In [452]:
# Initialize OneHotEncoder
ohe = OneHotEncoder(handle_unknown = 'ignore')

# Initialize MinMaxScaler
minmax_s = MinMaxScaler()

# Create ColumnTransformer
preprocessing = ColumnTransformer(
    [
        ("cat", ohe, categorical_cols),
        ("num", minmax_s, numerical_columns),
    ],
    verbose_feature_names_out=False,
)

# Apply ColumnTransformer to your train data
X_train = pd.DataFrame(preprocessing.fit_transform(X_train), columns = preprocessing.get_feature_names_out())
# Apply ColumnTransformer to your test data
X_test = pd.DataFrame(preprocessing.fit_transform(X_test), columns = preprocessing.get_feature_names_out())

In [453]:
x_train

Unnamed: 0,sex_female,sex_male,embarked_C,embarked_Q,embarked_S,pclass,sibsp,parch,fare
0,1.0,0.0,0.0,1.0,0.0,1.0,0.000,0.000000,0.015127
1,1.0,0.0,1.0,0.0,0.0,0.5,0.125,0.000000,0.054107
2,1.0,0.0,0.0,0.0,1.0,1.0,0.500,0.222222,0.061045
3,0.0,1.0,1.0,0.0,0.0,1.0,0.000,0.000000,0.014102
4,1.0,0.0,0.0,0.0,1.0,1.0,0.000,0.000000,0.014932
...,...,...,...,...,...,...,...,...,...
1042,1.0,0.0,0.0,0.0,1.0,1.0,0.125,0.000000,0.013663
1043,0.0,1.0,0.0,0.0,1.0,1.0,0.500,0.111111,0.077465
1044,0.0,1.0,0.0,0.0,1.0,1.0,0.250,0.000000,0.015713
1045,0.0,1.0,0.0,0.0,1.0,0.5,0.000,0.000000,0.020495


####Fit the RandomForestClassifier with `n_estimators=500`

In [454]:
clf = RandomForestClassifier(random_state = 42, n_estimators = 500)
clf.fit(x_train, Y_train)

Calculate the y_pred with the `.predict() ` function from sklearn

In [455]:
clf.predict(X_test)

array([1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1,
       0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1,
       1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
       0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,
       1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1,
       0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,
       1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
       0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1])

Calculate the Accuracy Score

In [456]:
clf.score(X_test, Y_test)

0.7862595419847328



---

## **Exercise 1b:**

Let's now explain the predictions obtained in the Exercise 1a using **LIME**. Before starting the exercise you have to:

* Install the lime library running the following command in a cell `!pip install lime`
* Import the module for tabular data as:
`from lime import lime_tabular`


Then, the goal of this exercise is to explain an individual prediction of interest. To get you started in understanding how the library works, this exercise will be mostly guided. You have to:

* Fix the random seed.
* Instanciate the explainer as:  `explainer = lime_tabular.LimeTabularExplainer`.
  * Read the [documentation](https://lime-ml.readthedocs.io/en/latest/lime.html#module-lime.lime_tabular) and try to understand the role of each parameter.
  * In this case, the prediction function `pred_fn` has to be custom. *Follow the guide in the notebook.*
  * Now, try to explain the `instance i=0 ` with `explainer.explain_instance`.  *What can you infer? What is the predicted class for that instance?*

In [457]:
!pip install lime
from lime import lime_tabular



### **Explaining predictions**

Fix the random seed with `np.random.seed(42)`

In [458]:
np.random.seed(42)

In [459]:
X_train

Unnamed: 0,sex_female,sex_male,embarked_C,embarked_Q,embarked_S,pclass,sibsp,parch,fare
0,1.0,0.0,0.0,1.0,0.0,1.0,0.000,0.000000,0.015127
1,1.0,0.0,1.0,0.0,0.0,0.5,0.125,0.000000,0.054107
2,1.0,0.0,0.0,0.0,1.0,1.0,0.500,0.222222,0.061045
3,0.0,1.0,1.0,0.0,0.0,1.0,0.000,0.000000,0.014102
4,1.0,0.0,0.0,0.0,1.0,1.0,0.000,0.000000,0.014932
...,...,...,...,...,...,...,...,...,...
1042,1.0,0.0,0.0,0.0,1.0,1.0,0.125,0.000000,0.013663
1043,0.0,1.0,0.0,0.0,1.0,1.0,0.500,0.111111,0.077465
1044,0.0,1.0,0.0,0.0,1.0,1.0,0.250,0.000000,0.015713
1045,0.0,1.0,0.0,0.0,1.0,0.5,0.000,0.000000,0.020495


Instanciate the LimeTabularExplainer

In [461]:
explainer = lime_tabular.LimeTabularExplainer(X_train, # here the function requires a numpy array
                                              mode = 'classification',
                                              class_names=['not survived' , 'survived'],
                                              feature_names = X_train.columns,
                                              categorical_features=categorical_cols,
                                              categorical_names=categorical_names,
                                              kernel_width=3,
                                              verbose=True)


InvalidIndexError: (slice(None, None, None), 0)

In [None]:
def predict_fn(x):
  temporary_df = pd.DataFrame(x, columns=X_train.columns, dtype='object') # to apply the ColumnTransformer you have to have a dataframe
  print(temporary_df.head(2))
  transf = ct.transform(temporary_df)
  pred = rf.predict_proba(transf).astype(float)
  return pred

In [None]:
i = 1
exp = explainer.explain_instance(X_test.values[i],
                                 predict_fn,
                                 num_samples=3)
exp.show_in_notebook()



---

## **Exercise 1.c**

**It's time to play with LIME!** 😀


The purpose of this exercise is to make you familiar with the LIME library and make you understand the main features.

* Instanciate **a new LimeTabularExplainer**
* Use the **same predict_fn** as before
* `explain_instance` for the instance `i=1`.
  * **Run** this for **5 times** and **pay attention **to the part about what features and to what extent they contributed to that prediction (explanation).
  * *Did you always obtain the same explanation?* If no, *what is the missing step? *

* Let's now change the parameter num_samples to `num_samples=15`.
  * Can you guess what is the role of this parameter?
* The parameter `num_features` indicates the  maximum number of features present in explanation.
  * Try to vary this number between 1 and 6. Where can you see a change?
* Change the distance parameter to ` distance_metric='l2'`.
  * Where is the distance used?

In [None]:
### Write your code here!