# **Lab 3a - Explainable and Trustworthy AI**


---



**Teaching Assistant**: Eleonora Poeta (eleonora.poeta@polito.it)

**Lab 3a:** Local post-hoc explainable models on structured data - LIME

# **LIME**


---

LIME is a **local surrogate model**. It tests **what happens to the predictions** when you **feed variations of your data** into the machine learning model.

The main steps are:

* LIME generates **a new dataset** consisting of **perturbed samples** and the corresponding **predictions** of the black box model.

* On the new dataset → LIME trains an **interpretable model** (weighted by the proximity of the sampled instances to the instance of interest).

* The learned model should be a **good approximation** of the **machine learning model** predictions **locally**, but it does not have to be a good global approximation.
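
The three steps above can be sketched with plain NumPy. This is only a minimal illustration, not how the `lime` library is implemented: `black_box`, `lime_sketch`, the Gaussian perturbations, and the kernel width are all assumptions made for this example.

```python
import numpy as np

def black_box(X):
    """Hypothetical black-box model: a probability from a nonlinear score."""
    score = 3.0 * X[:, 0] - 2.0 * X[:, 1] ** 2
    return 1.0 / (1.0 + np.exp(-score))

def lime_sketch(instance, predict, num_samples=2000, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample points around the instance of interest.
    samples = instance + rng.normal(scale=1.0, size=(num_samples, instance.size))
    # 2. Label the perturbed samples with the black-box predictions.
    preds = predict(samples)
    # 3. Weight each sample by its proximity to the instance (exponential kernel).
    dist = np.linalg.norm(samples - instance, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear model: this is the interpretable surrogate.
    design = np.column_stack([np.ones(num_samples), samples - instance])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * sw, preds * sw[:, 0], rcond=None)
    return coef[0], coef[1:]  # surrogate value at the instance, feature weights

instance = np.array([0.5, 1.0])
intercept, local_weights = lime_sketch(instance, black_box)
print("black-box prediction: ", black_box(instance[None, :])[0])
print("surrogate at instance:", intercept)
print("local feature weights:", local_weights)
```

The signs of `local_weights` tell you in which direction each feature pushes the prediction near `instance`; far from it, the linear surrogate is not expected to hold.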



---
## **Exercise 1a:**

The [**Titanic**](https://www.openml.org/search?type=data&sort=runs&id=40945&status=active) dataset describes the survival status of individual passengers on the Titanic. In this exercise you have to:

* **Preprocess** the Titanic dataset. Please follow these main steps:
  * **Load** the dataset.
  * **Split** the dataset into training and test sets using the **80/20** ratio. **Shuffle** the dataset and **stratify** it using the target variable.
  * Fill **null** values: `age` with the mean, `fare` with the median, and `embarked` with the most frequent value.
  * **Remove** columns that are *not informative for the final task* or that *contain information about the target variable*.
  * **Encoding**: in this exercise, the encoding of the dataset ***will be different from the one used in the previous labs.***
    * Follow the **step-by-step procedure** described in the exercise.


* Fit the **RandomForestClassifier()** with `n_estimators=500`
  * Calculate the predictions with `.predict()`
  * Calculate the `accuracy_score()`




## **Solution:**

In [None]:
import pandas as pd
df = pd.read_csv('titanic.csv')
df.head()

| | pclass | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest | survived |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Allen, Miss. Elisabeth Walton | female | 29.0 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2.0 | | St Louis, MO | 1 |
| 1 | 1 | Allison, Master. Hudson Trevor | male | 0.9167 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | 11.0 | | Montreal, PQ / Chesterville, ON | 1 |
| 2 | 1 | Allison, Miss. Helen Loraine | female | 2.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | | | Montreal, PQ / Chesterville, ON | 0 |
| 3 | 1 | Allison, Mr. Hudson Joshua Creighton | male | 30.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | | 135.0 | Montreal, PQ / Chesterville, ON | 0 |
| 4 | 1 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 1 | 2 | 113781 | 151.55 | C22 C26 | S | | | Montreal, PQ / Chesterville, ON | 0 |


In [None]:
df.drop(['name','ticket', 'cabin','boat', 'body', 'home.dest'], axis=1, inplace=True)

In [None]:
from sklearn.model_selection import train_test_split

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(df.drop('survived', axis=1), df['survived'], test_size=0.2, random_state=42, shuffle=True, stratify=df['survived'])

age_mean = X_train['age'].mean()
fare_median = X_train['fare'].median()
most_frequent_embarked = X_train['embarked'].mode()[0]

In [None]:
# Fill missing values with statistics computed on the training set only,
# so that no information leaks from the test set.
# Rebinding the DataFrames avoids in-place fills on column slices,
# which newer pandas versions warn about or silently ignore.
fill_values = {'age': age_mean, 'fare': fare_median, 'embarked': most_frequent_embarked}
X_train = X_train.fillna(fill_values)
X_test = X_test.fillna(fill_values)

In [None]:
X_train.head()

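| | pclass | sex | age | sibsp | parch | fare | embarked |
|---|---|---|---|---|---|---|---|
| 999 | 3 | female | 29.604316 | 0 | 0 | 7.75 | Q |
| 392 | 2 | female | 24.0 | 1 | 0 | 27.7208 | C |
| 628 | 3 | female | 11.0 | 4 | 2 | 31.275 | S |
| 1165 | 3 | male | 25.0 | 0 | 0 | 7.225 | C |
| 604 | 3 | female | 16.0 | 0 | 0 | 7.65 | S |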


In [None]:
X_test.head()

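| | pclass | sex | age | sibsp | parch | fare | embarked |
|---|---|---|---|---|---|---|---|
| 1028 | 3 | female | 29.604316 | 1 | 0 | 24.15 | Q |
| 1121 | 3 | male | 29.604316 | 1 | 1 | 22.3583 | C |
| 1155 | 3 | male | 29.604316 | 0 | 0 | 7.775 | S |
| 1251 | 3 | male | 30.5 | 0 | 0 | 8.05 | S |
| 721 | 3 | male | 36.0 | 0 | 0 | 7.4958 | S |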


In [None]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler, MinMaxScaler, FunctionTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
import numpy as np

column_transformer = ColumnTransformer(
    transformers=[
        ('pclass_one_hot', OneHotEncoder(), ['pclass']),
        ('sex_one_hot', OneHotEncoder(), ['sex']),
        ('age_scale', MinMaxScaler(), ['age']),
        ('sibsp_scale', MinMaxScaler(), ['sibsp']),
        ('parch_scale', MinMaxScaler(), ['parch']),
        ('fare_log_scale', Pipeline(steps=[
            ('log', FunctionTransformer(np.log1p, validate=False)),  # Apply log transformation
            ('scale', MinMaxScaler())]), ['fare']),
        ('embarked_one_hot', OneHotEncoder(), ['embarked']),
    ])


In [None]:
column_transformer.fit(X_train)

In [None]:
transformed_Xtrain = column_transformer.transform(X_train)
transformed_Xtest = column_transformer.transform(X_test)

### Encoding

Our **LIME explainer** (and most classifiers) takes in **numerical data**, **even if the features are categorical**.


* We thus **transform** **all of the string attributes into integers**, using sklearn's **LabelEncoder**.
* We *use a dictionary to save the correspondence between the integer values and the original strings* so we can present this later in the explanations.

1. **Identify** the **categorical columns** in the dataset and save them into a list.
  * They are the same for training and test data.
  * In this case, both `category` and `object` dtype represent categorical columns.
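
As a hint for step 1, here is a small sketch on a toy DataFrame (the frame `toy` and its values are made up for illustration, they are not the lab data). It shows one possible way to turn the `object`/`category` columns into a list of column indices; other approaches work just as well.

```python
import pandas as pd

# Toy frame: two numerical columns and two categorical ones (hypothetical values).
toy = pd.DataFrame({
    "pclass": [1, 3, 2],
    "sex": pd.Series(["female", "male", "male"], dtype="object"),
    "age": [29.0, 25.0, 40.5],
    "embarked": pd.Series(["S", "C", "Q"], dtype="category"),
})

# Names of the columns whose dtype is object or category.
cat_names = toy.select_dtypes(include=["object", "category"]).columns
# Positional indices of those columns.
cat_idx = [toy.columns.get_loc(c) for c in cat_names]
# Everything else is treated as numerical.
num_idx = [i for i in range(toy.shape[1]) if i not in cat_idx]

print(list(cat_names), cat_idx, num_idx)
```

Note that an integer column such as `pclass` is not picked up by a dtype check, so whether to treat it as categorical is a separate decision.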

In [None]:
# Display .info() for training and test datasets

In [None]:
# Identify categorical columns in train dataset --- they are the same for test dataset!!
# You have to indicate the index of the categorical columns

### Write your code here!



2. Create a dictionary of categorical_names. `categorical_names = {}`

3. Create a dictionary of the LabelEncoders. `le_dict = {}`

4. For each categorical feature, you have to:
  * Instantiate the **LabelEncoder()** from sklearn: `le = LabelEncoder()`
  * Fit the **LabelEncoder()** on the categorical feature of interest.
  * Transform the categorical feature of interest.
  * Keep track of the transformation as follows: `categorical_names[feature] = le.classes_`
  * Save the label encoder in the dictionary above as follows: `le_dict[feature] = le`
> Do this procedure **only for the train set**. Then, **for the test set**, apply only `.transform()`. Remember to use the right label encoder, saved in `le_dict`, for each categorical feature.
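
To see what the LabelEncoder does before applying it to the real columns, here is a minimal sketch on made-up values. The lists `toy_train` and `toy_test` and the key `"embarked"` are only illustrative; in the exercise you loop over your own categorical features.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical values of one categorical feature.
toy_train = ["S", "C", "S", "Q"]
toy_test = ["Q", "S"]

toy_names = {}     # feature -> original string labels
toy_encoders = {}  # feature -> fitted LabelEncoder

le = LabelEncoder()
train_codes = le.fit_transform(toy_train)  # fit on the train values only
test_codes = le.transform(toy_test)        # reuse the same mapping on test

toy_names["embarked"] = le.classes_
toy_encoders["embarked"] = le

print(list(le.classes_))      # classes are stored in sorted order
print(train_codes.tolist())   # integer code of each train value
print(test_codes.tolist())    # integer code of each test value
print(le.inverse_transform([0, 2]).tolist())  # back to the original strings
```

The position of a string in `le.classes_` is its integer code, which is why saving `le.classes_` is enough to show readable category names later.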







In [None]:
categorical_names = {}
le_dict = {}
for feature in categorical_cols:
  ## Continue with your code here

In [None]:
categorical_names

In [None]:
categorical_names_test = {}
for feature in categorical_cols:
  ## Continue with your code here

Now, **use a One-hot encoder**, so that our **classifier does not take the categorical features as continuous features.**


---




> ***We will use this encoder only for the classifier***, *not for the explainer* - and the reason is that the **explainer must make sure that a categorical feature only has one value.**



In [None]:
# Identify the numerical columns: save the index of each column in a list!

### Write your code here!

1. Instantiate the **OneHotEncoder()** to encode the categorical variables.
2. Apply the **MinMaxScaler()** to the numerical features.
3. Use the **ColumnTransformer()**.
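
The sketch below shows how these three pieces can fit together on a tiny, made-up, integer-encoded array. The names `toy_train`, `toy_test`, and `toy_ct`, as well as the choices `handle_unknown="ignore"` and `sparse_threshold=0`, are assumptions for this example and not requirements of the exercise.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy integer-encoded data: column 0 is a category code, column 1 is numerical.
toy_train = np.array([[0, 10.0], [1, 20.0], [2, 30.0], [1, 40.0]])
toy_test = np.array([[2, 25.0]])

toy_ct = ColumnTransformer(
    transformers=[
        ("one_hot", OneHotEncoder(handle_unknown="ignore"), [0]),  # categorical indices
        ("scale", MinMaxScaler(), [1]),                            # numerical indices
    ],
    sparse_threshold=0,  # always return a dense array
)

train_enc = toy_ct.fit_transform(toy_train)  # fit on train only
test_enc = toy_ct.transform(toy_test)        # reuse the fitted transformer on test

print(train_enc.shape)  # 3 one-hot columns + 1 scaled column
print(test_enc)
```

The column indices passed to the transformer are the same kind of index lists you built for the categorical and numerical columns.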



In [None]:
# Initialize OneHotEncoder

# Initialize MinMaxScaler


# Create ColumnTransformer

# Apply ColumnTransformer to your train data

# Apply ColumnTransformer to your test data

#### Fit the RandomForestClassifier with `n_estimators=500`

In [None]:
### Write your code here!

Calculate `y_pred` with the `.predict()` method of the fitted classifier

In [None]:
### Write your code here!

Calculate the Accuracy Score

In [None]:
### Write your code here!



---

## **Exercise 1b:**

Let's now explain the predictions obtained in Exercise 1a using **LIME**. Before starting the exercise you have to:

* Install the lime library by running the following command in a cell: `!pip install lime`
* Import the module for tabular data as:
`from lime import lime_tabular`


Then, the goal of this exercise is to explain an individual prediction of interest. To get you started in understanding how the library works, this exercise will be mostly guided. You have to:

* Fix the random seed.
* Instantiate the explainer as: `explainer = lime_tabular.LimeTabularExplainer(...)`.
  * Read the [documentation](https://lime-ml.readthedocs.io/en/latest/lime.html#module-lime.lime_tabular) and try to understand the role of each parameter.
  * In this case, the prediction function `predict_fn` has to be custom. *Follow the guide in the notebook.*
* Now, try to explain the `instance i=0` with `explainer.explain_instance`. *What can you infer? What is the predicted class for that instance?*
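
Why a custom prediction function? LIME calls it with a 2D numpy array of perturbed samples (with integer-encoded categorical features) and expects back one row of class probabilities per sample. The sketch below illustrates that contract without using `lime` itself; `toy_X`, `toy_y`, `toy_ct`, `toy_rf`, and `toy_predict_fn` are hypothetical stand-ins for your own data, transformer, and classifier.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy label-encoded training data (made-up values).
toy_X = pd.DataFrame({"sex": [0, 1, 1, 0, 1, 0],
                      "age": [22.0, 35.0, 58.0, 4.0, 41.0, 30.0]})
toy_y = np.array([1, 0, 0, 1, 0, 1])

toy_ct = ColumnTransformer(
    [("one_hot", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
     ("scale", MinMaxScaler(), ["age"])],
    sparse_threshold=0,
)
toy_rf = RandomForestClassifier(n_estimators=50, random_state=42)
toy_rf.fit(toy_ct.fit_transform(toy_X), toy_y)

def toy_predict_fn(x):
    # The explainer hands over a raw array; rebuild a DataFrame with the
    # original column names so the ColumnTransformer can select columns.
    frame = pd.DataFrame(x, columns=toy_X.columns)
    # One-hot encode and scale, then return the class probabilities.
    return toy_rf.predict_proba(toy_ct.transform(frame))

# Stand-in for the perturbed samples an explainer would generate.
perturbed = np.array([[0, 25.0], [1, 25.0], [1, 60.0]])
proba = toy_predict_fn(perturbed)
print(proba.shape)        # one row per sample, one column per class
print(proba.sum(axis=1))  # each row sums to 1
```

The one-hot encoding happens inside the function, so the explainer only ever sees one integer value per categorical feature.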

In [None]:
!pip install lime
from lime import lime_tabular

### **Explaining predictions**

Fix the random seed with `np.random.seed(42)`
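
A quick illustration of what fixing the seed buys you, using NumPy's global random generator only. The assumption here is that the explainer, when no `random_state` is passed to it, draws its perturbations from this global generator.

```python
import numpy as np

np.random.seed(42)
first = np.random.normal(size=3)

np.random.seed(42)   # same seed again
second = np.random.normal(size=3)

np.random.seed(7)    # a different seed
third = np.random.normal(size=3)

print(np.array_equal(first, second))  # identical draws
print(np.array_equal(first, third))   # different draws
```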

In [None]:
### Write your code here!

Instantiate the LimeTabularExplainer

In [None]:
explainer = lime_tabular.LimeTabularExplainer(X_train.values, # here the function requires a numpy array
                                              mode = 'classification',
                                              class_names=['not survived' , 'survived'],
                                              feature_names = X_train.columns,
                                              categorical_features=categorical_cols,
                                              categorical_names=categorical_names,
                                              kernel_width=3,
                                              verbose=True)


In [None]:
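def predict_fn(x):
  # LIME passes a numpy array of perturbed samples; the ColumnTransformer needs a DataFrame
  temporary_df = pd.DataFrame(x, columns=X_train.columns, dtype='object')
  print(temporary_df.head(2))  # inspect the first perturbed samples
  transf = ct.transform(temporary_df)  # `ct` is assumed to be the ColumnTransformer you fitted above
  pred = rf.predict_proba(transf).astype(float)  # `rf` is assumed to be your fitted RandomForestClassifier
  return pred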

In [None]:
i = 1
exp = explainer.explain_instance(X_test.values[i],
                                 predict_fn,
                                 num_samples=3)
exp.show_in_notebook()



---

## **Exercise 1c:**

**It's time to play with LIME!** 😀


The purpose of this exercise is to make you familiar with the LIME library and help you understand its main features.

* Instantiate **a new LimeTabularExplainer**.
* Use the **same predict_fn** as before.
* Call `explain_instance` for the instance `i=1`.
  * **Run** this **5 times** and **pay attention** to which features contributed to the prediction, and to what extent (the explanation).
  * *Did you always obtain the same explanation?* If not, *what is the missing step?*

* Let's now change the parameter `num_samples` to `num_samples=15`.
  * Can you guess the role of this parameter?
* The parameter `num_features` indicates the maximum number of features present in the explanation.
  * Try to vary this number between 1 and 6. Where can you see a change?
* Change the distance parameter to `distance_metric='l2'`.
  * Where is the distance used?
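
While you experiment, it may help to see how a distance and a kernel width can be combined into sample weights. The sketch below is modelled on the exponential kernel that `lime_tabular` uses by default, as far as we know; treat the exact formula, the function name `exponential_kernel`, and the toy points as assumptions.

```python
import numpy as np

def exponential_kernel(distances, kernel_width):
    # Larger distance -> smaller weight; larger width -> slower decay.
    return np.sqrt(np.exp(-(distances ** 2) / kernel_width ** 2))

instance = np.array([0.0, 0.0])
samples = np.array([[0.0, 0.0],   # the instance itself
                    [1.0, 0.0],   # a close sample
                    [3.0, 4.0]])  # a distant sample

# L2 (Euclidean) distance between each perturbed sample and the instance.
l2 = np.linalg.norm(samples - instance, axis=1)

narrow = exponential_kernel(l2, kernel_width=1.0)
wide = exponential_kernel(l2, kernel_width=3.0)
print(l2)
print(narrow)
print(wide)
```

With the narrow kernel the distant sample gets a weight close to zero, while the wide kernel (the `kernel_width=3` used above) still lets it influence the surrogate model.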

In [None]:
### Write your code here!