
# Explainable AI workshop
# 0) Check setup for the Explainable AI workshop

Clone the GitHub repo 'XAI_workshop' into the folder 'XAI_workshop'.
This copies the data and the notebooks that we will use during the workshop.

In [None]:
# Remove any earlier copy first (-f avoids an error when the folder does not exist yet)
!rm -rf XAI_workshop
!git clone https://github.com/jamespaultg/XAI_workshop/

In [None]:
!pip install lime -q
!pip install shap -q


# 1) Load required libraries and data

In [None]:
# Load required libraries

import lime
import lime.lime_tabular

import pandas as pd
import numpy as np

# For converting textual categories to integer labels
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split, GridSearchCV # for creating train test split
from sklearn.metrics import accuracy_score, classification_report
#from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_curve, precision_recall_curve, auc, make_scorer, confusion_matrix, f1_score, fbeta_score, classification_report

# Our algorithms, ordered from the easiest to the hardest to interpret.
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost.sklearn import XGBClassifier

import lightgbm as lgb


print('libraries loaded')

In [None]:
# Reading the titanic data
df_titanic = pd.read_csv('XAI_workshop/data/titanic.csv')
assert len(df_titanic) == 891, 'There is an error. Please email james gnanasekaran with the details of the error message you get'
df_titanic.head()


# 2) Data preparation


In [None]:
df_titanic.dtypes

In [None]:
# Data preparation: replace every missing value with 0
# (a crude imputation; for example, a missing Age becomes 0)
df_titanic.fillna(0, inplace=True)

# label encoding textual data
le = LabelEncoder()
df_titanic['Sex_le'] = le.fit_transform(df_titanic['Sex'])

features = ["PassengerId", "Pclass", "Age", "SibSp", "Parch", "Fare", "Sex_le"]

preprocessor = ColumnTransformer([("numerical", "passthrough", features)])

X = df_titanic[features]
y = df_titanic['Survived']

# Use train_test_split to hold out 30% of the rows as a test set, stratified on the target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state = 42, stratify = y)

**!Question general 1!** What percentage of the passengers in the data survived?
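One way to check the class balance is sketched below on a small made-up Series, so the real answer stays an exercise. The name `toy_survived` and its values are hypothetical stand-ins for `y`; the same two calls should work on the real column.

```python
import pandas as pd

# Hypothetical stand-in for y = df_titanic['Survived'] (1 = survived, 0 = did not).
toy_survived = pd.Series([0, 1, 0, 0, 1, 0, 1, 0], name="Survived")

# For a 0/1 column the mean equals the share of ones.
survival_rate = toy_survived.mean()

# value_counts(normalize=True) reports the share of every class at once.
class_shares = toy_survived.value_counts(normalize=True)

print(f"Share of survivors: {survival_rate:.1%}")
print(class_shares)
```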

# 3) Creating pipelines

In [None]:
# Logistic Regression
lr_model = Pipeline([("preprocessor", preprocessor), 
                     ("model", LogisticRegression(class_weight="balanced", solver="liblinear", random_state=42))])

# Random Forest
rf_model = Pipeline([("preprocessor", preprocessor), 
                     ("model", RandomForestClassifier(class_weight="balanced", n_estimators=100, n_jobs=-1))])

# XGBoost
xgb_model = Pipeline([("preprocessor", preprocessor), 
                      # scale_pos_weight = (share of negatives) / (share of positives) balances the two classes
                      ("model", XGBClassifier(scale_pos_weight=(1 - y_train.mean()) / y_train.mean(), n_jobs=-1))])

In [None]:
# Logistic Regression
lr_model.fit(X_train, y_train)

y_pred = lr_model.predict(X_test)

print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

In [None]:
# Random Forest
rf_model.fit(X_train, y_train)

y_pred = rf_model.predict(X_test)

print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

In [None]:
# XGBoost

xgb_model.fit(X_train, y_train)

y_pred = xgb_model.predict(X_test)

print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

**!Question general 2!** The classification reports provide a lot of numbers, but what do they mean? What is the difference between precision and recall? And what could be a reason that the models show better f1-scores for class 0 than for class 1?
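As a small worked example of the definitions: precision asks how many of the predictions for a class were right, recall asks how many members of that class were found, and f1 is their harmonic mean. The labels below are made up, and `toy_true`, `toy_pred` and the helper `precision_recall_f1` are hypothetical names, not part of the workshop code.

```python
# Made-up labels for illustration (1 = survived, 0 = did not survive).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]


def precision_recall_f1(y_true, y_hat, positive):
    """Compute precision, recall and f1 when `positive` is the class of interest."""
    pairs = list(zip(y_true, y_hat))
    tp = sum(t == positive and p == positive for t, p in pairs)  # correctly predicted
    fp = sum(t != positive and p == positive for t, p in pairs)  # predicted, but wrong
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


p1, r1, f1_1 = precision_recall_f1(toy_true, toy_pred, positive=1)
p0, r0, f1_0 = precision_recall_f1(toy_true, toy_pred, positive=0)

print(f"class 1: precision={p1:.2f} recall={r1:.2f} f1={f1_1:.2f}")
print(f"class 0: precision={p0:.2f} recall={r0:.2f} f1={f1_0:.2f}")
```

Calling the helper once per class mirrors the per-class rows that `classification_report` prints.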

# 4) Lime

Lime (Local Interpretable Model-Agnostic Explanations) is set up in three steps:

1. Create a new explainer
  - my_explainer = Explainer()

2. Select an observation and create an explanation for it
  - observation = np.array([...]) -> note that this is a NumPy array, not a pandas DataFrame as used in the scikit-learn pipelines
  - my_explanation = my_explainer.explain_instance(observation, predict_function)

3. Use methods on the explanation to visualise the results
  - my_explanation.show_in_notebook()
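The three steps hide what an explainer does internally: it perturbs the observation, asks the black-box model for predictions, weights the perturbed points by how close they are to the observation, and fits a weighted linear model. The sketch below illustrates that idea with NumPy only. It is a simplification and not the lime library's implementation (lime samples using training-data statistics, can discretize features and selects a subset of features). The function `local_surrogate`, its parameters and the two toy black boxes are hypothetical names introduced for illustration.

```python
import numpy as np


def local_surrogate(predict_fn, observation, scale, n_samples=2000,
                    kernel_width=0.75, seed=0):
    """Simplified LIME-style explanation: returns (coefficients, weighted R^2)."""
    rng = np.random.default_rng(seed)
    n_features = observation.shape[0]

    # 1. Perturb: sample points around the observation.
    Z = observation + rng.normal(size=(n_samples, n_features)) * scale

    # 2. Ask the black box for its prediction at every perturbed point.
    target = predict_fn(Z)

    # 3. Weight each point by its proximity to the observation.
    dist = np.sqrt((((Z - observation) / scale) ** 2).sum(axis=1))
    width = kernel_width * np.sqrt(n_features)
    weights = np.exp(-(dist ** 2) / width ** 2)

    # 4. Fit a weighted linear model (weighted least squares via row scaling).
    A = np.column_stack([np.ones(n_samples), Z - observation])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], target * sw, rcond=None)

    # Weighted R^2: how well the linear surrogate mimics the black box locally.
    fitted = A @ coef
    ss_res = np.sum(weights * (target - fitted) ** 2)
    ss_tot = np.sum(weights * (target - np.average(target, weights=weights)) ** 2)
    return coef[1:], 1.0 - ss_res / ss_tot


point = np.array([0.0, 0.0])
unit_scale = np.array([1.0, 1.0])

# Toy black box 1: already linear, so the surrogate can match it exactly.
coef_lin, r2_lin = local_surrogate(
    lambda Z: 0.2 + 0.3 * Z[:, 0] - 0.1 * Z[:, 1], point, unit_scale)

# Toy black box 2: a pure interaction, which a linear surrogate cannot capture.
coef_int, r2_int = local_surrogate(
    lambda Z: 1.0 / (1.0 + np.exp(-Z[:, 0] * Z[:, 1])), point, unit_scale)

print("linear black box:     ", coef_lin, r2_lin)
print("interaction black box:", coef_int, r2_int)
```

The coefficients play the role of the feature weights shown in a Lime plot, and the weighted R^2 indicates how far the local linear picture can be trusted.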

In [None]:
from lime.lime_tabular import LimeTabularExplainer


explainer = LimeTabularExplainer(X_train.values,
                                 mode = "classification",
                                 feature_names = X_train.columns.values.tolist(),
                                 discretize_continuous = True,
                                 random_state = 42)

In [None]:
## Explanation for the different models
observation_to_explain = 23

observation = X_test.iloc[[observation_to_explain], :].values[0]
observation

In [None]:
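# Let's write a custom predict_proba function for our models:
from functools import partial

def custom_predict_proba(X, model):
    # Lime passes a plain NumPy array, while the pipelines select columns by name,
    # so restore the column names before predicting.
    X_df = pd.DataFrame(X, columns = X_train.columns)
    return model.predict_proba(X_df)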

lr_predict_proba = partial(custom_predict_proba, model=lr_model)
rf_predict_proba = partial(custom_predict_proba, model=rf_model)
xgb_predict_proba = partial(custom_predict_proba, model=xgb_model)

In [None]:
## Logistic regression
j = 10  # row of X_test to explain (note: a different row from `observation` above)
explanation = explainer.explain_instance(X_test.values[j], lr_predict_proba, num_features=5)
explanation.show_in_notebook(show_table=True, show_all=False)
print(explanation.local_exp)
print(explanation.score)

## There are three parts to the explanation:
- The leftmost part gives the prediction probabilities for class 0 and class 1.
- The middle part shows the 5 most important features. Because this is a binary classification problem there are two colours: features in orange support class 1 and features in blue support class 0. A condition such as Sex_le ≤ 0 describes the range that the feature's value falls in for this observation; the colour of its bar tells you which class that supports. The floating-point numbers on the horizontal bars represent the relative importance of these features.
- The rightmost part follows the same colour coding as the left and middle parts. It contains the actual values of the top 5 variables.

In [None]:
## XGBoost

explanation = explainer.explain_instance(observation, xgb_predict_proba, num_features=5)
explanation.show_in_notebook(show_table=True, show_all=False)
print(explanation.score)

**!Question LIME 1!** After examining a few of the observations in the test set, which variables are typically the main drivers behind the different models?  

**!Question LIME 2!** How is Lime different from more general feature importance methods? I.e. when would you use Lime and when would you use feature importance methods?

**!Question LIME 3!**
Why is it necessary to provide the Explainer with the entire table X_train? What is this table used for by the Explainer?

**!Question LIME 4!** Why can't we use the normal predict_proba function? And do we have to use the custom_predict_proba function? 

**!Question LIME 5!** What does explanation.score depict? And why is it not surprising for the score to be higher for the logistic regression model than for the XGBoost model?

**EXTRA HARD QUESTION** To start off easy we removed all the categorical variables from the data, but Lime is able to interpret categorical variables (as can be seen in the Explainer() documentation). Adjust the code so that categorical variables are taken into account. Start by using the OneHotEncoder instead of the LabelEncoder.
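As a possible starting point for the first step only, the sketch below shows a pipeline that one-hot encodes a textual column next to passthrough numerical columns. The rows are made up, and `toy`, `toy_preprocessor` and `toy_model` are hypothetical names; this is one way to wire it up under those assumptions, not the workshop's reference solution.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up rows that reuse column names from the Titanic data.
toy = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 2, 3, 2, 1],
    "Age": [22.0, 38.0, 26.0, 35.0, 28.0, 2.0, 14.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 13.00, 21.07, 30.07, 51.86],
    "Sex": ["male", "female", "female", "female", "male", "male", "female", "male"],
    "Survived": [0, 1, 1, 1, 0, 0, 1, 0],
})

numerical = ["Pclass", "Age", "Fare"]
categorical = ["Sex"]

# Numerical columns pass through unchanged; the textual column becomes
# one 0/1 column per category.
toy_preprocessor = ColumnTransformer([
    ("numerical", "passthrough", numerical),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), categorical),
])

toy_model = Pipeline([
    ("preprocessor", toy_preprocessor),
    ("model", LogisticRegression(max_iter=1000)),
])

X_toy = toy[numerical + categorical]
toy_model.fit(X_toy, toy["Survived"])

proba = toy_model.predict_proba(X_toy)
n_encoded_columns = toy_model.named_steps["preprocessor"].transform(X_toy).shape[1]
print(proba.shape, n_encoded_columns)
```

For Lime itself, `LimeTabularExplainer` accepts `categorical_features` and `categorical_names` arguments; connecting those to a one-hot pipeline is the remaining part of the exercise.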

# 5) Where can I find more resources?

- The original LIME paper, '"Why Should I Trust You?": Explaining the Predictions of Any Classifier': https://arxiv.org/abs/1602.04938
- GitHub repository: https://github.com/marcotcr/lime

- https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime
- https://blog.dominodatalab.com/shap-lime-python-libraries-part-2-using-shap-lime/
