# LIME for Model Interpretability

A popular approach to explaining model predictions is Local Interpretable Model-Agnostic Explanations (LIME), which has been widely adopted for model-agnostic local explainability. The LIME Python library provides human-friendly explanations for tabular, text, and image data and helps in interpreting black-box supervised machine learning models. LIME aims for two properties:

- Local fidelity: LIME does not try to replicate the behavior of the entire model. Instead, it approximates the model in the neighborhood of the data instance being predicted, so the explanation is local to that instance. This helps a non-technical user understand why the model made that particular decision.
- Global intuition: Although each explanation is local, LIME can also explain a representative set of instances to give end users a global perspective on how the model functions. SP-LIME (submodular pick) does this by selecting and explaining a collection of data instances. This will be covered in more detail in the next section.
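
The idea of local fidelity can be sketched in a few lines of NumPy. This is an illustrative approximation of the general technique, not the LIME library's implementation: the `black_box` function, the Gaussian perturbation scale, and the exponential kernel width are all assumptions chosen for the example.

```python
import numpy as np


def black_box(X):
    """Hypothetical nonlinear model: probability depends on the product x0 * x1."""
    z = X[:, 0] * X[:, 1]
    return 1.0 / (1.0 + np.exp(-z))


def local_surrogate(predict, instance, num_samples=2000, scale=0.3,
                    kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around `instance`."""
    rng = np.random.default_rng(seed)
    # 1. Perturb: sample points in the neighborhood of the instance.
    noise = rng.normal(0.0, scale, size=(num_samples, instance.size))
    samples = instance + noise
    # 2. Query the black box on the perturbed points.
    targets = predict(samples)
    # 3. Weight each sample by an exponential kernel on its distance.
    distances = np.linalg.norm(noise, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: scale rows by sqrt(weight), then solve.
    design = np.column_stack([np.ones(num_samples), noise])
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], targets * root_w,
                               rcond=None)
    # coef[0] approximates the prediction at the instance,
    # coef[1:] are the local per-feature slopes.
    return coef[0], coef[1:]


instance = np.array([1.0, -2.0])
intercept, slopes = local_surrogate(black_box, instance)
print("local intercept:", intercept)
print("local slopes:", slopes)
```

The slopes describe how the prediction changes near this one instance only. A different instance would generally give different slopes, which is why the explanation is called local.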

In [5]:
import pandas as pd

url = "/Users/maukanmir/Downloads/titanic.csv"

data = pd.read_csv(url)
data

Unnamed: 0,PassengerId,Pclass,Survived,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,1,1,"Allen,Miss. Elisabeth Walton",female,29,0,0,24160,211.3375,B5,S
1,2,1,1,"Allison,Master. Hudson Trevor",male,0.9167,1,2,113781,151.55,C22 C26,S
2,3,1,0,"Allison,Miss. Helen Loraine",female,2,1,2,113781,151.55,C22 C26,S
3,4,1,0,"Allison,Mr. Hudson Joshua Creighton",male,30,1,2,113781,151.55,C22 C26,S
4,5,1,0,"Allison,Mrs. Hudson J C (Bessie Waldo Daniels)",female,25,1,2,113781,151.55,C22 C26,S
...,...,...,...,...,...,...,...,...,...,...,...,...
1304,1305,3,0,"Zabour,Miss. Hileni",female,14.5,1,0,2665,14.4542,?,C
1305,1306,3,0,"Zabour,Miss. Thamine",female,?,1,0,2665,14.4542,?,C
1306,1307,3,0,"Zakarian,Mr. Mapriededer",male,26.5,0,0,2656,7.225,?,C
1307,1308,3,0,"Zakarian,Mr. Ortin",male,27,0,0,2670,7.225,?,C


In [4]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
import lime
import lime.lime_tabular
from lime import submodular_pick

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import LabelEncoder

In [6]:
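# '?' marks missing values in this file; np.nan is used because the np.NaN alias was removed in NumPy 2.0
data.replace('?', np.nan, inplace=True)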
data.drop(columns=['PassengerId', 'Name', 'Cabin', 'Ticket'], inplace = True)
data.dropna(inplace=True)
data['Age'] = data['Age'].astype('float')
data['Fare'] = data['Fare'].astype('float')

# Label Encoding features 
categorical_feat = ['Sex']

# Using label encoder to transform string categories to integer labels
le = LabelEncoder()
for feat in categorical_feat:
    data[feat] = le.fit_transform(data[feat]).astype('int')
data.head()

data = pd.get_dummies(data, columns=['Embarked'])
data.head()

Unnamed: 0,Pclass,Survived,Sex,Age,SibSp,Parch,Fare,Embarked_C,Embarked_Q,Embarked_S
0,1,1,0,29.0,0,0,211.3375,False,False,True
1,1,1,1,0.9167,1,2,151.55,False,False,True
2,1,0,0,2.0,1,2,151.55,False,False,True
3,1,0,1,30.0,1,2,151.55,False,False,True
4,1,0,0,25.0,1,2,151.55,False,False,True


In [7]:
features = data.drop(columns=['Survived'])
labels = data['Survived']
# Dividing the data into training-test set with 80:20 split ratio
x_train,x_test,y_train,y_test = train_test_split(features,labels,test_size=0.2, random_state=123)

In [8]:
model = XGBClassifier(n_estimators = 300, random_state = 123)
model.fit(x_train, y_train)

In [9]:
model.score(x_test, y_test)

0.7655502392344498

In [10]:
# LIME passes a 2-D array of perturbed samples and expects class probabilities back,
# so the wrapper must call predict_proba. Returning the method itself (without calling it)
# makes explain_instance fail with an AttributeError about a missing 'shape' attribute.
predict_fn = lambda x: model.predict_proba(x)

In [11]:
explainer = lime.lime_tabular.LimeTabularExplainer(data[features.columns].astype(int).values,
                                                   mode='classification',
                                                   class_names=['Did not Survive', 'Survived'],
                                                   training_labels=data['Survived'],
                                                   feature_names=features.columns)

In [13]:
i = 0
exp = explainer.explain_instance(data.loc[i,features.columns].astype(int).values, predict_fn, num_features=5)
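
For classifiers, `explain_instance` expects `predict_fn` to take a 2-D NumPy array of rows and return an array of class probabilities with one row per sample. A quick sanity check along the following lines can catch a wrapper that returns the method instead of calling it. This is a sketch: `ToyModel` and `check_predict_fn` are hypothetical helpers written for illustration and are not part of LIME or XGBoost.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for a fitted binary classifier with predict_proba."""

    def predict_proba(self, X):
        p = 1.0 / (1.0 + np.exp(-np.asarray(X, dtype=float).sum(axis=1)))
        return np.column_stack([1.0 - p, p])


def check_predict_fn(predict_fn, sample_rows, n_classes=2):
    """Return True if predict_fn maps (n, d) rows to an (n, n_classes) array."""
    out = predict_fn(np.asarray(sample_rows, dtype=float))
    if not hasattr(out, "shape"):
        # Typical cause: the wrapper returned a method instead of calling it.
        return False
    return out.shape == (len(sample_rows), n_classes)


toy = ToyModel()
rows = np.zeros((3, 4))  # three dummy rows with four features each

broken_fn = lambda x: toy.predict_proba       # returns the bound method itself
working_fn = lambda x: toy.predict_proba(x)   # calls it on the rows it receives

broken_ok = check_predict_fn(broken_fn, rows)
working_ok = check_predict_fn(working_fn, rows)
print("broken wrapper passes:", broken_ok)
print("working wrapper passes:", working_ok)
```

Running the same check on the real `predict_fn` with a few rows of `x_test.values` before calling `explain_instance` gives a clearer failure than the traceback raised inside LIME.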