# Interpreting Machine Learning Models Using LIME (Local Interpretable Model-agnostic Explanations)




### I have used an XGBClassifier model on the "Did it rain in Seattle" dataset.

We start by importing the three core libraries.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

We then import a few sklearn utilities for splitting the dataset and for computing a metric, along with the XGBoost classifier.

In [2]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier # Importing classifier model of xgboost

Importing our dataset

In [3]:
df = pd.read_csv('seattleWeather_1948-2017.csv')
df.head()

|   | DATE | PRCP | TMAX | TMIN | RAIN |
|---|---|---|---|---|---|
| 0 | 1948-01-01 | 0.47 | 51 | 42 | True |
| 1 | 1948-01-02 | 0.59 | 45 | 36 | True |
| 2 | 1948-01-03 | 0.42 | 45 | 35 | True |
| 3 | 1948-01-04 | 0.31 | 45 | 34 | True |
| 4 | 1948-01-05 | 0.17 | 45 | 32 | True |


So the data consists of 4 feature columns and a target column, RAIN.
Our task is to predict whether it rained in Seattle on a given day.

In [4]:
df.shape

(25551, 5)

In [5]:
df.isnull().sum() # To check missing values

DATE    0
PRCP    3
TMAX    0
TMIN    0
RAIN    3
dtype: int64

Since we have 25,000+ rows, we discard the rows with missing values.

In [6]:
df.dropna(inplace=True) 
df.isnull().sum()

DATE    0
PRCP    0
TMAX    0
TMIN    0
RAIN    0
dtype: int64

For simplicity, I remove the DATE column.

In [7]:
df.pop('DATE')
print("Date column removed")

Date column removed


In [8]:
df['RAIN'] = df['RAIN'].astype(int)       # Label encode the target column: True -> 1, False -> 0

In [9]:
df.head()

|   | PRCP | TMAX | TMIN | RAIN |
|---|---|---|---|---|
| 0 | 0.47 | 51 | 42 | 1 |
| 1 | 0.59 | 45 | 36 | 1 |
| 2 | 0.42 | 45 | 35 | 1 |
| 3 | 0.31 | 45 | 34 | 1 |
| 4 | 0.17 | 45 | 32 | 1 |


In [10]:
target = df.pop('RAIN')

Splitting the data into train and test sets, with the train set holding 75% of the original data

In [11]:
x_train , x_test , y_train , y_test = train_test_split(df, target, train_size=0.75)
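
Because the split is random, the accuracy and the LIME explanations further down change from run to run. The sketch below shows one way to make a split repeatable. It assumes a fixed `random_state` and stratification on the target are acceptable, neither of which is part of the original call, and it uses a small synthetic frame (made-up values) in place of the weather data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the weather data (hypothetical values, not the real dataset)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "PRCP": rng.random(200),
    "TMAX": rng.integers(40, 90, size=200),
    "TMIN": rng.integers(20, 60, size=200),
})
demo_target = (demo["PRCP"] > 0.5).astype(int).rename("RAIN")

# random_state makes the split repeatable; stratify keeps the class ratio in both parts
demo_x_train, demo_x_test, demo_y_train, demo_y_test = train_test_split(
    demo, demo_target, train_size=0.75, random_state=42, stratify=demo_target
)
print(demo_x_train.shape, demo_x_test.shape)
```

With 200 rows and `train_size=0.75`, this yields 150 training rows and 50 test rows.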

### Creating the model

In [12]:
rfc = XGBClassifier()   # XGBoost classifier imported above (the name rfc is kept for the cells that follow)

In [None]:
rfc.fit(x_train,y_train)   # Fit the model to the training samples

In [None]:
accuracy_score(y_test,rfc.predict(x_test)) # calculating accuracy

### LIME for explaining the model

### Theory
LIME generates a new dataset consisting of perturbed samples and the corresponding predictions of the black box model. On this new dataset LIME then trains an interpretable model, which is weighted by the proximity of the sampled instances to the instance of interest. The interpretable model can be any simple, inspectable model, for example Lasso or a decision tree. The learned model should be a good approximation of the machine learning model's predictions locally, but it does not have to be a good global approximation. This kind of accuracy is also called local fidelity.
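
The same idea can be written as an optimisation problem. The notation below follows the original LIME paper and does not come from this notebook: $f$ is the black box model, $g$ is an interpretable model from a family $G$, $\pi_x$ is the proximity kernel around the instance $x$, and $\Omega(g)$ penalises the complexity of $g$.

$$
\xi(x) = \arg\min_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)
$$

A common choice, assuming a squared loss and an exponential kernel with distance $D$ and width $\sigma$, is

$$
\mathcal{L}(f, g, \pi_x) = \sum_{z} \pi_x(z)\,\bigl(f(z) - g(z)\bigr)^2,
\qquad
\pi_x(z) = \exp\!\left(-\frac{D(x, z)^2}{\sigma^2}\right)
$$

where the sum runs over the perturbed samples $z$. Local fidelity means this weighted loss is small near $x$, even if $g$ is a poor fit far away.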

In [None]:
import lime
from lime import lime_tabular

#### The recipe for training local surrogate models:

1. Select the instance for which you want an explanation of the black box prediction.
2. Perturb your dataset and get the black box predictions for these new points.
3. Weight the new samples according to their proximity to the instance of interest.
4. Train a weighted, interpretable model on the dataset with the variations.
5. Explain the prediction by interpreting the local model.
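
The recipe can be sketched from scratch in a few lines. This is a simplified illustration, not the `lime` library's internal code: `black_box` and `local_surrogate` are hypothetical names, the Gaussian perturbation and exponential kernel are assumptions chosen for brevity, and a ridge regression stands in for the interpretable model.

```python
import numpy as np
from sklearn.linear_model import Ridge

def black_box(X):
    """Hypothetical stand-in for a model's predicted probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - 1.0 * X[:, 1])))

def local_surrogate(instance, predict_fn, scale, num_samples=5000, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    # Step 2: perturb around the instance (Gaussian noise per feature) and query the black box
    samples = instance + rng.normal(size=(num_samples, instance.size)) * scale
    preds = predict_fn(samples)
    # Step 3: weight each sample by its proximity to the instance (exponential kernel)
    dist = np.linalg.norm((samples - instance) / scale, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Step 4: fit a weighted linear model on the perturbed data
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(samples - instance, preds, sample_weight=weights)
    # Step 5: the coefficients are the local explanation
    return surrogate.coef_

instance = np.array([0.2, 0.5])   # Step 1: the instance to explain
coefs = local_surrogate(instance, black_box, scale=np.array([1.0, 1.0]))
print(coefs)
```

For this toy model the first coefficient comes out positive and the second negative, matching the signs of the black box's dependence on each feature near the chosen instance.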


Creating a LIME tabular explainer.

Parameters: training samples, feature names, class names

In [None]:
explainer = lime_tabular.LimeTabularExplainer(x_train.values,feature_names=['PRCP','TMAX','TMIN'],class_names=['False','True'],discretize_continuous=True)
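
With `discretize_continuous=True`, LIME bins each numeric feature before fitting the local model (quartile bins are the default in `lime_tabular`). This is why the explanation entries read as value ranges rather than raw numbers. A rough NumPy sketch of quartile binning follows. The `prcp` values are made up for illustration and the code only approximates the idea, it is not taken from the library.

```python
import numpy as np

# Hypothetical precipitation values, for illustration only
prcp = np.array([0.0, 0.0, 0.01, 0.05, 0.10, 0.17, 0.31, 0.42, 0.47, 0.59])

# Quartile boundaries (25th, 50th and 75th percentiles) computed from the "training" values
edges = np.percentile(prcp, [25, 50, 75])

# Map each value to a bin index 0..3; bin k means edges[k-1] < value <= edges[k]
bins = np.searchsorted(edges, prcp)

print(edges)
print(bins)
```

Each bin then acts as a binary "is the value in this range" feature for the local linear model.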

We then call the explain_instance() method of the explainer we created.

Parameters: the test sample, the model's prediction function (here predict_proba), the number of features, and the number of top labels to consider

In [None]:
i = np.random.randint(0,x_test.shape[0])
exp = explainer.explain_instance(x_test.iloc[i].values,rfc.predict_proba,num_features=x_train.shape[1],top_labels=1)   # pass the row as a 1-D numpy array

In [None]:
exp.show_in_notebook()     # To display the explanation in notebook

In [None]:
fig = exp.as_pyplot_figure(label=exp.available_labels()[0])       # Plot the explanation for the top predicted label (the only one kept with top_labels=1)

In [None]:
exp.as_list(label=exp.available_labels()[0])     # List the explanation for the top predicted label (the only one kept with top_labels=1)

The important part to remember here is the numbers assigned to the features.

They represent the local weights assigned to each feature.

If we do not consider PRCP, the predicted probability for the True label would drop by about 0.56.

The values of PRCP and TMAX push the prediction towards True,
whereas the value of TMIN favours False.
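
To make the reading of these weights concrete, here is a small sketch that splits an explanation into supporting and opposing features. It assumes the `(feature description, weight)` pair format returned by `as_list()`. Only the 0.56 weight for PRCP comes from the text above. The feature ranges and the other two weights are invented for illustration.

```python
# Hypothetical explanation in the (feature description, weight) form returned by as_list();
# only the 0.56 weight for PRCP is taken from the text, everything else is made up.
explanation = [
    ("PRCP > 0.10", 0.56),
    ("TMAX <= 50.00", 0.03),
    ("TMIN > 45.00", -0.02),
]

# Positive weights push the prediction towards the explained label, negative ones away from it
supports = [(name, w) for name, w in explanation if w > 0]
opposes = [(name, w) for name, w in explanation if w < 0]

# Rank features by the size of their local effect, regardless of sign
ranked = sorted(explanation, key=lambda item: abs(item[1]), reverse=True)

print("Supports the label:", supports)
print("Opposes the label:", opposes)
print("Most influential:", ranked[0][0])
```

The sign gives the direction of each feature's local effect and the absolute value gives its strength, so PRCP dominates this hypothetical explanation.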

# End