
How to interpret LIME results? #113

Closed
jorgecarleitao opened this issue Oct 18, 2017 · 19 comments

@jorgecarleitao

jorgecarleitao commented Oct 18, 2017

I am considering using LIME, and I am struggling a bit to understand exactly what it outputs.

I posted a question on Stack Exchange with an MCVE, but maybe this is more suitable here.

Consider the following code, which fits a logistic regression to a logistic process and uses LIME to explain a new example.

import numpy as np
import lime.lime_tabular
from sklearn.linear_model import LogisticRegression

# generate a logistic latent variable from `a` and `b` with coef. 1, 1
data = []
for t in range(100000):
    a = 1 - 2 * np.random.random()
    b = 1 - 2 * np.random.random()
    noise = np.random.logistic()
    c = int(a + b + noise > 0)  # to predict
    data.append([a, b, c])
data = np.array(data)

x = data[:, :-1]
y = data[:, -1]

# fit Logistic regression without regularization (C=inf)
classifier = LogisticRegression(C=1e10)
classifier.fit(x, y)

print(classifier.coef_)

# "explain" with LIME
explainer = lime.lime_tabular.LimeTabularExplainer(
                x, mode='classification',
                feature_names=['a', 'b'])

explanation = explainer.explain_instance(np.array([1, 1]), classifier.predict_proba, num_samples=100000)
print(explanation.as_list())

output:

[[ 0.9981159   0.99478328]]  # print(classifier.coef_)
[('a > 0.50', 0.219), ('b > 0.50', 0.219)] # print(explanation.as_list())

The coefficients are approximately [[1, 1]] because we are fitting a logistic regression to a logistic process with those coefficients.

What do the values 0.219... mean? Can they be related to any quantity in this example?

@marcotcr
Owner

I think this is a bit confusing because you're using numerical data, but the default parameters in LimeTabularExplainer discretize the data into quartiles. It is harder to interpret explanations for numerical features for the following reasons:

  1. The values may be in different ranges. We can always standardize the data, but then the meaning of the coefficients changes
  2. It's hard to think about double negatives (i.e. negative weight for a negative feature = positive contribution)
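For reference, a minimal sketch (reusing x from the script above) of where a threshold like 0.50 comes from: the default discretizer buckets each feature by its quartiles in the training data, which can be approximated with np.percentile.

import numpy as np

# quartile boundaries of feature `a` in the training data
print(np.percentile(x[:, 0], [25, 50, 75]))  # roughly [-0.5, 0.0, 0.5] for a ~ U(-1, 1)
# so "a > 0.50" in the explanation denotes the top quartile bucket of `a`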

Anyway, let's consider the meaning of the explanations in the discretized version.
What ('a > 0.50', 0.219) is saying is that on average (considering the training data distribution), having a in this bucket raises the prediction by 0.219. Consider the following:

import itertools
other_values = np.arange(-1, .49, .01)
current_pred = classifier.predict_proba([[1, 1]])[0, 1]
current_pred - classifier.predict_proba(np.array(list(itertools.product(other_values, [1]))))[:, 1].mean()
# output: (0.21064350778528229)

Roughly, what I'm doing above is integrating over other values of a while keeping b fixed. On average, if we do that, the output moves by 0.211. Think of doing that for both features, while weighting by locality - that is what the coefficients in the explanation are getting at.
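For concreteness, an illustrative sketch of that locality weighting; this is not the library's exact code, the 0.75 kernel width is an assumption for illustration, and it reuses the classifier from the script above:

import numpy as np

event = np.array([1.0, 1.0])
a_samples = 1 - 2 * np.random.random(100000)             # a ~ U(-1, 1), as in the data
points = np.column_stack([a_samples, np.ones(100000)])   # keep b fixed at 1
distances = np.abs(points[:, 0] - event[0])
weights = np.exp(-(distances ** 2) / 0.75 ** 2)          # assumed exponential kernel
diffs = classifier.predict_proba([event])[0, 1] - classifier.predict_proba(points)[:, 1]
print(np.average(diffs, weights=weights))                # locality-weighted average shift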

You could set discretize_continuous=False in the LimeTabularExplainer constructor. This example would still be tricky, because there are many equivalent linear models that fit the data equally well with different intercepts, and LIME will pick an arbitrary one (so the weights are not necessarily going to be the same, even if the approximation is almost perfect).

@jorgecarleitao
Author

jorgecarleitao commented Oct 19, 2017

Thanks Marco for the input. It did help.

As a follow-up, here is the relative error of LIME for an increasing number of samples, where the relative error compares the explanation weight obtained with a given number of samples x against the average obtained by sampling a from the opposite side of the threshold, a < 0.5 (also x samples), as you did.

[figure: relative error vs. number of samples]

Would you expect the relative error to go to zero? If not, what variables would I need to increase for the error to go to zero? If none, what approximations explain the discrepancy of ~5%?

Figure generated with the code below:

import numpy as np
import lime.lime_tabular
from sklearn.linear_model import LogisticRegression

data = []
for t in range(1000000):
    a = 1 - 2 * np.random.random()
    b = 1 - 2 * np.random.random()
    noise = np.random.logistic()
    c = int(a + b + noise > 0)  # to predict
    data.append([a, b, c])
data = np.array(data)

x = data[:, :-1]
y = data[:, -1]

classifier = LogisticRegression(C=1e10)
classifier.fit(x, y)

print(classifier.coef_)

explainer = lime.lime_tabular.LimeTabularExplainer(x, mode='classification', feature_names=['a', 'b'])

event = np.array([1, 1])

current_pred = classifier.predict_proba([event])[0, 1]

result = []
for samples in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]:
    samples = samples * 1000
    print(samples)
    # increase number of samples for the explanation
    explanation = explainer.explain_instance(event, classifier.predict_proba, num_samples=samples).as_list()

    # freeze b and sample `a` from the interval `-1 < a < 0.50`
    import itertools
    other_values = -1 + 1.5 * np.random.random(samples)  # a_i from U(-1,0.5)
    other_values = np.array(list(itertools.product(other_values, [event[1]])))  # as a matrix [[a_1, b], [a_2, b], ...]
    residuals = current_pred - classifier.predict_proba(other_values)[:, 1]

    relative_error = (explanation[0][1] - residuals.mean())/residuals.mean()

    result.append([samples, relative_error])
result = np.array(result)

import matplotlib.pyplot as plt
plt.figure()
plt.plot(result[:, 0], result[:, 1])
plt.ylabel('relative error')
plt.xlabel('samples')
plt.xscale('log')
plt.savefig('test.png')

Output to stdout:

[[ 0.99826413  1.00231008]]
1000
2000
4000
8000
...

@marcotcr
Owner

I would not expect the error to go to zero, because the model is using continuous data while LIME is approximating it with the discretized version. Also, there is the locality weighting, i.e. samples near the point being explained are weighted more heavily than samples far away.
The error would go to zero if the model were actually using discretized data and if you set the kernel width to infinity.
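A hedged sketch of the second condition (assuming the kernel_width constructor parameter and the x defined in the script above); a very large kernel width makes the locality weights essentially uniform:

wide_explainer = lime.lime_tabular.LimeTabularExplainer(
    x, mode='classification', feature_names=['a', 'b'],
    kernel_width=1e6)  # effectively no locality weighting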

@jorgecarleitao
Author

jorgecarleitao commented Oct 20, 2017

Marco, thank you for the explanation and for taking the time to read and comment on this, much appreciated!

To double-check that we understand everything, these are the facts so far:

  • F1: on a simple logistic regression (as defined in the code above), there is a systematic error of ~5% for the event [1,1].

These are the hypotheses on the table:

  • H1: the LIME result of 0.219 in ('a > 0.50', 0.219) above is roughly how much the probability increases when a > 0.5 compared to a in the other quartiles. [code in your first comment]
  • H2: the systematic error (F1) is explained by two factors:
  1. LIME is discretizing the data
  2. there is a kernel, so it is not a simple average over the other quartiles

Let's assume that "roughly" in H1 means within 10%, i.e. if LIME's systematic error is within 10% for points in the quartile, then H1 is not rejected.

To test H1, we can repeat the same experiment as we did for F1 on different events. Under H1, the error remains "roughly" small (10%).

Below I show the same errors as before for different events (in the legend, a single run per point):

[figure: relative error vs. number of samples, for events (0.60, 0.60) through (1.00, 1.00)]
(the code I used is at the end of this comment, in case someone wants to double-check)

We see that there are events with errors of 120%, way above the 10% threshold. Only the event (1,1) is below 10% (reproducing my first comment). I conclude from this result that hypothesis H1 is false. In other words, regardless of H2, the hypothesis H1 that the LIME result of 0.219 in ('a > 0.50', 0.219) is how much the probability increases when a > 0.5 is not supported by the results in the figure above.

Maybe the interpretation is different? Or do you think that LIME is not applicable to this case? If it is not, why would we expect it to be applicable to continuous data? (Logistic regression is the simplest classification example I know of...)

Have you tested LIME on this type of example? I went through the tests folder and haven't found a test of the actual values. I was also not able to find anything in the arXiv paper.

If you think we should switch to the non-discretized version, please let me know; I would happily repeat this for the non-discretized case (with an equivalent test).


import numpy as np
import lime.lime_tabular
from sklearn.linear_model import LogisticRegression

data = []
for t in range(1000000):
    a = 1 - 2 * np.random.random()
    b = 1 - 2 * np.random.random()
    noise = np.random.logistic()
    c = int(a + b + noise > 0)  # to predict
    data.append([a, b, c])
data = np.array(data)

x = data[:, :-1]
y = data[:, -1]

classifier = LogisticRegression(C=1e10)
classifier.fit(x, y)

explainer = lime.lime_tabular.LimeTabularExplainer(x, mode='classification', feature_names=['a', 'b'])

print(classifier.coef_)

import matplotlib.pyplot as plt
plt.figure()
for i in range(1, 6):
    event = np.array([0.5 + 0.1*i, 0.5 + 0.1*i])

    current_pred = classifier.predict_proba([event])[0, 1]

    print(event, current_pred)

    result = []
    for samples in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]:
        samples = samples * 1000
        print(samples)
        # increase number of samples for the explanation
        explanation = explainer.explain_instance(event, classifier.predict_proba, num_samples=samples).as_list()

        # freeze b and sample `a` from the interval `-1 < a < 0.50`
        import itertools
        other_values = -1 + 1.5 * np.random.random(samples)  # a_i from U(-1,0.5)
        other_values = np.array(list(itertools.product(other_values, [1])))  # as a matrix [[a_1, b], [a_2, b], ...]
        residuals = current_pred - classifier.predict_proba(other_values)[:, 1]

        print(explanation, residuals.mean())

        relative_error = (explanation[0][1] - residuals.mean())/residuals.mean()

        result.append([samples, relative_error])
    result = np.array(result)

    plt.plot(result[:, 0], result[:, 1], 'o-', label='(%.2f, %.2f)' % tuple(event))
plt.ylabel('relative error')
plt.xlabel('samples')
plt.xscale('log', basex=2)
plt.legend()
plt.savefig('test.png')

@marcotcr
Owner

Discretization does make everything tricky. In your code, you're computing the residual with respect to the prediction of the event. However, LIME is taking the event to be a > 0.5 and b > 0.5, not two specific values. So, instead of:

current_pred = classifier.predict_proba([event])[0, 1]

We should have

lime_event = (.5 +  .5 * np.random.random(samples * 2)).reshape(-1, 2) # a > .5 and b > .5
current_pred = classifier.predict_proba(lime_event)[:,1].mean()

Also, the value of b for LIME is b > 0.5, so instead of:

other_values = -1 + 1.5 * np.random.random(samples)  # a_i from U(-1,0.5)
other_values = np.array(list(itertools.product(other_values, [1])))  # as a matrix [[a_1, b], [a_2, b], ...]

let's have:

other_values = -1 + 1.5 * np.random.random(samples)  # a_i from U(-1,0.5)
other_b = (.5 +  .5 * np.random.random(samples)) # b_i from U(.5, 1)
other_values = np.vstack((other_values, other_b)).T

These two would explain why your error goes up the further you are from a=1, I think.
Also, explanation.as_list() returns the features in decreasing order of importance, so relative error should be:

relative_error = (dict(explanation)['a > 0.50'] - residuals.mean())/residuals.mean()

Doing these results in a relative error that is ~constant with respect to the events (around 10%).
Anyway, note that the explanation also has an intercept. What I meant by 'roughly' before is that the weight for 'a > 0.50' is going to be close to:
explanation.intercept[1] + dict(explanation.as_list())['b > 0.50'] - classifier.predict_proba(other_values)[:, 1]
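A hedged sketch of that bookkeeping, reusing the classifier and the discretized explainer from the first script: for the explained instance all binary features are 'on', so the surrogate's value there is the intercept plus the sum of the weights, which should be roughly comparable to the model's prediction.

exp = explainer.explain_instance(np.array([1, 1]), classifier.predict_proba,
                                 num_samples=100000)
weights = dict(exp.as_list())
surrogate_at_event = exp.intercept[1] + weights['a > 0.50'] + weights['b > 0.50']
print(surrogate_at_event, classifier.predict_proba([[1, 1]])[0, 1])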

@jorgecarleitao
Author

Thank you @marcotcr for the explanation. That does indeed explain the error above:

[figure: relative error vs. number of samples for the same events, after the corrections]
(code below)

To summarize: the interpretation of

[('a > 0.50', 0.219), ('b > 0.50', 0.219)]

is

the probability of class 1 increases by 0.219 when a is in [0.5, 1] compared to a in [-1, 0.5], averaged over b in [0.5, 1].

Doesn't this imply that the LIME result only depends on the quartiles the event belongs to? For example, isn't it possible for LIME to provide the same explanation for two events whose outcomes are opposite (e.g. the model gives different predictions for different values within the same quartile)?

import numpy as np
import lime.lime_tabular
from sklearn.linear_model import LogisticRegression

data = []
for t in range(1000000):
    a = 1 - 2 * np.random.random()
    b = 1 - 2 * np.random.random()
    noise = np.random.logistic()
    c = int(a + b + noise > 0)  # to predict
    data.append([a, b, c])
data = np.array(data)

x = data[:, :-1]
y = data[:, -1]

classifier = LogisticRegression(C=1e10)
classifier.fit(x, y)

explainer = lime.lime_tabular.LimeTabularExplainer(x, mode='classification', feature_names=['a', 'b'])

import matplotlib.pyplot as plt
plt.figure()
for i in range(1, 6):
    event = np.array([0.5 + 0.1*i, 0.5 + 0.1*i])
    print(event)

    result = []
    for samples in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]:
        samples = samples * 1000

        lime_events = (.5 + .5 * np.random.random(samples * 2)).reshape(-1, 2)  # a > .5 and b > .5
        current_pred = classifier.predict_proba(lime_events)[:, 1].mean()
        del lime_events

        # increase number of samples for the explanation
        explanation = explainer.explain_instance(event, classifier.predict_proba, num_samples=samples).as_list()

        # freeze b and sample `a` from the interval `-1 < a < 0.50`
        other_values = -1 + 1.5 * np.random.random(samples)  # a_i from U(-1,0.5)
        other_b = (.5 + .5 * np.random.random(samples))  # b_i from U(.5, 1)
        other_values = np.vstack((other_values, other_b)).T
        residuals = current_pred - classifier.predict_proba(other_values)[:, 1]

        relative_error = (dict(explanation)['a > 0.50'] - residuals.mean())/residuals.mean()

        print(samples, relative_error)
        result.append([samples, relative_error])
    result = np.array(result)

    plt.plot(result[:, 0], result[:, 1], 'o-', label='(%.2f, %.2f)' % tuple(event))
plt.ylabel('relative error')
plt.xlabel('samples')
plt.xscale('log', basex=2)
plt.legend()
plt.savefig('test.png')

@marcotcr
Owner

Yes, that is possible. That is a problem with discretization: we lose the ability to differentiate things within the discretized bins. Obvious solutions to this involve using more bins (deciles, entropy-based discretization) or not discretizing at all. Not discretizing at all has its own drawbacks, which is why I decided to leave discretization in as the default.
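For reference, a hedged sketch of the alternatives mentioned above (parameter names as in recent lime versions; treat as illustrative, reusing x and y from the scripts above):

# deciles instead of quartiles
explainer_deciles = lime.lime_tabular.LimeTabularExplainer(
    x, mode='classification', feature_names=['a', 'b'], discretizer='decile')

# entropy-based discretization (needs training labels)
explainer_entropy = lime.lime_tabular.LimeTabularExplainer(
    x, mode='classification', feature_names=['a', 'b'],
    training_labels=y, discretizer='entropy')

# no discretization at all
explainer_raw = lime.lime_tabular.LimeTabularExplainer(
    x, mode='classification', feature_names=['a', 'b'],
    discretize_continuous=False)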

@jorgecarleitao
Author

@marcotcr. I understand. Regardless of this particular point, LIME remains a useful tool for interpretability. Thank you and the other authors for taking the time to develop and publish it, provide source code to reproduce its results, and thank you especially for clarifying the points raised here. Definitely a great example of how science should be done!

I will close this as resolved.

@marcotcr
Owner

Thanks for the thoughtful questions!

@jorgecarleitao
Author

jorgecarleitao commented Nov 3, 2017

OK, I returned to this, now for the non-discretized version, essentially trying to do the same as before. My expectation, based on the results from the discretized version, is that LIME approximates the partial derivative of the function with respect to each input. However, I may be mistaken, because I am getting a ~20% systematic error between LIME and the partial derivative.

[figure: relative error between the LIME weight for a and the partial derivative, vs. number of samples]

import numpy as np
import lime.lime_tabular
from sklearn.linear_model import LogisticRegression

data = []
for t in range(1000000):
    a = 1 - 2 * np.random.random()
    b = 1 - 2 * np.random.random()
    noise = np.random.logistic()
    c = int(a + b + noise > 0)  # to predict
    data.append([a, b, c])
data = np.array(data)

x = data[:, :-1]
y = data[:, -1]

classifier = LogisticRegression(C=1e10)
classifier.fit(x, y)

print(classifier.coef_)

explainer = lime.lime_tabular.LimeTabularExplainer(x, mode='classification', feature_names=['a', 'b'], discretize_continuous=False)

event = np.array([0.7, 0.7])

current_pred = classifier.predict_proba([event])[0, 1]

result = []
for samples in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]:
    samples = samples * 1000
    # increase number of samples for the explanation
    explanation = explainer.explain_instance(event, classifier.predict_proba, num_samples=samples).as_list()

    # freeze b and sample `a_i` (`a` + N(0, 0.001), `b`) to compute partial derivatives
    a_i = event[0] + np.random.normal(scale=0.001, size=samples)

    a_i = np.array([[x, event[1]] for x in a_i])  # as an array of events [[a_1, b], [a_2, b], ...]

    # partial derivatives df/da
    d_a = (current_pred - classifier.predict_proba(a_i)[:, 1])/(event[0] - a_i[:, 0])

    # confirmed that d_a is approximately d_a1 below, the analytical derivative of predict_proba
    # exp = np.exp(np.dot(event, np.array([1, 1])))
    # d_a1 = 1 * exp / (1 + exp)**2

    relative_error = (dict(explanation)['a'] - d_a.mean())/d_a.mean()

    result.append([samples, relative_error])
result = np.array(result)


import matplotlib.pyplot as plt
plt.figure()
plt.plot(result[:, 0], result[:, 1])
plt.ylabel('relative error')
plt.xlabel('samples')
plt.xscale('log')
plt.savefig('test.png')

@marcotcr
Owner

Maybe we should move away from the partial derivative interpretation, and back to the original meaning of an explanation: a linear model that approximates the black box model locally.
The additional complication is that we scale the data inside the explainer if the data is not discretized.
Thus, for some x, if you take

scaled_x = (x - explainer.scaler.mean_) / explainer.scaler.scale_
fhat = exp.intercept[1] + dict(exp.as_list())['a'] * scaled_x[:, 0] + dict(exp.as_list())['b'] * scaled_x[:, 1]
f = classifier.predict_proba(x)[:, 1]

We should have that (f - fhat).mean() is small, in particular for x that are close to the original instance.
Does this make sense?
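A hedged sketch of that check, reusing the non-discretized explainer, classifier and event from the script above (the neighborhood scale of 0.05 is an arbitrary choice for illustration):

exp = explainer.explain_instance(event, classifier.predict_proba, num_samples=20000)
w = dict(exp.as_list())
x_near = event + np.random.normal(scale=0.05, size=(1000, 2))  # points close to the event
scaled = (x_near - explainer.scaler.mean_) / explainer.scaler.scale_
fhat = exp.intercept[1] + w['a'] * scaled[:, 0] + w['b'] * scaled[:, 1]
f = classifier.predict_proba(x_near)[:, 1]
print(np.abs(f - fhat).mean())  # should be small near the explained instance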

@jorgecarleitao
Author

jorgecarleitao commented Nov 22, 2017

It makes sense.

The reason I approached it from the partial derivatives is that, given a point x' = x + h, f(x') - f(x) = (x' - x)^T Df(x) + O(h^2) (multivariable Taylor series, first order around x). In this view, if the local regressor is a simple linear regression (without lasso), shouldn't the coefficients be equal to the partial derivatives of f?

@marcotcr
Owner

Let's call f'(x') the gradient of f at x'.
Let x be your 'event' above; f(x) is then current_pred.
The Taylor expansion gives us the following linear approximation:
f(x') ≈ f(x) + f'(x').dot(x' - x)

LIME is trying to find w such that (ignoring the local weighting for now):
f(x') = intercept + w.dot(x')

I don't see why w should be equal to f'(x') in this case. The Taylor expansion as an approximation requires us to compute f'(x') for every point we're predicting.

Also, I think f'(x) should be w * exp(x * w) / (exp(x * w) + 1)^2 (with w the coefficient), whereas you had exp(x * w) / (exp(x * w) + 1)^2, if I understood the code correctly.
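For reference, a hedged sketch of the analytic gradient of the fitted model (predict_proba already includes the fitted intercept, and sigma'(z) = sigma(z)(1 - sigma(z)) = exp(z)/(1 + exp(z))^2):

def fitted_gradient(clf, point):
    # gradient of P(y=1 | x) for a fitted sklearn LogisticRegression
    p = clf.predict_proba([point])[0, 1]
    return clf.coef_[0] * p * (1 - p)

print(fitted_gradient(classifier, np.array([0.7, 0.7])))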

@EoinKenny

EoinKenny commented Jun 21, 2018

Hi guys, I hope I'm not hijacking this thread, but it's kind of relevant to interpreting LIME.

If I want to print the coefficients that the local LIME model learned, is there a way to do that? Thanks in advance.

@marcotcr
Owner

exp.as_list() or exp.as_map()
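A hedged usage sketch, reusing an explainer, classifier and event from the scripts above:

exp = explainer.explain_instance(event, classifier.predict_proba)
print(exp.as_list())  # [(feature description, weight), ...]
print(exp.as_map())   # {class label: [(feature index, weight), ...]}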

@hanzigs

hanzigs commented Jun 6, 2019

I am finding that the set of variables we get from the explainer keeps changing when we re-run rf_explainer.explain_instance on the same test data. May I please know whether we have to set a seed or something, or why the variable importance changes? Thanks

@marcotcr
Owner

marcotcr commented Jun 7, 2019

see #67, #119, #199.
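Those issues come down to fixing the explainer's own random state; a hedged sketch (using the random_state constructor parameter, as also used later in this thread):

explainer = lime.lime_tabular.LimeTabularExplainer(
    x, mode='classification', feature_names=['a', 'b'], random_state=42)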

@hanzigs

hanzigs commented Jun 12, 2019

Hi marcotcr,
Thanks for the reply. I am still getting different values:

# My explainer
model_explainer = lime.lime_tabular.LimeTabularExplainer(
    X_train.values[:, :], mode='classification', verbose=True,
    training_labels=data_norm['class'], feature_names=feature_names,
    random_state=np.random.seed(42))

# My function
def explain(exp, instance, predict_fn):
    np.random.seed(42)
    exp_data = exp.explain_instance(instance, predict_fn)
    return exp_data.as_list()

# My call
explain(model_explainer, X_test.values[1], model.predict_proba)

Still, I am getting different values:

In [146]: explain(model_explainer,X_test.values[1], model.predict_proba)
Intercept 0.9839002793712486
Prediction_local [0.98390028]
Right: 0.9936423124350471
Out[146]: 
[('Dec_Reason_CLA_URNED <= 0.00', 0.0),
 ('Property_Acceptable_Y <= 1.00', 0.0),
 ('Trading_State_NT <= 0.00', 0.0),
 ('Decision_Reason_REQUEST_F <= 0.00', 0.0),
 ('Product_Type_P <= 0.00', 0.0),
 ('Product_Type_I <= 0.00', 0.0),
 ('Fax_Number <= 0.00', 0.0),
 ('Product_Type_D <= 0.00', 0.0),
 ('Valuation_Acceptable_Y <= 1.00', 0.0),
 ('Product_Type_B <= 0.00', 0.0)]

In [147]: explain(model_explainer,X_test.values[1], model.predict_proba)
Intercept 0.9849758158162301
Prediction_local [0.98497582]
Right: 0.9936423124350471
Out[147]: 
[('Product_Type_B <= 0.00', 0.0),
 ('Product_Type_8 <= 0.00', 0.0),
 ('Valuation_Acceptable_Y <= 1.00', 0.0),
 ('Home_Phone <= 0.00', 0.0),
 ('Permanent_Resident_Y <= 1.00', 0.0),
 ('Product_Type_P <= 0.00', 0.0),
 ('Dec_Reason_PRE_CLAPOL <= 0.00', 0.0),
 ('Dec_Reason_FR_ERT <= 0.00', 0.0),
 ('Product_Type_C <= 0.00', 0.0),
 ('Product_Type_I <= 0.00', 0.0)]

Also, I am getting all values as zeros, which I don't understand. May I have some help please?

@nooraliraeeji

Hello,
I have an LSTM model; it is a classification model.
I want to use explain_instance. What should I pass as the predict function (predict_fn) to explain_instance?
Thanks
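explain_instance expects a classifier function that maps a 2-D numpy array of perturbed rows to class probabilities. A hedged sketch of such a wrapper for a Keras-style LSTM binary classifier; lstm_model, timesteps, n_features and flat_row are hypothetical names standing in for your own setup:

import numpy as np

def predict_fn(rows_2d):
    # reshape LIME's flat rows into the (samples, timesteps, features) shape the LSTM expects
    x3d = rows_2d.reshape(-1, timesteps, n_features)   # hypothetical shapes
    p1 = lstm_model.predict(x3d).reshape(-1, 1)        # probability of class 1 (sigmoid output)
    return np.hstack([1.0 - p1, p1])                   # one column per class

# explanation = explainer.explain_instance(flat_row, predict_fn)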
