
Regression: Reproducibility of Explanations #199

Closed
haimivan opened this issue Jun 13, 2018 · 10 comments

@haimivan

Hi,

with this code:

from sklearn.datasets import load_boston
import sklearn.ensemble
import sklearn.model_selection
import numpy as np
import lime.lime_tabular

random_seed = 42  # keep the integer itself: np.random.seed() returns None
np.random.seed(random_seed)

boston = load_boston()
rf = sklearn.ensemble.RandomForestRegressor(n_estimators=1000)
train, test, labels_train, labels_test = sklearn.model_selection.train_test_split(
    boston.data, boston.target, train_size=0.80, test_size=0.20,
    random_state=random_seed)
rf.fit(train, labels_train)

# Treat features with at most 10 distinct values as categorical.
categorical_features = np.argwhere(
    np.array([len(set(boston.data[:, x])) for x in range(boston.data.shape[1])]) <= 10
).flatten()

explainer = lime.lime_tabular.LimeTabularExplainer(train,
                                                   feature_names=boston.feature_names,
                                                   class_names=['price'],
                                                   categorical_features=categorical_features,
                                                   verbose=True,
                                                   mode='regression',
                                                   random_state=random_seed)

i = 25  # explain the same test instance in every iteration
for run in range(10):
    print("i:", i)
    exp = explainer.explain_instance(test[i], rf.predict, num_features=5)

    print("exp.as_list():")
    for name, weight in exp.as_list():
        print(name, "\t", weight)
    print("finished")

I would expect that explaining the same data point several times would give the same result. Unfortunately, this is not the case: I always get different values for Prediction_local and the feature weights.

How would it be possible to get reproducible results here?

Thanks in advance!

i: 25
Intercept 26.96624248904287
Prediction_local [15.57582084]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.7745992682999345
RM <= 5.89 	 -4.217471761634603
AGE > 93.65 	 -0.5797488651887968
330.00 < TAX <= 666.00 	 -0.4256846067675723
18.70 < PTRATIO <= 20.20 	 -0.392917145233727
finished
i: 25
Intercept 26.91518408347521
Prediction_local [15.74360485]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.033509513789926
RM <= 5.89 	 -4.168263639284199
18.70 < PTRATIO <= 20.20 	 -0.7960302258548482
AGE > 93.65 	 -0.7140419338438325
330.00 < TAX <= 666.00 	 -0.4597339207353438
finished
i: 25
Intercept 26.794511198789607
Prediction_local [15.6597915]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.568773001130888
RM <= 5.89 	 -4.011766502712668
AGE > 93.65 	 -0.6980400829714188
330.00 < TAX <= 666.00 	 -0.45845666385592176
18.70 < PTRATIO <= 20.20 	 -0.3976834519419227
finished
i: 25
Intercept 26.705471909398856
Prediction_local [16.14207568]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.201417169912826
RM <= 5.89 	 -3.887443753132728
18.70 < PTRATIO <= 20.20 	 -0.5607614136256827
CRIM > 2.98 	 -0.5207842475579768
330.00 < TAX <= 666.00 	 -0.3929896411805496
finished
i: 25
Intercept 27.145844938574687
Prediction_local [15.91921247]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.457748583385723
RM <= 5.89 	 -4.208402849725137
18.70 < PTRATIO <= 20.20 	 -0.5717724846249358
AGE > 93.65 	 -0.528545356109088
CHAS=0 	 -0.4601631899866295
finished
@chirjeev94

I think this is due to the randomness of the perturbations around the test input (the input you need the explanation for). The way to reduce the variance is to calculate the mean feature importance over n (e.g. n = 100) runs.
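
For illustration, a minimal sketch of this averaging scheme (the helper name and its defaults are made up; it relies only on lime's exp.as_list()):

from collections import defaultdict

def mean_feature_importance(explainer, row, predict_fn, n_runs=100, num_features=5):
    # Average each feature's weight over n_runs independent explanations.
    totals, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_runs):
        exp = explainer.explain_instance(row, predict_fn, num_features=num_features)
        for name, weight in exp.as_list():
            totals[name] += weight
            counts[name] += 1
    # A feature may not be selected in every run, so divide by its own count.
    return {name: totals[name] / counts[name] for name in totals}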

Please let me know your thoughts.

Thanks,
Chirjeev

@haimivan
Author

Hi @chirjeev94 ,

thanks for your reply.

My assumption would be that setting random_state=random_seed in lime.lime_tabular.LimeTabularExplainer() would avoid the randomness.

If this does not apply to a single explanation, I would expect to be able to avoid the randomness by supplying the random seed in explainer.explain_instance(). But this method does not have such a parameter.

I would like to have reproducibility, e.g. in order to exchange projects with colleagues.

@marcotcr
Owner

While setting random_state in the constructor makes it easy to reproduce a whole experiment exactly, it does not mean the random state is reset at each explanation, as you noted.
I don't quite understand the need for this kind of 'reproducibility'. An explanation is always a random artifact, and if it really does matter that some weight is 5.22 rather than 5.21, something is probably off.
I guess we could add random_seed to explain_instance, or you could manually reset it before calling it if it is really important.
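
For illustration, a minimal sketch of the manual-reset workaround, reusing the variables from the script in the original report (it assumes an integer random_state pins the explainer's internal random state at construction time):

import numpy as np
import lime.lime_tabular

i = 25
for run in range(10):
    np.random.seed(42)  # the manual reset mentioned above, for any global draws
    # Rebuilding the explainer restarts its internal random state, so every
    # explain_instance call below starts from the same point.
    explainer = lime.lime_tabular.LimeTabularExplainer(
        train, feature_names=boston.feature_names, class_names=['price'],
        categorical_features=categorical_features, mode='regression',
        random_state=42)
    exp = explainer.explain_instance(test[i], rf.predict, num_features=5)
    print(exp.as_list())  # identical output on every iteration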

@fnasiri-gannett

Hello all,

I've recently started using LIME and have run into the same reproducibility issue. As mentioned above, passing the random_state argument to lime.lime_tabular.LimeTabularExplainer doesn't seem to do the trick, nor does setting np.random.seed() right before calling explain_instance(). @marcotcr mentioned small differences in weights across runs, but in the case of my project I'm getting vastly different results. Some have suggested increasing the number of sample points as a workaround, but this can slow things down if you're explaining a large number of instances. I also tried hard-coding the random state in lime_tabular.__data_inverse, but no joy.

The way I understand LIME, if the random seed is set to a given number, then every time the code is run the same perturbed points are selected. Therefore the distance-dependent weight of each point, and everything else, should be the same as well. So we are regressing on the same points using the same parameters every time. I pass in sklearn's LinearRegression() as the model_regressor (and get the same unstable behavior with the built-in Ridge regressor as well). This is just a simple line going through the selected points, so every time the code is run one should get the same line and the same coefficients. Am I missing something?
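
For reference, a minimal sketch of that weighted regression step (the Gaussian kernel and its width are illustrative assumptions, not lime's exact internals):

import numpy as np
from sklearn.linear_model import LinearRegression

def local_surrogate(perturbed, instance, predict_fn, kernel_width=0.75):
    # Given a fixed set of perturbed points, everything below is deterministic.
    preds = predict_fn(perturbed)                      # black-box predictions
    dist = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-dist ** 2 / kernel_width ** 2)   # proximity weights
    model = LinearRegression()
    model.fit(perturbed, preds, sample_weight=weights)
    return model.intercept_, model.coef_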

@rahimentezari

> I've recently started using LIME and have run into the same reproducibility issue. […] So every time the code is run one should get the same line and the same coefficients. Am I missing something?

The same for me. Did you find a solution?

@fnasiri-gannett

I couldn't get it to work with this LIME package, so I ended up writing my own LIME implementation. Having said that, the correct way to use LIME is to do it in an iterative manner, where you increase the number of perturbation points at every iteration until the weights converge. It's computationally expensive and very susceptible to the curse of dimensionality, but that's how one is supposed to use it. In that case, you are less likely to run into the problem of getting different explanations every time you run the code.
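
A sketch of that iterative scheme, using explain_instance's num_samples parameter (the helper name, growth factor, and tolerance are illustrative assumptions):

def explain_until_stable(explainer, row, predict_fn, num_features=5,
                         start=1000, factor=2, tol=0.01, max_samples=64000):
    # Grow the number of perturbation points until the selected features
    # and their weights change by less than tol between iterations.
    prev, exp, n = None, None, start
    while n <= max_samples:
        exp = explainer.explain_instance(row, predict_fn,
                                         num_features=num_features,
                                         num_samples=n)
        cur = dict(exp.as_list())
        if prev is not None and set(cur) == set(prev) and all(
                abs(cur[k] - prev[k]) < tol for k in cur):
            break
        prev, n = cur, n * factor
    return exp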

@rahimentezari

> I couldn't get it to work with this LIME package, so I ended up writing my own LIME implementation. […]

@fnasiri-gannett Dear Farshad, would you please send me your LIME implementation, the one that produces the same output (explanations) for the same inputs?

@fnasiri-gannett

Sadly this was done in a business context and I'm not permitted to share code.

@rahimentezari

> Sadly this was done in a business context and I'm not permitted to share code.

I see. Would you please give a more detailed hint on how to solve this problem?

@fnasiri-gannett

My code creates a "cloud" of random points around the instance. The cloud is created within an n-sphere (https://en.wikipedia.org/wiki/N-sphere), where "n" is the number of dimensions of your decision space. In spherical coordinates, you need n-1 angles along with the distance from the origin in order to uniquely define the position of a given point.
So for each of the perturbation points, you need to create n-1 random angles (the constraints are in the wiki article) and pick its distance from the origin (in your case, the instance) at random from the interval (0, r_max). It's a one-liner with numpy, and you can set the random seed beforehand so you'll get the same numbers every time. Once you have done so, transform the perturbation points back to Cartesian coordinates using the formulation in the Wikipedia article. Then classify using your trained model and finally run a regression line through them.
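
A sketch of that sampling step with numpy (the function name is made up; uniform angles and radii follow the description above, though note they do not yield a volume-uniform cloud):

import numpy as np

def perturbation_cloud(instance, n_points, r_max, seed=42):
    # n-1 random angles plus a random radius per point, as described above.
    rng = np.random.RandomState(seed)  # fixed seed -> identical cloud every run
    n = len(instance)                  # dimensionality of the decision space
    angles = rng.uniform(0.0, np.pi, size=(n_points, n - 1))
    angles[:, -1] *= 2.0               # the last angle ranges over [0, 2*pi)
    radius = rng.uniform(0.0, r_max, size=n_points)
    # Spherical -> Cartesian, using the n-sphere formulas from the article.
    points = np.empty((n_points, n))
    sin_prod = np.ones(n_points)
    for k in range(n - 1):
        points[:, k] = radius * sin_prod * np.cos(angles[:, k])
        sin_prod *= np.sin(angles[:, k])
    points[:, -1] = radius * sin_prod
    return np.asarray(instance) + points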
