
Regression: Reproducibility of Explanations #199

Closed
haimivan opened this issue Jun 13, 2018 · 10 comments

@haimivan

Hi,

with this code:

from sklearn.datasets import load_boston
import sklearn.ensemble
import sklearn.model_selection
import numpy as np
import lime.lime_tabular

random_seed = 42  # keep the integer itself: np.random.seed() returns None
np.random.seed(random_seed)

boston = load_boston()
rf = sklearn.ensemble.RandomForestRegressor(n_estimators=1000)
train, test, labels_train, labels_test = sklearn.model_selection.train_test_split(
    boston.data, boston.target, train_size=0.80, test_size=0.20,
    random_state=random_seed)
rf.fit(train, labels_train)

# Treat features with at most 10 distinct values as categorical.
categorical_features = np.argwhere(
    np.array([len(set(boston.data[:, x])) for x in range(boston.data.shape[1])]) <= 10
).flatten()

explainer = lime.lime_tabular.LimeTabularExplainer(train,
                                                   feature_names=boston.feature_names,
                                                   class_names=['price'],
                                                   categorical_features=categorical_features,
                                                   verbose=True,
                                                   mode='regression',
                                                   random_state=random_seed)

i = 25  # explain the same test instance in every iteration
for run in range(10):
    print("i:", i)
    exp = explainer.explain_instance(test[i], rf.predict, num_features=5)

    print("exp.as_list():")
    for name, weight in exp.as_list():
        print(name, "\t", weight)
    print("finished")

I would expect that explaining the same data point several times would give the same result. Unfortunately, this is not the case: I always get different values for Prediction_local and the feature weights.

How would it be possible to get reproducible results here?

Thanks in advance!

i: 25
Intercept 26.96624248904287
Prediction_local [15.57582084]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.7745992682999345
RM <= 5.89 	 -4.217471761634603
AGE > 93.65 	 -0.5797488651887968
330.00 < TAX <= 666.00 	 -0.4256846067675723
18.70 < PTRATIO <= 20.20 	 -0.392917145233727
finished
i: 25
Intercept 26.91518408347521
Prediction_local [15.74360485]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.033509513789926
RM <= 5.89 	 -4.168263639284199
18.70 < PTRATIO <= 20.20 	 -0.7960302258548482
AGE > 93.65 	 -0.7140419338438325
330.00 < TAX <= 666.00 	 -0.4597339207353438
finished
i: 25
Intercept 26.794511198789607
Prediction_local [15.6597915]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.568773001130888
RM <= 5.89 	 -4.011766502712668
AGE > 93.65 	 -0.6980400829714188
330.00 < TAX <= 666.00 	 -0.45845666385592176
18.70 < PTRATIO <= 20.20 	 -0.3976834519419227
finished
i: 25
Intercept 26.705471909398856
Prediction_local [16.14207568]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.201417169912826
RM <= 5.89 	 -3.887443753132728
18.70 < PTRATIO <= 20.20 	 -0.5607614136256827
CRIM > 2.98 	 -0.5207842475579768
330.00 < TAX <= 666.00 	 -0.3929896411805496
finished
i: 25
Intercept 27.145844938574687
Prediction_local [15.91921247]
Right: 14.781200000000075
exp.as_list():
LSTAT > 16.37 	 -5.457748583385723
RM <= 5.89 	 -4.208402849725137
18.70 < PTRATIO <= 20.20 	 -0.5717724846249358
AGE > 93.65 	 -0.528545356109088
CHAS=0 	 -0.4601631899866295
finished
@chirjeev94

I think this is due to the randomness of the perturbations around the test input (the input you need the explanation for). The way to reduce the variance is to calculate the mean feature importance over n (e.g. n = 100) runs.
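
For illustration, a minimal sketch of this averaging scheme (the helper name and its defaults are made up; it relies only on lime's exp.as_list()):

from collections import defaultdict

def mean_feature_importance(explainer, row, predict_fn, n_runs=100, num_features=5):
    # Average each feature's weight over n_runs independent explanations.
    totals, counts = defaultdict(float), defaultdict(int)
    for _ in range(n_runs):
        exp = explainer.explain_instance(row, predict_fn, num_features=num_features)
        for name, weight in exp.as_list():
            totals[name] += weight
            counts[name] += 1
    # A feature may not be selected in every run, so divide by its own count.
    return {name: totals[name] / counts[name] for name in totals}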

Please let me know your thoughts.

Thanks,
Chirjeev

@haimivan
Author

Hi @chirjeev94 ,

thanks for your reply.

My assumption would be that setting random_state=random_seed in lime.lime_tabular.LimeTabularExplainer() would avoid the randomness.

If this does not apply to a single explanation, I would expect to be able to avoid the randomness by supplying the random seed in explainer.explain_instance(). But this method does not have such a parameter.

I would like to have reproducibility, e.g. in order to exchange projects with colleagues.

@marcotcr
Owner

While setting random_state in the constructor makes it easy to reproduce a whole experiment exactly, it does not mean the random state is reset at each explanation, as you noted.
I don't quite understand the need for this kind of 'reproducibility'. An explanation is always a random artifact, and if it really does matter that some weight is 5.22 rather than 5.21, something is probably off.
I guess we could add random_seed to explain_instance, or you could manually reset it before calling it if it is really important.
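
For illustration, a minimal sketch of the manual-reset workaround, reusing the variables from the script in the original report (it assumes an integer random_state pins the explainer's internal random state at construction time):

import numpy as np
import lime.lime_tabular

i = 25
for run in range(10):
    np.random.seed(42)  # the manual reset mentioned above, for any global draws
    # Rebuilding the explainer restarts its internal random state, so every
    # explain_instance call below starts from the same point.
    explainer = lime.lime_tabular.LimeTabularExplainer(
        train, feature_names=boston.feature_names, class_names=['price'],
        categorical_features=categorical_features, mode='regression',
        random_state=42)
    exp = explainer.explain_instance(test[i], rf.predict, num_features=5)
    print(exp.as_list())  # identical output on every iteration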

@fnasiri-gannett

Hello all,

I've recently started using LIME and have run into the same reproducibility issue. As mentioned above, passing the random_state argument to lime.lime_tabular.LimeTabularExplainer doesn't seem to do the trick, nor does setting np.random.seed() right before calling explain_instance(). @marcotcr mentioned small differences in weights across runs, but in the case of my project I'm getting vastly different results. Some have suggested increasing the number of sample points as a workaround, but this can slow things down if you're explaining a large number of instances. I also tried hard-coding the random state in lime_tabular.__data_inverse, but no joy.

The way I understand LIME, if the random seed is set to a given number, then every time the code is run the same perturbed points are selected. Therefore the distance-dependent weight of each point, and everything else, should be the same as well. So we are regressing on the same points using the same parameters every time. I pass in sklearn's LinearRegression() as the model_regressor (and get the same unstable behavior with the built-in Ridge regressor as well). This is just a simple line going through the selected points, so every time the code is run one should get the same line and the same coefficients. Am I missing something?
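
For reference, a minimal sketch of that weighted regression step (the Gaussian kernel and its width are illustrative assumptions, not lime's exact internals):

import numpy as np
from sklearn.linear_model import LinearRegression

def local_surrogate(perturbed, instance, predict_fn, kernel_width=0.75):
    # Given a fixed set of perturbed points, everything below is deterministic.
    preds = predict_fn(perturbed)                      # black-box predictions
    dist = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-dist ** 2 / kernel_width ** 2)   # proximity weights
    model = LinearRegression()
    model.fit(perturbed, preds, sample_weight=weights)
    return model.intercept_, model.coef_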

@rahimentezari

> I've recently started using LIME and have run into the same reproducibility issue. […] So every time the code is run one should get the same line and the same coefficients. Am I missing something?

The same for me. Did you find a solution?

@fnasiri-gannett

I couldn't get it to work with this LIME package, so I ended up writing my own LIME implementation. Having said that, the correct way to use LIME is to do it in an iterative manner, where you increase the number of perturbation points at every iteration until the weights converge. It's computationally expensive and very susceptible to the curse of dimensionality, but that's how one is supposed to use it. In that case, you are less likely to run into the problem of getting different explanations every time you run the code.
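
A sketch of that iterative scheme, using explain_instance's num_samples parameter (the helper name, growth factor, and tolerance are illustrative assumptions):

def explain_until_stable(explainer, row, predict_fn, num_features=5,
                         start=1000, factor=2, tol=0.01, max_samples=64000):
    # Grow the number of perturbation points until the selected features
    # and their weights change by less than tol between iterations.
    prev, exp, n = None, None, start
    while n <= max_samples:
        exp = explainer.explain_instance(row, predict_fn,
                                         num_features=num_features,
                                         num_samples=n)
        cur = dict(exp.as_list())
        if prev is not None and set(cur) == set(prev) and all(
                abs(cur[k] - prev[k]) < tol for k in cur):
            break
        prev, n = cur, n * factor
    return exp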

@rahimentezari

> I couldn't get it to work with this LIME package, so I ended up writing my own LIME implementation. […]

@fnasiri-gannett Dear Farshad, would you please send me your LIME implementation, the one that produces the same output (explanations) for the same inputs?

@fnasiri-gannett

Sadly this was done in a business context and I'm not permitted to share code.

@rahimentezari

> Sadly this was done in a business context and I'm not permitted to share code.

I see. Would you please give a more detailed hint on how to solve this problem?

@fnasiri-gannett

My code creates a "cloud" of random points around the instance. The cloud is created within an n-sphere (https://en.wikipedia.org/wiki/N-sphere), where "n" is the number of dimensions of your decision space. In spherical coordinates, you need n-1 angles along with the distance from the origin in order to uniquely define the position of a given point.
So for each of the perturbation points, you need to create n-1 random angles (the constraints are in the wiki article) and pick its distance from the origin (in your case, the instance) at random from the interval (0, r_max). It's a one-liner with numpy, and you can set the random seed beforehand so you'll get the same numbers every time. Once you have done so, transform the perturbation points back to Cartesian coordinates using the formulation in the Wikipedia article. Then classify using your trained model and finally run a regression line through them.
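
A sketch of that sampling step with numpy (the function name is made up; uniform angles and radii follow the description above, though note they do not yield a volume-uniform cloud):

import numpy as np

def perturbation_cloud(instance, n_points, r_max, seed=42):
    # n-1 random angles plus a random radius per point, as described above.
    rng = np.random.RandomState(seed)  # fixed seed -> identical cloud every run
    n = len(instance)                  # dimensionality of the decision space
    angles = rng.uniform(0.0, np.pi, size=(n_points, n - 1))
    angles[:, -1] *= 2.0               # the last angle ranges over [0, 2*pi)
    radius = rng.uniform(0.0, r_max, size=n_points)
    # Spherical -> Cartesian, using the n-sphere formulas from the article.
    points = np.empty((n_points, n))
    sin_prod = np.ones(n_points)
    for k in range(n - 1):
        points[:, k] = radius * sin_prod * np.cos(angles[:, k])
        sin_prod *= np.sin(angles[:, k])
    points[:, -1] = radius * sin_prod
    return np.asarray(instance) + points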
