explain_instance - issue with predict_proba #430

Closed
sreeja-guha opened this issue Feb 11, 2020 · 1 comment

Comments

sreeja-guha commented Feb 11, 2020

Hello,
Thank you for the LIME code. I have been trying to use it to explain the predictions of my XGBoost model. The model is trained on Xtrain, which is a dataframe, and predict_proba() also accepts only dataframes (I had to convert series and arrays to dataframes for it to work). But explain_instance raises an error, and I am unable to fix it.

My code is below (including some of the modeling code for reference, with the full error at the end):

import lime.lime_tabular
import pandas as pd
from xgboost import XGBClassifier

xgb_v3 = XGBClassifier(learning_rate=0.3, max_depth=1, n_estimators=300, random_state=100, scale_pos_weight=152, nthread=-1)
%time xgb_v3.fit(Xtrain, Ytrain) # Xtrain is the dataframe of features, Ytrain is a series with 1/0 values

proba_train = xgb_v3.predict_proba(Xtrain)[:, 1] # gives prob(Y=1)

features = list(Xtrain.columns)

explainer = lime.lime_tabular.LimeTabularExplainer(training_data=Xtrain.values, mode='classification', training_labels=Ytrain, feature_names=features)

If I try the following, the error says the training data did not have the following fields: f256, f83, f22, ...

exp = explainer.explain_instance(Xtrain.iloc[0], predict_fn=xgb_v3.predict_proba, num_features=10)

I realized predict_proba does not accept the series Xtrain.iloc[0], so I converted it to a dataframe and checked that this works:

xgb_v3.predict_proba(pd.DataFrame(Xtrain.iloc[0]).transpose())

So I changed my explain_instance code to:

exp = explainer.explain_instance(pd.DataFrame(Xtrain.iloc[0]).transpose(), predict_fn=xgb_v3.predict_proba, num_features=10)

The above gives the error: TypeError: unhashable type: 'slice'. The full error message is below:

TypeError Traceback (most recent call last)
in
----> 1 exp = explainer.explain_instance(pd.DataFrame(Xtrain.iloc[0]).transpose(), predict_fn=xgb_v3.predict_proba, num_features=10)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\lime\lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
338 # Preventative code: if sparse, convert to csr format if not in csr format already
339 data_row = data_row.tocsr()
--> 340 data, inverse = self.__data_inverse(data_row, num_samples)
341 if sp.sparse.issparse(data):
342 # Note in sparse case we don't subtract mean since data would become dense

~\AppData\Local\Continuum\anaconda3\lib\site-packages\lime\lime_tabular.py in __data_inverse(self, data_row, num_samples)
535 first_row = data_row
536 else:
--> 537 first_row = self.discretizer.discretize(data_row)
538 data[0] = data_row.copy()
539 inverse = data.copy()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\lime\discretize.py in discretize(self, data)
111 else:
112 ret[:, feature] = self.lambdas[feature](
--> 113 ret[:, feature]).astype(int)
114 return ret
115

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2686 return self._getitem_multilevel(key)
2687 else:
-> 2688 return self._getitem_column(key)
2689
2690 def _getitem_column(self, key):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
2693 # get column
2694 if self.columns.is_unique:
-> 2695 return self._get_item_cache(key)
2696
2697 # duplicate columns & possible reduce dimensionality

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
2485 """Return the cached item, item represents a label indexer."""
2486 cache = self._item_cache
-> 2487 res = cache.get(item)
2488 if res is None:
2489 values = self._data.get(item)

TypeError: unhashable type: 'slice'
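
One reading of the traceback above, offered as a sketch rather than a confirmed diagnosis: the discretizer indexes its input with `ret[:, feature]`, which works on a NumPy array but not on a pandas DataFrame, so a DataFrame passed as `data_row` fails inside `DataFrame.__getitem__`. The names `row_frame` and `first_column` below are illustrative only and do not come from LIME.

```python
import numpy as np
import pandas as pd

# A one-row frame standing in for pd.DataFrame(Xtrain.iloc[0]).transpose().
row_frame = pd.DataFrame([[0.1, 0.2, 0.3]], columns=["f0", "f1", "f2"])


def first_column(data):
    # Same indexing pattern as `ret[:, feature]` in the traceback.
    return data[:, 0]


# A NumPy array supports positional [rows, cols] indexing.
print(first_column(row_frame.values))

# A DataFrame treats the key as a column-label lookup and rejects it.
# The exact exception class depends on the pandas and Python versions in use.
try:
    first_column(row_frame)
    frame_failed = False
except Exception as exc:
    frame_failed = True
    print(type(exc).__name__)
```

If this reading is right, passing a 1-D NumPy array such as `Xtrain.iloc[0].values` as `data_row` avoids this particular failure, although the feature-name error from the first attempt would still need to be handled in `predict_fn`.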

Owner

marcotcr commented Apr 3, 2020

See any of #293, #168, #120
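
A workaround often used for this kind of mismatch, sketched here under assumptions and not necessarily what the linked issues prescribe: LIME calls `predict_fn` with a 2-D NumPy array of perturbed samples, so a model fitted on a DataFrame can be given a thin wrapper that restores the column names. `make_predict_fn` is a hypothetical helper, not part of LIME or XGBoost.

```python
import numpy as np
import pandas as pd


def make_predict_fn(model, feature_names):
    """Wrap model.predict_proba so it accepts plain NumPy arrays.

    Hypothetical helper: assumes `model` was fitted on a DataFrame whose
    columns are `feature_names`, in that order.
    """
    columns = list(feature_names)

    def predict_fn(data):
        # Rebuild a DataFrame so the model sees the column names it was
        # fitted on; np.atleast_2d also tolerates a single 1-D row.
        frame = pd.DataFrame(np.atleast_2d(data), columns=columns)
        return np.asarray(model.predict_proba(frame))

    return predict_fn
```

With the objects from the original post, the call would then be `explainer.explain_instance(Xtrain.iloc[0].values, predict_fn=make_predict_fn(xgb_v3, features), num_features=10)`, passing the row as a 1-D array instead of a Series or DataFrame. This has not been run against the reporter's data.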

marcotcr closed this as completed Apr 3, 2020