LightGBM Predict Function returns the Logit Base value rather than row probabilities #1969

Closed
sarahboufelja opened this issue Jan 25, 2019 · 9 comments

@sarahboufelja

sarahboufelja commented Jan 25, 2019

Environment info

Operating System: Windows

CPU/GPU model:

C++/Python/R version: Python3

LightGBM version or commit hash: 2.2.2

Error

The LightGBM predict method produces the log-odds base value (as indicated in the SHAP force plot) rather than probabilities, and the output is exactly the same for all rows.

Reproducible examples

Predicting Probabilities

y_proba = pd.DataFrame(lgbm_ult.predict(test_x_enc, raw_score=False, num_iteration=None, pred_leaf=False, pred_contrib=True))

@guolinke
Collaborator

If you want the probabilities, I think you should set pred_contrib=False.

@sarahboufelja
Author

OK, thank you guolinke. So when pred_contrib=True, the last column corresponds to the "Base Estimate" rather than the element-wise prediction, right? Can I compute the probabilities myself by summing the SHAP values and applying a sigmoid function?
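
The reconstruction asked about here can be sketched as follows. This is a minimal illustration rather than LightGBM code: it assumes a binary classifier with the default logistic link, and that each row of the pred_contrib=True output holds the per-feature contributions followed by the base value in the last column. The helper names `sigmoid` and `proba_from_contrib` and the numbers are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def proba_from_contrib(contrib_rows):
    """Turn rows of [feature contributions..., base value] into probabilities.

    Summing a whole row (base value included) gives the raw score in
    log-odds space; the sigmoid maps that score to a probability.
    """
    return [sigmoid(sum(row)) for row in contrib_rows]


# Hypothetical contributions for two rows of a three-feature model.
# The last entry of each row is the base value, identical for every row.
contrib = [
    [0.40, -0.10, 0.25, -1.20],
    [-0.30, 0.05, -0.60, -1.20],
]
probabilities = proba_from_contrib(contrib)
print(probabilities)
```

With a real model, the rows would come from a call such as `lgbm_ult.predict(test_x_enc, pred_contrib=True)`. Note that the sum has to include the last column; summing only the per-feature contributions drops the base value and shifts every probability.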

@mm5631

mm5631 commented Apr 7, 2019

@sarahboufelja Setting pred_contrib=True makes predict return the SHAP values, with the last item in each row being the expected value (equivalent to appending shap.TreeExplainer(tree_estimator).expected_value to shap_values).

@slundberg There is some inconsistency here in how expected_value is handled in LightGBM vs. shap (LightGBM still includes the expected value in the SHAP array).
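
To make the layout difference concrete, here is a small sketch that splits a pred_contrib=True-style matrix into the two pieces the shap package reports separately. It assumes a single-output model (binary classification or regression). The function name `split_contrib` and the sample numbers are hypothetical; only the "last column is the expected value" layout comes from this thread.

```python
def split_contrib(contrib_rows):
    """Split rows of [feature contributions..., expected value] into two parts.

    Returns (shap_values, expected_value): shap_values keeps one entry per
    feature for each row, and expected_value is the shared last column.
    """
    shap_values = [row[:-1] for row in contrib_rows]
    expected_values = {row[-1] for row in contrib_rows}
    # The expected value belongs to the model, so it should not vary by row.
    if len(expected_values) != 1:
        raise ValueError("last column is not constant across rows")
    return shap_values, expected_values.pop()


# Hypothetical output for two rows of a three-feature model.
contrib = [
    [0.40, -0.10, 0.25, -1.20],
    [-0.30, 0.05, -0.60, -1.20],
]
shap_values, expected_value = split_contrib(contrib)
print(shap_values)
print(expected_value)
```

The constant last column is also why reading only that column gives the same number for every row, as in the original report.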

@StrikerRUS
Collaborator

@maximemerabet According to @slundberg's answer here #1350 (comment), it's more likely that new features and updates will be released only in the shap package, with no updates to the LightGBM codebase.

@mm5631

mm5631 commented Apr 18, 2019

@StrikerRUS Yup, that makes sense. I'm mostly trying to point out that LGBM's shap_values are not identical to shap's shap_values. I think some form of documentation around that would be preferable, or simply switching off the LGBM module if it won't be maintained.

@slundberg
Contributor

slundberg commented Apr 18, 2019

I think having a working version in LightGBM is good to keep, of course. But trying to keep the APIs perfectly aligned is tricky, since it would require deprecating LightGBM APIs at the moment. Noting it in the docstring seems like a good idea.

slundberg added a commit to slundberg/LightGBM that referenced this issue Apr 18, 2019
@StrikerRUS
Collaborator

@maximemerabet Thanks for your reasonable thoughts! It seems to me that it can be considered as already documented in the return shape:
https://github.com/Microsoft/LightGBM/blob/beb35d567de899b140bd61e174ef3b9ef5fd0769/python-package/lightgbm/sklearn.py#L598-L599

Disabling it completely is not an option, because someone may not want to install the shap package and may be happy with the existing functionality of, let's call it, the "trial" version included in LightGBM. Also, some users would never discover the existence of the shap package without this prediction mode.

@StrikerRUS
Collaborator

StrikerRUS commented Apr 18, 2019

Oh, while I was typing, a PR has been proposed! Great!

@mm5631

mm5631 commented Apr 18, 2019

@StrikerRUS You're absolutely right, retracting my earlier statement :) Thanks @slundberg

StrikerRUS pushed a commit that referenced this issue Apr 19, 2019
* Update doc string for pred_contrib

See comments at the end of #1969

* Update basic.py

* Update basic.py

* update doc strings

* update equals sign in doc string

* strip whitespace and gen rst

* strip whitespace
lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020