Fix SHAP prediction explanations bug #3221

Merged
merged 5 commits into from
Jan 12, 2022

Conversation

eccabay
Contributor

@eccabay eccabay commented Jan 11, 2022

Closes #3220

The proposed solution is to stop using the logit link function in our explanations altogether. The only difference between the two link functions is the space in which the final SHAP values are reported: with the logit link they sit in log-odds space, while with the identity link they fall between -1 and 1. Since the two methods do not otherwise differ significantly, I expect this change to have no significant impact.
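For context, here is a minimal sketch of why a predicted probability of exactly 1 is awkward in log-odds space, which is one plausible reading of the linked issue. It is not the project's code: the `logit` and `identity` helpers below are hypothetical stand-ins written with the standard library only, and the explicit handling of the endpoints is an assumption made so the example runs without raising.

```python
import math


def logit(p: float) -> float:
    """Map a probability to log-odds space (hypothetical helper).

    log(p / (1 - p)) has no finite value at p = 0 or p = 1, so the
    endpoints are returned as -inf / +inf instead of raising.
    """
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return math.log(p / (1.0 - p))


def identity(p: float) -> float:
    """Leave the probability unchanged (hypothetical helper)."""
    return p


# Compare the two spaces as the predicted probability approaches 1.
for p in (0.5, 0.9, 0.999999, 1.0):
    print(f"p={p}: identity={identity(p)}, logit={logit(p)}")
```

With the identity link a prediction of 1.0 stays a finite number, whereas its log-odds value is infinite, so any arithmetic built on it can produce `inf` or `nan`.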

@codecov

codecov bot commented Jan 11, 2022

Codecov Report

Merging #3221 (0b8dc24) into main (119c8c7) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3221     +/-   ##
=======================================
+ Coverage   99.8%   99.8%   +0.1%     
=======================================
  Files        326     326             
  Lines      31388   31395      +7     
=======================================
+ Hits       31297   31304      +7     
  Misses        91      91             
| Impacted Files | Coverage Δ |
|---|---|
| ...derstanding/prediction_explanations/_algorithms.py | 100.0% <ø> (ø) |
| ...s/prediction_explanations_tests/test_algorithms.py | 100.0% <100.0%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@freddyaboulton freddyaboulton left a comment

Looks good to me, @eccabay! Thank you for the fix. Can you please run the model understanding perf tests to be sure we're not accidentally introducing a regression?

Contributor

@angela97lin angela97lin left a comment

LGTM, thank you @eccabay! Agreed with Freddy's suggestion about running perf tests, and I'm almost tempted to say that perhaps the identity method could be easier to understand than the logit link function :)

Contributor

@bchen1116 bchen1116 left a comment

Agreed with both above! Nice fix!

Contributor

@chukarsten chukarsten left a comment

Thank you!

@eccabay
Contributor Author

eccabay commented Jan 12, 2022

@freddyaboulton great suggestion for perf testing! Results look pretty consistent to me, lmk what you think!
report.html.zip

@freddyaboulton
Contributor

@eccabay Thanks for running the tests! Send it!

@eccabay eccabay merged commit 76661ac into main Jan 12, 2022
@eccabay eccabay deleted the 91_float-to-nan branch January 12, 2022 21:17
@chukarsten chukarsten mentioned this pull request Jan 18, 2022
Development

Successfully merging this pull request may close these issues.

SHAP prediction explanations fail when classification predicts probability of 1
5 participants