-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Displaying the top_k features with largest shap value magnitudes #1374
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1374 +/- ##
=========================================
- Coverage 100.0% 100.0% -0.0%
=========================================
Files 213 213
Lines 13942 13940 -2
=========================================
- Hits 13935 13933 -2
Misses 7 7
Continue to review full report at Codecov.
|
d594e60
to
41761ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -452,7 +453,7 @@ | |||
"name": "python", | |||
"nbconvert_exporter": "python", | |||
"pygments_lexer": "ipython3", | |||
"version": "3.7.5" | |||
"version": "3.8.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesnt need to be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted!
|
||
# Sort the features s.t the top_k w the largest shap values are the firt | ||
# top_k elements | ||
tuples = sorted(tuples, key=_neg_shap_value_magnitude) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really cool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left a small comment about a comment typo and how we can simplify to not need to define _neg_shap_value_magnitude
but nothing blocking :d
def _neg_shap_value_magnitude(shap_tuple): | ||
return -abs(shap_tuple[0]) | ||
|
||
# Sort the features s.t the top_k w the largest shap values are the firt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo? the top_k w the largest shap values are the firt --> the top_k with the largest shap values are the first
features_to_display = reversed(tuples) | ||
else: | ||
features_to_display = tuples[-top_k:][::-1] + tuples[:top_k][::-1] | ||
def _neg_shap_value_magnitude(shap_tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this done strictly to sort the features w/ largest SHAP values first? If so, maybe we don't need this, and can use:
tuples = sorted(tuples, key=lambda x: x[0], reverse=True)
(Syntax not guaranteed ahaha)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally went with a def
instead of a lambda
because I thought our linting prevented lambdas but it prevents assigning to a lambda, not all lambdas.
So this is not allowed:
first_value_abs = lambda x: abs(x[0])
tuples = sorted(tuples, key=first_value, reverse=True)
But your suggestion is allowed! I guess the intention is to keep lambdas anonymous which makes sense.
TIL something hehe. I'll update to use the in-line lambda.
edd16df
to
a15f514
Compare
a15f514
to
080e232
Compare
080e232
to
2afa992
Compare
@@ -5,6 +5,7 @@ Release Notes | |||
* Enhancements | |||
* Added ability to freeze hyperparameters for AutoMLSearch :pr:`1284` | |||
* Added the index id to the ``explain_predictions_best_worst`` output to help users identify which rows in their data are included :pr:`1365` | |||
* The top_k features displayed in ``explain_predictions_*`` functions are now determined by the magnitude of shap values as opposed to the ``top_k`` largest and smallest shap values. :pr:`1374` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
.. warning:: | ||
|
||
**Breaking Changes** | ||
* The ``top_k`` and ``top_k_features`` parameters in ``explain_predictions_*`` functions now return ``k`` features as opposed to ``2 * k`` features :pr:`1374` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Pull Request Description
Fixes #1360
FYI @RezaAkhtar
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rst
to include this pull request by adding :pr:123
.