Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Displaying the top_k features with largest shap value magnitudes #1374

Merged
merged 7 commits into from
Nov 6, 2020

Conversation

freddyaboulton
Copy link
Contributor

@freddyaboulton freddyaboulton commented Oct 30, 2020

Pull Request Description

Fixes #1360

FYI @RezaAkhtar


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@codecov
Copy link

codecov bot commented Oct 30, 2020

Codecov Report

Merging #1374 (2afa992) into main (4bf2f9b) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@            Coverage Diff            @@
##             main    #1374     +/-   ##
=========================================
- Coverage   100.0%   100.0%   -0.0%     
=========================================
  Files         213      213             
  Lines       13942    13940      -2     
=========================================
- Hits        13935    13933      -2     
  Misses          7        7             
Impacted Files Coverage Δ
...s/prediction_explanations_tests/test_explainers.py 100.0% <ø> (ø)
...tanding/prediction_explanations/_user_interface.py 100.0% <100.0%> (ø)
...ediction_explanations_tests/test_user_interface.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4bf2f9b...2afa992. Read the comment docs.

@freddyaboulton freddyaboulton marked this pull request as ready for review October 30, 2020 17:04
@freddyaboulton freddyaboulton force-pushed the 1360-sort-by-magnitude-of-shap branch 2 times, most recently from d594e60 to 41761ba Compare October 30, 2020 20:21
@freddyaboulton freddyaboulton self-assigned this Oct 30, 2020
Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -452,7 +453,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
"version": "3.8.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesnt need to be changed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted!


# Sort the features s.t the top_k w the largest shap values are the firt
# top_k elements
tuples = sorted(tuples, key=_neg_shap_value_magnitude)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really cool!

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a small comment about a comment typo and how we can simplify to not need to define _neg_shap_value_magnitude but nothing blocking :d

def _neg_shap_value_magnitude(shap_tuple):
return -abs(shap_tuple[0])

# Sort the features s.t the top_k w the largest shap values are the firt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo? the top_k w the largest shap values are the firt --> the top_k with the largest shap values are the first

features_to_display = reversed(tuples)
else:
features_to_display = tuples[-top_k:][::-1] + tuples[:top_k][::-1]
def _neg_shap_value_magnitude(shap_tuple):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this done strictly to sort the features w/ largest SHAP values first? If so, maybe we don't need this, and can use:

tuples = sorted(tuples, key=lambda x: x[0], reverse=True)

(Syntax not guaranteed ahaha)

Copy link
Contributor Author

@freddyaboulton freddyaboulton Nov 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally went with a def instead of a lambda because I thought our linting prevented lambdas but it prevents assigning to a lambda, not all lambdas.

So this is not allowed:

first_value_abs = lambda x: abs(x[0])
tuples = sorted(tuples, key=first_value, reverse=True)

But your suggestion is allowed! I guess the intention is to keep lambdas anonymous which makes sense.

TIL something hehe. I'll update to use the in-line lambda.

@CLAassistant
Copy link

CLAassistant commented Nov 4, 2020

CLA assistant check
All committers have signed the CLA.

@freddyaboulton freddyaboulton force-pushed the 1360-sort-by-magnitude-of-shap branch 4 times, most recently from edd16df to a15f514 Compare November 5, 2020 20:50
@freddyaboulton freddyaboulton merged commit 6a80b40 into main Nov 6, 2020
@@ -5,6 +5,7 @@ Release Notes
* Enhancements
* Added ability to freeze hyperparameters for AutoMLSearch :pr:`1284`
* Added the index id to the ``explain_predictions_best_worst`` output to help users identify which rows in their data are included :pr:`1365`
* The top_k features displayed in ``explain_predictions_*`` functions are now determined by the magnitude of shap values as opposed to the ``top_k`` largest and smallest shap values. :pr:`1374`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

.. warning::

**Breaking Changes**
* The ``top_k`` and ``top_k_features`` parameters in ``explain_predictions_*`` functions now return ``k`` features as opposed to ``2 * k`` features :pr:`1374`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dsherry dsherry mentioned this pull request Nov 24, 2020
@freddyaboulton freddyaboulton deleted the 1360-sort-by-magnitude-of-shap branch June 3, 2021 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

explain_predictions_ functions should return top_k features by magnitude of SHAP value
5 participants