Prediction Explanations: Force Plots #2157
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##            main   #2157    +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%
=======================================
  Files        281     283      +2
  Lines      25057   25233    +176
=======================================
+ Hits       24957   25133    +176
  Misses       100     100
=======================================
```
```
features (pd.DataFrame): Dataframe of features - needs to correspond to data the pipeline was fit on.
training_data (pd.DataFrame): Training data the pipeline was fit on.
    For non-tree estimators, we need a sample of training data for the KernelSHAP algorithm.
return_explainer (bool): Whether to return the explainer used in the SHAP computation.
```
The intent behind this is to calculate the expected value.
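To make the role of the expected value concrete, here is an illustrative sketch (not evalml or SHAP library code): a SHAP explainer's `expected_value` is the model's base value, roughly the mean prediction over the background data, and a force plot renders each row's SHAP values as pushes away from that baseline. The function names below are hypothetical.

```python
def expected_value(background_predictions):
    """Baseline of a force plot: mean model output over background data."""
    return sum(background_predictions) / len(background_predictions)

def reconstructed_prediction(base, shap_values):
    """A row's prediction is the baseline plus its SHAP contributions."""
    return base + sum(shap_values)

base = expected_value([0.2, 0.4, 0.6, 0.8])
pred = reconstructed_prediction(base, [0.05, 0.03, 0.02])
```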
```python
for d in data:
    if "expected_value" in d:
        del d["expected_value"]
```
In a similar style to drill down.
@chukarsten Thank you for making the changes! I think one clean-up I'd like to do is remove the _include_expected_value parameter since it's not used by the forceplot code and it doesn't seem to change the output of the explainers.
I left a comment about adding some tests for pipelines that use OHE/Text Featurizers so that we check the features are aggregated. Let me know what you think about this change.
I also think there may be a bug with the jupyter display? The following code raises TypeError: Object of type Timestamp is not JSON serializable:
```python
from evalml.demos import load_fraud
from evalml.pipelines import BinaryClassificationPipeline
from evalml.model_understanding.force_plots import graph_force_plot

X, y = load_fraud(1000)
pipeline = BinaryClassificationPipeline(["Text Featurization Component", "DateTime Featurization Component",
                                         "One Hot Encoder", "Random Forest Classifier"])
pipeline.fit(X, y)
results = graph_force_plot(pipeline, rows_to_explain=[1],
                           training_data=X, y=y)
for result in results:
    for cls in result:
        print("Class:", cls)
        display(result[cls]["plot"])
```

I think the issue is that the input features are timestamps:
I say we don't block merge on that bug though, because it's coming from the display call. My hypothesis is that it might work if users use their own code to display the data from force_plot. I think we should file a spike issue to investigate further. Let me know what you think.
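One possible workaround for the serialization error, sketched here with a hypothetical helper (`make_json_safe` is not evalml API): coerce datetime-like values to ISO strings before anything calls `json.dumps` on the record. Since pandas `Timestamp` subclasses `datetime.datetime`, a plain `isinstance` check on `datetime` covers it.

```python
import json
from datetime import datetime

def make_json_safe(record):
    """Hypothetical workaround: convert datetime-like values (including
    pandas Timestamps, which subclass datetime) to ISO-8601 strings so
    json.dumps no longer raises TypeError."""
    return {k: (v.isoformat() if isinstance(v, datetime) else v)
            for k, v in record.items()}

row = {"amount": 24900, "datetime": datetime(2019, 1, 1, 0, 12, 26)}
json.dumps(make_json_safe(row))  # serializes without TypeError
```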
```python
try:
    # for positive class
    expected_value = explainer.expected_value[1]
except IndexError:
    expected_value = explainer.expected_value
```
Do we have test coverage for this logic?
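A minimal sketch of what such coverage could look like, assuming the fake explainer classes below (they are hypothetical stand-ins; real tests would use actual SHAP explainers): one case where `expected_value` holds a base value per class, and one where indexing `[1]` raises `IndexError` and the fallback is taken.

```python
def positive_class_expected_value(explainer):
    # Mirrors the try/except under review: binary classifiers expose one
    # base value per class, so take index 1 (positive class); with a
    # single base value, [1] raises IndexError and we fall back.
    try:
        return explainer.expected_value[1]
    except IndexError:
        return explainer.expected_value

class FakeBinaryExplainer:
    expected_value = [0.3, 0.7]   # base value per class

class FakeRegressionExplainer:
    expected_value = [1.5]        # single base value; [1] raises IndexError

assert positive_class_expected_value(FakeBinaryExplainer()) == 0.7
assert positive_class_expected_value(FakeRegressionExplainer()) == [1.5]
```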
dsherry left a comment
@chukarsten looking good to me :) left a couple questions
bchen1116 left a comment
Looking good! Left a few questions and clean-up comments.
```
        ]
    },
    {
        "cell_type": "code",
```
nitpick: empty cell
```python
for idx, result in enumerate(results):
    print("Row:", idx)
    for cls in result:
        print("Class:", cls)
```
How long does this take? Can we limit it to doing the force plots for only a few columns rather than all of them? Our docs already take a while to build.
```
    'shap_values': [0.05, 0.03, 0.02],
    'plot': AdditiveForceVisualizer}]

Raises:
```
Nice! I like this
```python
        y=y,
        matplotlib=False,
    )
    assert not initjs.called
```
What's the difference between this and initjs.assert_called()?
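For reference, here is a hedged sketch of how the two styles behave with `unittest.mock` (note the counterpart of `assert not initjs.called` is `assert_not_called()`, not `assert_called()`): `called` is a plain boolean attribute, so a bare `assert` works but fails with a generic message, while the mock's own `assert_*` helpers raise `AssertionError` with a descriptive message.

```python
from unittest.mock import MagicMock

initjs = MagicMock()

# `called` is a plain boolean attribute; a bare assert works but gives a
# generic failure message.
assert not initjs.called

# Equivalent check via the mock's helper, with a descriptive message on
# failure.
initjs.assert_not_called()

initjs()
assert initjs.called
initjs.assert_called()
```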
```python
for result in results:
    class_labels = result.keys()
    assert len(class_labels) == 1
```
Why do we assert len(class_labels) == 1 here? Does this imply that there's only one class we're doing this for on regression pipelines?


Addresses #1970
Single-row explanation force plots: