
1434 prediction explanations dataframe output #1781

Merged
merged 10 commits into main from 1434-prediction-explanations-dataframe-output on Feb 4, 2021

Conversation

freddyaboulton
Contributor

Pull Request Description

Fixes #1434


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@freddyaboulton self-assigned this Feb 2, 2021

codecov bot commented Feb 2, 2021

Codecov Report

Merging #1781 (32cb9c8) into main (30fac41) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@            Coverage Diff            @@
##             main    #1781     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         247      247             
  Lines       19573    19618     +45     
=========================================
+ Hits        19564    19609     +45     
  Misses          9        9             
Impacted Files Coverage Δ
...prediction_explanations/_report_creator_factory.py 100.0% <100.0%> (ø)
...tanding/prediction_explanations/_user_interface.py 100.0% <100.0%> (ø)
...nderstanding/prediction_explanations/explainers.py 100.0% <100.0%> (ø)
...s/prediction_explanations_tests/test_explainers.py 100.0% <100.0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@chukarsten (Collaborator) left a comment:

This looks good to me!

@@ -399,30 +491,51 @@ def test_output_format_checked():
@pytest.mark.parametrize("problem_type,output_format,answer,explain_predictions_answer,custom_index",
Collaborator:

This is an epic parameterization.

input_features, y_true = pd.DataFrame(data=range(15)), pd.Series(range(15))
with pytest.raises(ValueError, match="Parameter output_format must be either text or dict. Received foo"):
with pytest.raises(ValueError, match="Parameter output_format must be either text, dict, or dataframe. Received foo"):
explain_predictions_best_worst(None, input_features, y_true=y_true, output_format="foo")
Collaborator:

I think we can probably get rid of the foo and bar tests, right? They don't add anything that the "xml" one doesn't already cover.

Contributor Author:

100%

"quantitative_explanation": [0],
})
# Use side effect so that we always get a new copy of the dataframe
mock_make_table.side_effect = lambda *args, **kwargs: shap_table.copy()
Collaborator:
This is fancy and cool. I'm going to need to go figure out what this does.
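
For readers following along, here is a minimal sketch (not the actual test) of why side_effect is used instead of return_value; the shap_table contents below are made up:

from unittest.mock import MagicMock
import pandas as pd

# Hypothetical stand-in for the table returned by the patched helper.
shap_table = pd.DataFrame({"feature_names": ["a"], "quantitative_explanation": [0]})

mock_make_table = MagicMock()
# return_value would hand back the same object on every call, so columns added by
# the report code (rank, prefix, probabilities, ...) would leak into later calls.
# A side_effect lambda runs on each call and returns a fresh copy instead.
mock_make_table.side_effect = lambda *args, **kwargs: shap_table.copy()

first = mock_make_table()
first["rank"] = 0                      # mutate the first result
second = mock_make_table()
assert "rank" not in second.columns    # the next call is unaffected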

elif output_format == "dataframe":
# Check dataframes equal without caring about column order
assert sorted(best_worst_report.columns.tolist()) == sorted(answer.columns.tolist())
pd.testing.assert_frame_equal(best_worst_report, answer[best_worst_report.columns])
Collaborator:

Won't assert_frame_equal compare the columns as well? L597 might be redundant?

Contributor Author:

Yes but in the event best_worst_report has fewer columns than expected, the check will pass because answer[best_worst_report.columns] (needed to reorder the columns) will subset to the columns in best_worst_report.

The root cause is that I modify the index_id column at the beginning of the test to add the expected custom index and this unfortunately messes up the column order. I don't think there's a way to get assert_frame_equal to ignore the column order so this two-step check verifies the same columns are present in both and that they contain the same values. I think this is ok given how many different scenarios we're testing. The alternative would be to manually specify the answer for each test but that would be hairy.
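
A small illustration of that failure mode, with hypothetical column names, in case it helps future readers; only the shapes matter here:

import pandas as pd

answer = pd.DataFrame({"feature_names": ["a", "b"], "rank": [0, 0], "prefix": ["best", "best"]})
best_worst_report = answer[["feature_names", "rank"]].copy()  # simulate a report missing "prefix"

# Passes even though a column is missing, because answer[best_worst_report.columns]
# silently drops "prefix" from the expected frame before comparing.
pd.testing.assert_frame_equal(best_worst_report, answer[best_worst_report.columns])

# The explicit column comparison is what catches the missing column.
print(sorted(best_worst_report.columns.tolist()) == sorted(answer.columns.tolist()))  # False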

@freddyaboulton force-pushed the 1434-prediction-explanations-dataframe-output branch from f25c120 to cf35e57 on February 3, 2021 16:54
@dsherry (Contributor) left a comment:
@freddyaboulton this is looking great!

I left some comments and questions on the implementation. Nothing blocking, but it would be great to close out those conversations soon.

@@ -246,6 +253,10 @@ def make_text(self, *args, **kwargs):
def make_dict(self, *args, **kwargs):
"""Makes the report section formatted as a dictionary."""

@abc.abstractmethod
def make_dataframe(self, *args, **kwargs):
"""Makes the report section formatted as a dictionary."""
Contributor:

Do you mean "formatted as a dataframe"?

Contributor Author:

Yes! 😅

for key, value in heading.items():
if key == "probabilities":
for class_name, probability in value.items():
shap_table[f'label_{class_name}_probability'] = probability
Contributor:

Can we replace this for with

labels = [f'label_{class_name}_probability' for class_name in value.keys()]
shap_table[labels] = value.values()

I guess that code would rely on value.values() being ordered in the same way as value.keys(), and I don't see this set up as an OrderedDict, so maybe what you have is fine :)

Contributor Author:

Yeah, I think we could add another comprehension to make sure the order is the same, but at that point the main benefit of that implementation over this one is saving an indent. So I think I'll stick with this one!
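
For reference, a sketch of the comprehension-based variant being discussed, with made-up probability data; DataFrame.assign keeps each class name paired with its probability:

import pandas as pd

value = {"benign": 0.9, "malignant": 0.1}  # hypothetical "probabilities" entry from the heading
shap_table = pd.DataFrame({"feature_names": ["worst radius"], "feature_values": [22.1]})

# Build the label_<class>_probability columns in one expression.
prob_columns = {f"label_{class_name}_probability": probability
                for class_name, probability in value.items()}
shap_table = shap_table.assign(**prob_columns)

print(shap_table.columns.tolist())
# ['feature_names', 'feature_values', 'label_benign_probability', 'label_malignant_probability']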

shap_table['rank'] = heading['index']
shap_table['prefix'] = heading['prefix']
report.append(shap_table)
df = pd.concat(report).reset_index(drop=True)
Contributor:

So this code will create one df for each prediction, and then concatenate all the columns together horizontally? How do users know which columns correspond to which prediction?

Contributor Author:

To resolve: They'll use the prediction_number column we talked about above!

@@ -423,3 +451,25 @@ def make_dict(self, data):
data.input_features)["explanations"]
report.append(section)
return {"explanations": report}

def make_dataframe(self, data):
Contributor:

Is there some way we can have this implementation use _make_single_prediction_shap_table to make the tables per-prediction, and then concatenate them together? This would avoid confusion and bugs stemming from current or future differences in the implementations.

Contributor Author:

I think that's what we're doing! self.table_maker.make_dataframe(index, data.pipeline, data.input_features) calls _make_single_prediction_shap_table to make the shap table per prediction. Then we add metadata to each table (prediction probabilities, etc) and then we concat all the tables at the end.
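
In concrete terms, the flow is roughly the sketch below, where make_single_table is a hypothetical stand-in for _make_single_prediction_shap_table and the metadata values are made up:

import pandas as pd

def make_single_table(index):
    # Stand-in for the per-row shap table produced by the real helper.
    return pd.DataFrame({"feature_names": ["a", "b"], "quantitative_explanation": [index, -index]})

report = []
for rank, index in enumerate([0, 4, 9]):   # rows being explained
    shap_table = make_single_table(index)
    shap_table["rank"] = rank               # per-prediction metadata ("prediction_number" after the rename)
    shap_table["prefix"] = "best"
    report.append(shap_table)

# Stack the per-prediction tables vertically into the final report dataframe.
df = pd.concat(report).reset_index(drop=True)
print(df)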

"report = explain_predictions(pipeline=pipeline, input_features=X.iloc[[0, 4, 9]], include_shap_values=True)\n",
"print(report)"
"report = explain_predictions(pipeline=pipeline, input_features=X.iloc[[0, 4, 9]], include_shap_values=True,\n",
" output_format='dataframe')\n",
Contributor:

@freddyaboulton could you explain how to read the output here?

[Screenshot of the dataframe report output]

Each grouping of rows corresponds to a different prediction explanation, yes? And that's indicated by the "rank" column. What confuses me is that "rank" ranges from 0 to n-1 whereas the indices you passed to input_features are 0, 4 and 9. I'm concerned someone looking at the dataframe output will have trouble matching those row groups up with the original dataframe row that info corresponds to.

If I'm following correctly here, I suggest we either find a way to get the original dataframe index into this dataframe report, or we change our language a bit and call this "prediction number" instead of "rank". The latter option seems more appealing.

What do you think?

Contributor Author:

@dsherry You're spot on about how to read the table based on the grouping! I agree with the rename of rank to prediction_number. Adding the original dataframe index is possible but we don't do that in explain_prediction for the other output formats.
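
For anyone reading this thread later, a usage sketch of the new output format; pipeline and X are assumed to be the fitted pipeline and feature dataframe from the docs example, and the report is grouped by the prediction_number column discussed above:

from evalml.model_understanding.prediction_explanations import explain_predictions

report = explain_predictions(pipeline=pipeline, input_features=X.iloc[[0, 4, 9]],
                             include_shap_values=True, output_format='dataframe')

# Each group of rows is one prediction's explanation.
for prediction_number, explanation in report.groupby("prediction_number"):
    print(f"Explanation {prediction_number}")
    print(explanation)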

Contributor Author:

[Screenshot]

@@ -448,8 +448,18 @@
"source": [
"from evalml.model_understanding.prediction_explanations import explain_predictions\n",
@dsherry (Contributor) commented Feb 3, 2021:

@freddyaboulton I see the updates below are for the "Explaining Multiple Predictions" section. Should we also update the "Explaining Individual Predictions" section? I see that example still uses print(table)
[Screenshot of the "Explaining Individual Predictions" docs example]

Contributor Author:

I added a section on changing the output format for explain_prediction in the "Changing output formats" section!
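
For completeness, a sketch of what that docs section might show; this assumes explain_prediction accepts the same pipeline, input_features, and output_format arguments as explain_predictions, so check the rendered docs for the exact signature:

from evalml.model_understanding.prediction_explanations import explain_prediction

# Hypothetical call: explain a single row of X as a dataframe instead of text.
single_report = explain_prediction(pipeline=pipeline, input_features=X.iloc[1:2],
                                   output_format='dataframe')
print(single_report)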

@freddyaboulton added the enhancement (An improvement to an existing feature.) label Feb 4, 2021
@freddyaboulton merged commit 42e01f3 into main Feb 4, 2021
2 checks passed
@freddyaboulton deleted the 1434-prediction-explanations-dataframe-output branch February 4, 2021 15:12
@ParthivNaresh mentioned this pull request Feb 9, 2021
Successfully merging this pull request may close these issues.

Prediction explanations: output pandas dataframe