
Human Readable Pipeline Explanations #2861

Merged: 24 commits merged into main, Oct 5, 2021

Conversation

@eccabay (Contributor) commented Sep 29, 2021

Closes #2194. Design doc here

codecov bot commented Sep 29, 2021

Codecov Report

Merging #2861 (95c4040) into main (4d2fa96) will decrease coverage by 0.4%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #2861     +/-   ##
=======================================
- Coverage   99.8%   99.4%   -0.4%     
=======================================
  Files        305     307      +2     
  Lines      28373   28670    +297     
=======================================
+ Hits       28292   28471    +179     
- Misses        81     199    +118     
Impacted Files Coverage Δ
evalml/model_understanding/__init__.py 100.0% <100.0%> (ø)
evalml/model_understanding/feature_explanations.py 100.0% <100.0%> (ø)
evalml/tests/component_tests/test_utils.py 95.3% <100.0%> (ø)
...l_understanding_tests/test_feature_explanations.py 100.0% <100.0%> (ø)
evalml/tests/pipeline_tests/test_pipelines.py 99.8% <100.0%> (ø)
...ml/tests/component_tests/test_prophet_regressor.py 10.2% <0.0%> (-89.8%) ⬇️
...ponents/estimators/regressors/prophet_regressor.py 41.7% <0.0%> (-58.3%) ⬇️
...ests/automl_tests/test_automl_search_regression.py 98.6% <0.0%> (-1.4%) ⬇️
evalml/tests/component_tests/test_components.py 99.1% <0.0%> (-0.2%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d2fa96...95c4040. Read the comment docs.

@bchen1116 (Contributor) left a comment:
Left some comments and questions on things I think would help clarify this! Great work on this!

@angela97lin (Contributor) left a comment:

This looks amazing, well done! 👏

I left a few comments for cleanup. I'm also curious about max_features: I see in the docs that more than the default (5) features are considered detrimental? https://feature-labs-inc-evalml--2861.com.readthedocs.build/en/2861/user_guide/model_understanding.html#Human-Readable-Importance

Resolved review threads on evalml/model_understanding/feature_explanations.py (outdated).
)
from evalml.pipelines import BinaryClassificationPipeline

elasticnet_component_graph = [
A contributor commented:

Let's make this a pytest fixture so it doesn't run automatically unless we need it! :)
I'm also curious whether there are any other pipeline fixtures we can reuse, or if this one is necessary?

@eccabay (author) replied:
Made it into a fixture!

There's a Logistic Regression Classifier pytest fixture I tried to replace this with, but a woodwork-related bug prevents the code from running successfully :( so I'll leave this here for now, and we can consider swapping the pipeline out once the bug is fixed!
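The fixture change discussed above can be sketched roughly as follows. This is a minimal, stdlib-only illustration of the pattern, not the PR's actual test code: the component names and the contents of `elasticnet_component_graph` are placeholders, and in the real test suite the factory would carry pytest's `@pytest.fixture` decorator (omitted here so the sketch runs without pytest installed).

```python
# Sketch of moving a module-level pipeline definition into a fixture-style
# factory. Component names below are placeholders, not the PR's actual graph.
# In the real test file this function would be decorated with
# @pytest.fixture; it is left undecorated here so the sketch runs with the
# standard library alone.

def elasticnet_component_graph():
    # Building the graph inside a function means it is only constructed
    # when a test requests it, rather than at module import time.
    return [
        "Imputer",
        "One Hot Encoder",
        "Elastic Net Classifier",  # hypothetical component names
    ]

graph_a = elasticnet_component_graph()
graph_b = elasticnet_component_graph()

# Each call yields a fresh list, so no test can mutate shared state.
assert graph_a == graph_b
assert graph_a is not graph_b
```

The same reasoning is why the reviewer asked for a fixture: a module-level list is built on every import of the test module and is shared between tests, while a fixture is constructed lazily, per test.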

@chukarsten (Collaborator) left a comment:

I love this @eccabay . The testing is super thorough, the code is really clean. Really, quite marvellous. I think this exceeds what we were expecting from this issue and will really help bridge the gap between your more casual user and your more advanced user.

I left a few open ended questions and suggestions. Nothing particularly blocking. Take them or leave them. Again, great work.

from evalml.model_understanding import calculate_permutation_importance
from evalml.utils import get_logger, infer_feature_types


A collaborator commented:
It feels to me that the supreme convenience might be to promote the current readable_explanation() from a standalone function to an instance method on a pipeline, so that your workflow goes from:

from evalml.model_understanding.feature_explanations import readable_explanation

...
pl = ...blah...
pl.fit()
pl.predict()
readable_explanation(pl)

into

pl = ...blah...
pl.fit()
pl.predict()
pl.explain()

I feel like that's the ultimate convenience. Does it fit and make sense with our current API? I dunno. But as a relatively novice user compared to all of you, it strikes me as an easy way to get into explanations.
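A thin wrapper along the lines of this suggestion could look roughly like the sketch below. Everything here is a stdlib-only stand-in: `StubPipeline` and its `explain` method are hypothetical, and `readable_explanation` is a placeholder for evalml's actual standalone function, not its real implementation.

```python
# Hypothetical sketch of the suggested convenience method. StubPipeline
# stands in for an evalml pipeline; readable_explanation stands in for
# the standalone function added in this PR.

def readable_explanation(pipeline):
    # Placeholder for the real standalone function.
    return f"{pipeline.name}: most important features ..."

class StubPipeline:
    def __init__(self, name):
        self.name = name

    def explain(self):
        # The instance method is a thin delegation to the standalone
        # function, so both entry points stay in sync.
        return readable_explanation(self)

pl = StubPipeline("Elastic Net Pipeline")
assert pl.explain() == readable_explanation(pl)
```

One design note on delegation: keeping the standalone function as the single source of truth means the instance method costs almost nothing to add (or remove) later, which is consistent with the "not a big lift either way" sentiment in the replies below.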

A collaborator added:

Btw, not blocking. If you like it better as-is, no problemo.

A contributor commented:

Bahaha, I have thoughts but no opinions on this, buuut I will say that it's really not a big lift if we decide later down the line that we want one or the other so this is fine either way :))

Resolved review threads on evalml/model_understanding/feature_explanations.py (outdated).
@angela97lin (Contributor) left a comment:

Looks good, thank you for addressing everything! I replied to your comment about the circular import / reusing method and think it'd still be a good idea to use the method we already have. Otherwise, just left some non-blocking suggestions/comments. Excited for this to get in!! 😁

@eccabay eccabay merged commit c817e2b into main Oct 5, 2021
@eccabay eccabay deleted the 2194_human_readable branch October 5, 2021 18:50
@chukarsten chukarsten mentioned this pull request Oct 14, 2021
Successfully merging this pull request may close: "Generate human-readable descriptions of a pipeline's behavior" (#2194).

4 participants