Add one-way partial dependence and partial dependence plots #1079

Merged: 19 commits merged into main from 985_partial_dependencies_plot on Aug 24, 2020

@angela97lin commented Aug 19, 2020

Closes #985

scikit-learn's partial_dependence runs some validation to ensure that a fitted pipeline has been passed in. Unfortunately, that validation also checks for certain attributes to confirm that the input is a classifier or regressor. There are two options:

  1. Limit the supported pipelines to those available in scikit-learn (which excludes catboost and our baseline).
  2. Rename our _classes attribute to classes_ to match scikit-learn, then set _estimator_type on the input pipeline (and somehow revert it afterwards? I looked into context managers, but I'm not sure that's the right approach, since we'd be creating a new attribute rather than temporarily updating a pre-existing one).
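
For reference, the "set it and revert it" idea in option 2 can be sketched with a context manager that also handles the case where the attribute did not exist beforehand. This is only an illustrative sketch, not code from this PR: `temporary_attribute` and `FakePipeline` are hypothetical names, and the helper assumes the object stores its attributes in an instance `__dict__`.

```python
from contextlib import contextmanager


@contextmanager
def temporary_attribute(obj, name, value):
    """Set obj.<name> = value inside the block, then restore the prior state.

    Assumes obj keeps instance attributes in __dict__ (no __slots__).
    If the instance had no such attribute beforehand, the attribute is
    deleted on exit instead of being reset.
    """
    existed = name in vars(obj)
    previous = vars(obj).get(name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        if existed:
            setattr(obj, name, previous)
        else:
            delattr(obj, name)


class FakePipeline:
    """Hypothetical stand-in for a fitted pipeline (not an evalml class)."""


pipeline = FakePipeline()
with temporary_attribute(pipeline, "_estimator_type", "classifier"):
    inside = pipeline._estimator_type  # attribute is visible inside the block
after = hasattr(pipeline, "_estimator_type")  # removed again on exit
```

The `finally` clause means the attribute is cleaned up even if the wrapped call raises, which matters if the wrapped call is a validation step that may reject the input.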

For now, I've gone with option 1 in this PR, since creating a new attribute and temporarily setting it feels weird. However, this really limits what we can pass into partial_dependence, especially since it seems to work fine (tested briefly with catboost) once all of the checks pass. We also can't wrap the input pipeline with our own pipeline attribute, since a fitted pipeline is required.
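
The observation that catboost appears to work once the checks pass fits the usual definition of one-way partial dependence: for each grid value of one feature, force that feature to the value in every row and average the model's predictions. The computation needs only a prediction function. The following is a minimal pure-Python sketch under that definition. It is neither scikit-learn's implementation nor the code in this PR, and `one_way_partial_dependence` and `toy_predict` are hypothetical names.

```python
def one_way_partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over `rows` with one feature forced to each grid value.

    predict: callable taking a list of rows and returning a list of numbers.
    rows: list of feature lists.
    """
    averages = []
    for value in grid:
        # Copy every row, overwriting only the feature of interest.
        modified = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row in rows
        ]
        predictions = predict(modified)
        averages.append(sum(predictions) / len(predictions))
    return averages


def toy_predict(rows):
    # Hypothetical model: y = 2 * x0 + x1
    return [2 * r[0] + r[1] for r in rows]


rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
pd_values = one_way_partial_dependence(toy_predict, rows, 0, [0.0, 1.0])
print(pd_values)  # [3.0, 5.0]
```

Nothing here inspects the estimator's type, which is why restricting inputs because of attribute checks alone can feel overly strict.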

codecov bot commented Aug 20, 2020

Codecov Report

Merging #1079 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1079   +/-   ##
=======================================
  Coverage   99.89%   99.89%           
=======================================
  Files         192      192           
  Lines       10894    10981   +87     
=======================================
+ Hits        10883    10970   +87     
  Misses         11       11           

Impacted Files                                          Coverage Δ
evalml/model_understanding/__init__.py                  100.00% <ø> (ø)
evalml/model_understanding/graphs.py                    100.00% <100.00%> (ø)
...lml/tests/model_understanding_tests/test_graphs.py   100.00% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@angela97lin self-assigned this Aug 20, 2020
@angela97lin added this to the August 2020 milestone Aug 20, 2020
@angela97lin marked this pull request as ready for review Aug 20, 2020

@dsherry left a comment

@angela97lin looks good!

I have two change requests before merging:

evalml/model_understanding/graphs.py (4 review threads, outdated and resolved)

@freddyaboulton left a comment

@angela97lin Looks good to me! I have a minor comment on the user guide that is not blocking.

docs/source/user_guide/model_understanding.ipynb (1 review thread, outdated and resolved)
@angela97lin merged commit 52ea6d9 into main Aug 24, 2020
@angela97lin deleted the 985_partial_dependencies_plot branch Aug 24, 2020
@dsherry mentioned this pull request Aug 25, 2020
Linked issue: Add partial dependence plot