Add one-way partial dependence and partial dependence plots #1079

angela97lin · 2020-08-19T20:30:29Z

Closes #985

scikit-learn's partial_dependence checks that a fitted estimator has been passed in. Unfortunately, that same check also looks for certain attributes to confirm that the estimator is a classifier or regressor. There are two options:

1. Limit the supported pipelines to those whose estimators come from scikit-learn (not catboost or our baseline).
2. Rename our _classes attribute to classes_ to match scikit-learn, then set _estimator_type on the input pipeline to something (and somehow revert it afterwards? I tried looking into context managers, but I'm not sure that's the right approach since we'd be creating a new attribute, not temporarily updating a pre-existing one).

For now, I've updated this PR to go with option 1, since creating a new attribute and setting it temporarily feels weird. This really limits what we can pass into partial_dependence, though, especially since it seems to work fine (tested briefly with catboost) once all of the checks pass. We also can't wrap the input pipeline with our own pipeline attribute, since a fitted pipeline is required.
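For reference, the second option could be sketched with a context manager that also handles attributes that did not exist beforehand. This is only an illustration under assumptions, not what this PR implements: `temporary_attributes` and `FakePipeline` are hypothetical names, and the sketch assumes the pipeline stores instance attributes in a regular `__dict__`.

```python
from contextlib import contextmanager

# Sentinel marking "this attribute was not set on the instance before".
_MISSING = object()


@contextmanager
def temporary_attributes(obj, **attrs):
    """Set attributes on obj inside the with-block, then restore the old state.

    Attributes that did not exist on the instance beforehand are removed
    again on exit instead of being left behind.
    """
    previous = {name: vars(obj).get(name, _MISSING) for name in attrs}
    for name, value in attrs.items():
        setattr(obj, name, value)
    try:
        yield obj
    finally:
        for name, old in previous.items():
            if old is _MISSING:
                vars(obj).pop(name, None)  # attribute was new: delete it
            else:
                setattr(obj, name, old)  # attribute existed: restore it


class FakePipeline:
    """Hypothetical stand-in for a fitted pipeline (not an evalml class)."""

    def __init__(self):
        self._classes = [0, 1]


pipeline = FakePipeline()
with temporary_attributes(
    pipeline, _estimator_type="classifier", classes_=pipeline._classes
):
    # The attribute-based checks would pass here, so this is where the
    # call to scikit-learn's partial_dependence would go.
    print(pipeline._estimator_type, pipeline.classes_)

# Outside the block the pipeline is back to its original state.
print(hasattr(pipeline, "_estimator_type"))
```

The `finally` clause means the temporary attributes are cleaned up even if the wrapped call raises.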

codecov · 2020-08-20T06:04:55Z

Codecov Report

Merging #1079 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1079   +/-   ##
=======================================
  Coverage   99.89%   99.89%           
=======================================
  Files         192      192           
  Lines       10894    10981   +87     
=======================================
+ Hits        10883    10970   +87     
  Misses         11       11

| Impacted Files | Coverage Δ |
|---|---|
| evalml/model_understanding/__init__.py | `100.00% <ø> (ø)` |
| evalml/model_understanding/graphs.py | `100.00% <100.00%> (ø)` |
| ...lml/tests/model_understanding_tests/test_graphs.py | `100.00% <100.00%> (ø)` |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update 374945c...cd660b5.

dsherry

@angela97lin looks good!

I have two change requests before merging:

1. The graph is backwards! The axes need to be flipped. I noticed because the curve in the user guide example is not a function.
2. I found a way to get catboost working.
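On the first point: one-way partial dependence maps each feature value to a single averaged prediction, so feature values belong on the x-axis and partial dependence on the y-axis. A minimal brute-force sketch in pure Python illustrates this. It is not the code in graphs.py; `one_way_partial_dependence` and `toy_predict` are hypothetical names used only for illustration.

```python
def one_way_partial_dependence(predict, rows, feature_index, grid):
    """Return (feature values, averaged predictions) for one feature.

    For each grid value, the feature is overwritten with that value in every
    row, the model predicts on the modified rows, and the predictions are
    averaged. Each feature value therefore maps to exactly one output.
    """
    averages = []
    for value in grid:
        modified = [
            row[:feature_index] + [value] + row[feature_index + 1:]
            for row in rows
        ]
        predictions = predict(modified)
        averages.append(sum(predictions) / len(predictions))
    return list(grid), averages


def toy_predict(rows):
    """Hypothetical model: prediction = 2 * x0 + x1."""
    return [2 * row[0] + row[1] for row in rows]


rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
xs, ys = one_way_partial_dependence(
    toy_predict, rows, feature_index=0, grid=[0.0, 1.0, 2.0]
)
print(xs)  # feature values: plot on the x-axis
print(ys)  # partial dependence: plot on the y-axis -> [3.0, 5.0, 7.0]
```

Plotting `ys` against `xs` gives a curve that passes the vertical line test; swapping the two is what makes the plot look like it is not a function.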

evalml/model_understanding/graphs.py

freddyaboulton

@angela97lin Looks good to me! I have a minor comment on the user guide that is not blocking.

docs/source/user_guide/model_understanding.ipynb

angela97lin added 4 commits August 19, 2020 00:08

init

66370a9

cleaning up and beginning to add tests

5fd0ab9

revert classes_

20baeea

cleanup and add test

af55e26

angela97lin added 3 commits August 20, 2020 02:07

release note

ab49ea9

Merge branch 'main' into 985_partial_dependencies_plot

8e912f2

fix catboost

82044ce

angela97lin self-assigned this Aug 20, 2020

angela97lin added this to the August 2020 milestone Aug 20, 2020

add docs and cleanup

6bdbe0d

angela97lin marked this pull request as ready for review August 20, 2020 15:10

angela97lin requested review from dsherry, freddyaboulton, jeremyliweishih and bchen1116 August 20, 2020 15:10

Merge branch 'main' into 985_partial_dependencies_plot

e4209e6

dsherry approved these changes Aug 21, 2020

evalml/model_understanding/graphs.py (4 outdated comment threads, resolved)

angela97lin added 4 commits August 21, 2020 16:15

Merge branch 'main' into 985_partial_dependencies_plot

ae762ba

Merge branch 'main' into 985_partial_dependencies_plot

038ef0f

Merge branch 'main' into 985_partial_dependencies_plot

40fec57

fixing axes switched and clean up names

4a26299

freddyaboulton approved these changes Aug 24, 2020

docs/source/user_guide/model_understanding.ipynb (1 outdated comment thread, resolved)

angela97lin added 4 commits August 24, 2020 12:29

update for catboost

8ec3794

fix test

69ad560

fix catboost test

d85f36f

actually fixing test

812ca20

angela97lin mentioned this pull request Aug 24, 2020

Add a “cost-benefit curve” util method to graph objective score vs binary classification threshold #1081

Merged

angela97lin added 2 commits August 24, 2020 13:55

cleanup and add check for if pipeline is fitted

019cb19

fix test

cd660b5

angela97lin merged commit 52ea6d9 into main Aug 24, 2020

angela97lin deleted the 985_partial_dependencies_plot branch August 24, 2020 20:53

dsherry mentioned this pull request Aug 25, 2020

Release v0.13.1 #1101

Merged
