Skip to content

Non Numeric Bug with Partial Dependence#1748

Merged
chukarsten merged 12 commits intomainfrom
1509-part_dep_non_numeric_bug
Jan 29, 2021
Merged

Non Numeric Bug with Partial Dependence#1748
chukarsten merged 12 commits intomainfrom
1509-part_dep_non_numeric_bug

Conversation

@chukarsten
Copy link
Contributor

fixes #1509

@chukarsten chukarsten force-pushed the 1509-part_dep_non_numeric_bug branch 2 times, most recently from 927864f to f8cf441 Compare January 27, 2021 19:20
@codecov
Copy link

codecov bot commented Jan 27, 2021

Codecov Report

Merging #1748 (358a94b) into main (dd90a42) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@            Coverage Diff            @@
##             main    #1748     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         247      247             
  Lines       19571    19598     +27     
=========================================
+ Hits        19562    19589     +27     
  Misses          9        9             
Impacted Files Coverage Δ
evalml/model_understanding/graphs.py 99.8% <100.0%> (+0.1%) ⬆️
...lml/tests/model_understanding_tests/test_graphs.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dd90a42...358a94b. Read the comment docs.

@chukarsten chukarsten force-pushed the 1509-part_dep_non_numeric_bug branch 2 times, most recently from 92b35f9 to 05ee3bd Compare January 28, 2021 16:07
@chukarsten chukarsten marked this pull request as ready for review January 28, 2021 16:07
Copy link
Contributor

@ParthivNaresh ParthivNaresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good, just wanted to clarify the docstring wording

If features is a tuple of int/strings, it must contain valid column integers/names in X.
grid_resolution (int): Number of samples of feature(s) for partial dependence plot
grid_resolution (int): Number of samples of feature(s) for partial dependence plot. If this value
does not exceed the maximum number of categories present in categorical data within X, it will be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this number does not exceed or if this number does exceed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the wording could be clarified a bit. If grid_resolution is less than the maximum number of..etc`

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good 😁

I think we can clean up the impl and test a bit but thank you for fixing this!!

partial_dependence(pipeline, X, features=(0, 'b'))


def test_partial_dependence_more_categories_than_grid_resolution(logistic_regression_binary_pipeline_class):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion but maybe we could somehow make it more clear that the number of categories is greater than the grid resolution, whether that's by setting grid_resolution specifically to a number and then asserting that there are currency has more than that number of categories, which is why we trigger this new code? Or alternatively we could have a tiny dataset like the previous test and set grid_resolution to something even tinier?

@chukarsten chukarsten force-pushed the 1509-part_dep_non_numeric_bug branch from 05ee3bd to 297d622 Compare January 29, 2021 15:45
Copy link
Contributor

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chukarsten Looks good to me! Thanks for the fix 😄

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you for updating the impl and tests 😁

@chukarsten chukarsten merged commit d1a89af into main Jan 29, 2021
@chukarsten chukarsten mentioned this pull request Feb 1, 2021
@freddyaboulton freddyaboulton deleted the 1509-part_dep_non_numeric_bug branch May 13, 2022 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Partial dependence fails on non-numeric columns from featuretool's retail dataset

4 participants