-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Two-Way Dependence Plots #1690
Two-Way Dependence Plots #1690
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1690 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 242 242
Lines 19174 19273 +99
=========================================
+ Hits 19166 19265 +99
Misses 8 8
Continue to review full report at Codecov.
|
b5eae5d
to
fb3121f
Compare
c7aae1a
to
347af36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chukarsten Looks great! The only thing I think we should resolve before merge is how we should handle two-way plots for multiclass problems.
evalml/model_understanding/graphs.py
Outdated
data['class_label'] = np.repeat(classes, len(values[0])) | ||
elif isinstance(features, (list, tuple)): | ||
if len(features) == 2: | ||
data = pd.DataFrame(avg_pred[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would only return the partial dependence for the first class for multiclass problems. I think we should either modify this to return the grid for all of the classes or raise an exception that two-way plots are not allowed for multiclass problems.
I don't think modifying this to return the values for all classes would be a heavy lift. I think doing something like this:
df = pd.DataFrame(avg_pred.reshape((-1, avg_pred.shape[-1])))
df.columns = values[1]
df.index = np.tile(values[0], avf_pred.shape[0])
df['classes'] = np.repeat(classes, len(values[0]))
should get us most of the way there (though I didn't test 😬 ). The only thing left to modify is how the data is passed to the contour plot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm legitimately mystified by numpy's reshape function and even more by people that can so intuitively use it. Ok, cool, I'll implement this and try and see if I can follow the one-way multi-class example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great! Added a few minor comments but I think Freddy's comments are solid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Angela and Freddy, but LGTM otherwise!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chukarsten Thanks for adding support for 2-way for multiclass! Tests are solid!
""" | ||
X = _convert_to_woodwork_structure(X) | ||
X = _convert_woodwork_types_wrapper(X.to_dataframe()) | ||
|
||
if isinstance(features, (list, tuple)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome
evalml/model_understanding/graphs.py
Outdated
@@ -534,21 +580,43 @@ def graph_partial_dependence(pipeline, X, feature, class_label=None, grid_resolu | |||
# Don't specify share_xaxis and share_yaxis so that we get tickmarks in each subplot | |||
fig = _subplots.make_subplots(rows=rows, cols=cols, subplot_titles=class_labels) | |||
for i, label in enumerate(class_labels): | |||
|
|||
# Plotly trace indexing begins at 1 so we add 1 to i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit-pick: This comment applies to the col=(i % 2) + 1
expression so I think we move it closer to that or get rid of it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!!! Good call. I must have a bunch of stuff missing in here but I definitely have PR fatigue on this one.
… the two way part dep calculated, just trying to integrated it into the plots.
…alize this and make sure it makes sense. Also not sure x and y line up.
dc23895
to
968d4bd
Compare
… the two way part dep calculated, just trying to integrated it into the plots.
Pull Request Description
(replace this text with your description)
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rst
to include this pull request by adding :pr:123
.