Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two-Way Dependence Plots #1690

Merged
merged 17 commits into from
Jan 25, 2021
Merged

Conversation

chukarsten
Copy link
Contributor

… the two way part dep calculated, just trying to integrated it into the plots.

Pull Request Description

(replace this text with your description)


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@codecov
Copy link

codecov bot commented Jan 13, 2021

Codecov Report

Merging #1690 (968d4bd) into main (70c42b1) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@            Coverage Diff            @@
##             main    #1690     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         242      242             
  Lines       19174    19273     +99     
=========================================
+ Hits        19166    19265     +99     
  Misses          8        8             
Impacted Files Coverage Δ
evalml/model_understanding/graphs.py 99.8% <100.0%> (+0.1%) ⬆️
...lml/tests/model_understanding_tests/test_graphs.py 100.0% <100.0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 70c42b1...968d4bd. Read the comment docs.

@chukarsten chukarsten force-pushed the 1076-add_2way_partial_dependence_plots branch 2 times, most recently from b5eae5d to fb3121f Compare January 19, 2021 20:00
@chukarsten chukarsten changed the title Initial commit implementing the two-way partial dependence graph. Got… Two-Way Dependence Plots Jan 19, 2021
@chukarsten chukarsten force-pushed the 1076-add_2way_partial_dependence_plots branch from c7aae1a to 347af36 Compare January 20, 2021 14:24
@chukarsten chukarsten marked this pull request as ready for review January 20, 2021 16:46
@chukarsten
Copy link
Contributor Author

Unknown

Copy link
Contributor

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chukarsten Looks great! The only thing I think we should resolve before merge is how we should handle two-way plots for multiclass problems.

evalml/model_understanding/graphs.py Outdated Show resolved Hide resolved
data['class_label'] = np.repeat(classes, len(values[0]))
elif isinstance(features, (list, tuple)):
if len(features) == 2:
data = pd.DataFrame(avg_pred[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would only return the partial dependence for the first class for multiclass problems. I think we should either modify this to return the grid for all of the classes or raise an exception that two-way plots are not allowed for multiclass problems.

I don't think modifying this to return the values for all classes would be a heavy lift. I think doing something like this:

df = pd.DataFrame(avg_pred.reshape((-1, avg_pred.shape[-1])))
df.columns = values[1]
df.index = np.tile(values[0], avf_pred.shape[0])
df['classes'] = np.repeat(classes, len(values[0]))

should get us most of the way there (though I didn't test 😬 ). The only thing left to modify is how the data is passed to the contour plot.

Copy link
Contributor Author

@chukarsten chukarsten Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm legitimately mystified by numpy's reshape function and even more by people that can so intuitively use it. Ok, cool, I'll implement this and try and see if I can follow the one-way multi-class example.

evalml/model_understanding/graphs.py Show resolved Hide resolved
evalml/model_understanding/graphs.py Show resolved Hide resolved
evalml/tests/model_understanding_tests/test_graphs.py Outdated Show resolved Hide resolved
Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! Added a few minor comments but I think Freddy's comments are solid.

evalml/model_understanding/graphs.py Outdated Show resolved Hide resolved
Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Angela and Freddy, but LGTM otherwise!

@chukarsten
Copy link
Contributor Author

Unknown-3

Copy link
Contributor

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chukarsten Thanks for adding support for 2-way for multiclass! Tests are solid!

"""
X = _convert_to_woodwork_structure(X)
X = _convert_woodwork_types_wrapper(X.to_dataframe())

if isinstance(features, (list, tuple)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome

@@ -534,21 +580,43 @@ def graph_partial_dependence(pipeline, X, feature, class_label=None, grid_resolu
# Don't specify share_xaxis and share_yaxis so that we get tickmarks in each subplot
fig = _subplots.make_subplots(rows=rows, cols=cols, subplot_titles=class_labels)
for i, label in enumerate(class_labels):

# Plotly trace indexing begins at 1 so we add 1 to i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit-pick: This comment applies to the col=(i % 2) + 1 expression so I think we move it closer to that or get rid of it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!!! Good call. I must have a bunch of stuff missing in here but I definitely have PR fatigue on this one.

@chukarsten chukarsten force-pushed the 1076-add_2way_partial_dependence_plots branch from dc23895 to 968d4bd Compare January 25, 2021 17:58
@chukarsten chukarsten merged commit 1f8779b into main Jan 25, 2021
@chukarsten chukarsten deleted the 1076-add_2way_partial_dependence_plots branch January 25, 2021 19:02
@bchen1116 bchen1116 mentioned this pull request Jan 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants