
Preserve ww schema in partial dependence#2929

Merged
freddyaboulton merged 4 commits into main from preserve-ww-in-part-dep
Oct 18, 2021

Conversation

@freddyaboulton
Contributor

Pull Request Description

Fixes #2928


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@codecov

codecov Bot commented Oct 15, 2021

Codecov Report

Merging #2929 (74d87f9) into main (7d97020) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #2929     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        302     302             
  Lines      28396   28412     +16     
=======================================
+ Hits       28303   28319     +16     
  Misses        93      93             
| Impacted Files | Coverage Δ |
|---|---|
| evalml/model_understanding/_partial_dependence.py | 98.8% <100.0%> (+0.1%) ⬆️ |
| ...del_understanding_tests/test_partial_dependence.py | 99.3% <100.0%> (+0.1%) ⬆️ |


Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d97020...74d87f9. Read the comment docs.

Contributor

@chukarsten chukarsten left a comment


Awesome work, Freddy! Just a quick question about the copy in partial dependence, but nothing blocking.

  prediction_method = pipeline.predict_proba

- X_eval = X.copy()
+ X_eval = X.ww.copy()
Contributor


Is there a reason we copy this first rather than just build a new df by concatting the series? I've seen this a few times but never had the courage to ask :)

Contributor Author


The partial dependence computation requires us to fill a given feature with only one value while keeping all other features the same. To avoid overwriting the user's original data, I think we need a copy. Since we need all the features to be present in the data, I think concatting all the features would be equivalent to a copy + modify operation (and use the same amount of memory).
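
For readers following along, the copy-then-fill step described in this reply can be sketched roughly as below. This is a minimal pure-Python illustration, not evalml's actual implementation: `partial_dependence`, `toy_predict`, the dict-based rows, and the grid values are all hypothetical stand-ins for the pipeline and the Woodwork-typed DataFrame used in the PR.

```python
import copy
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`.

    Hypothetical helper for illustration only; rows are plain dicts.
    """
    averages = []
    for value in grid:
        # Work on a copy so the caller's original rows are never modified.
        rows_eval = copy.deepcopy(rows)
        for row in rows_eval:
            # Fill the chosen feature with a single value; every other
            # feature keeps its original value.
            row[feature] = value
        averages.append(mean([predict(row) for row in rows_eval]))
    return averages


def toy_predict(row):
    # Hypothetical stand-in for a fitted pipeline's prediction method.
    return 2.0 * row["age"] + row["income"]


data = [
    {"age": 1.0, "income": 10.0},
    {"age": 3.0, "income": 20.0},
]
pd_age = partial_dependence(toy_predict, data, "age", [0.0, 5.0])
print(pd_age)   # [15.0, 25.0]
print(data[0])  # original rows are unchanged: {'age': 1.0, 'income': 10.0}
```

Because every feature has to be present for the model to predict, the evaluation data is a full copy either way, which is the point made above about concatenation not saving memory.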

Contributor

@eccabay eccabay left a comment


Solid, LGTM!

@freddyaboulton freddyaboulton merged commit d83e074 into main Oct 18, 2021
@freddyaboulton freddyaboulton deleted the preserve-ww-in-part-dep branch October 18, 2021 17:08
freddyaboulton added a commit to freddyaboulton/evalml that referenced this pull request Oct 18, 2021
* Preserve ww schema

* Fix index

* Add to release notes
@chukarsten chukarsten mentioned this pull request Oct 27, 2021

Development

Successfully merging this pull request may close these issues.

Partial dependence fails when categorical column gets typed as natural language by user

3 participants