Preserve ww schema in partial dependence#2929
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2929 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 302 302
Lines 28396 28412 +16
=======================================
+ Hits 28303 28319 +16
Misses 93 93
Continue to review full report at Codecov.
|
2bb1f74 to
d96ee9f
Compare
chukarsten
left a comment
There was a problem hiding this comment.
Awesome work, Freddy! Just a quick question about the copy in partial dependence, but nothing blocking.
| prediction_method = pipeline.predict_proba | ||
|
|
||
| X_eval = X.copy() | ||
| X_eval = X.ww.copy() |
There was a problem hiding this comment.
Is there a reason we copy this first rather than just build a new df by concatting the series? I've seen this a few times but never had the courage to ask :)
There was a problem hiding this comment.
The partial dependence computation requires us to fill a given feature will only one value while keeping all other features the same. To not override the user's original data, I think we need a copy. Since we need all the features to be present in the data, I think concatting all the features will be equivalent to a copy + modify operation (and result in the same memory).
* Preserve ww schema * Fix index * Add to release notes
Pull Request Description
Fixes #2928
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rstto include this pull request by adding :pr:123.