Fix Imputer so that it does not erase ww info during transform #2752

Merged: freddyaboulton merged 2 commits into main from 2751-imputer-erases-ww-info on Sep 8, 2021

Conversation

freddyaboulton
Contributor

Pull Request Description

Fixes #2751


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

freddyaboulton self-assigned this Sep 7, 2021

codecov bot commented Sep 7, 2021

Codecov Report

Merging #2752 (b84057a) into main (411a0c1) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2752     +/-   ##
=======================================
+ Coverage   99.9%   99.9%   +0.1%     
=======================================
  Files        301     301             
  Lines      27827   27840     +13     
=======================================
+ Hits       27778   27791     +13     
  Misses        49      49             

| Impacted Files | Coverage Δ |
|---|---|
| ...elines/components/transformers/imputers/imputer.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_imputer.py | 100.0% <100.0%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 411a0c1...b84057a.

Contributor

chukarsten left a comment

Nice! Thanks Freddy!

Contributor

angela97lin left a comment

Looks good, thanks for filing and fixing this!

This, along with the few other bugs that have popped up re ww typing, makes me wonder what we could do to make this easier--right now, our components could have potentially different behavior if they weren't fully updated to use ww like the imputer was. No real action, just food for thought 😅

freddyaboulton
Contributor Author

@angela97lin Agreed that it's not comforting to uncover these bugs one-by-one. I will say that the reason the Imputer did not go through the ww accessor prior to handing the data to the simple imputer is that it didn't need to! The ww type inference used to be almost entirely dependent on the pandas dtype, so the simple imputer inference would match the Imputer logical types. Maybe it was always a bug waiting to happen, but this implementation got us pretty far.

What I'm saying is that as ww inference evolves and matures, so will our code, because new corner cases that couldn't exist before will begin to exist.

I think it could help if we always went through the accessor and never "broke" the schema. I think #2744 can help with that.
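
To make the failure mode discussed in this thread concrete, here is a minimal toy sketch in plain Python. It deliberately does not use the woodwork or evalml APIs: `TypedTable`, `infer_logical_type`, `most_frequent`, and both `impute_*` functions are hypothetical names invented for illustration, and the inference rule is an assumption chosen only to show the effect. The idea is that a transform which rebuilds its output lets type inference run again, so a user-specified logical type survives only if the schema is passed through explicitly.

```python
# Toy model only: hypothetical stand-ins, not the woodwork or evalml API.
from dataclasses import dataclass, field


def infer_logical_type(values):
    """Guess a logical type from the values alone, ignoring user overrides."""
    present = [v for v in values if v is not None]
    if present and all(isinstance(v, (int, float)) for v in present):
        return "Double"
    return "Categorical"


@dataclass
class TypedTable:
    """Columns of data plus an explicit logical type per column."""
    columns: dict
    logical_types: dict = field(default_factory=dict)

    def __post_init__(self):
        # Columns without an explicit type fall back to inference.
        for name, values in self.columns.items():
            self.logical_types.setdefault(name, infer_logical_type(values))


def most_frequent(values):
    """Return the most common non-missing value in a column."""
    present = [v for v in values if v is not None]
    return max(set(present), key=present.count)


def fill_missing(table):
    """Replace None in every column with that column's most frequent value."""
    return {
        name: [most_frequent(col) if v is None else v for v in col]
        for name, col in table.columns.items()
    }


def impute_dropping_schema(table):
    """Rebuild the table from raw values, so types are re-inferred."""
    return TypedTable(fill_missing(table))


def impute_keeping_schema(table):
    """Carry the original logical types through to the output."""
    return TypedTable(fill_missing(table), dict(table.logical_types))


# The user explicitly marks a numeric-looking column as Categorical.
X = TypedTable(
    {"zip_code": [94107, None, 94107, 10001]},
    {"zip_code": "Categorical"},
)
lost = impute_dropping_schema(X)
kept = impute_keeping_schema(X)
print(lost.logical_types)  # {'zip_code': 'Double'}
print(kept.logical_types)  # {'zip_code': 'Categorical'}
```

While inference and the underlying dtype agree, both functions return the same types, which is why this kind of bug can stay hidden until inference rules change or a user overrides a type.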

freddyaboulton merged commit e3ec96f into main Sep 8, 2021
freddyaboulton deleted the 2751-imputer-erases-ww-info branch September 8, 2021 18:57
chukarsten mentioned this pull request Sep 10, 2021
Development

Successfully merging this pull request may close these issues.

Imputer.transform erases woodwork typing information
3 participants