-
Notifications
You must be signed in to change notification settings - Fork 885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates for pandas 1.5.0 compatibility #2291
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2291 +/- ##
==========================================
- Coverage 99.32% 99.31% -0.02%
==========================================
Files 148 148
Lines 17706 17706
==========================================
- Hits 17586 17584 -2
- Misses 120 122 +2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
From what I can tell, the change in coverage is due a change in an error raised by pandas. What was previously an |
@@ -653,8 +653,8 @@ class Lag(TransformPrimitive): | |||
You can specify the number of periods to shift the values | |||
|
|||
>>> lag_periods = Lag(periods=3) | |||
>>> lag_periods(["hello", "world", "test", "foo", "bar"], pd.Series(pd.date_range(start="2020-01-01", periods=5, freq='D'))).tolist() | |||
[nan, nan, nan, 'hello', 'world'] | |||
>>> lag_periods([True, False, False, True, True], pd.Series(pd.date_range(start="2020-01-01", periods=5, freq='D'))).tolist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a string lag and a boolean lag in the example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I'm not sure it's so simple: #2291 (comment)
If we added a string lag, the doctest wouldn't pass for all of the pandas version we currently support because of the None
vs nan
output difference. Unless we added some extra conversion to the output to make sure we always get the same missing value representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to point out, that once Woodwork gets initialized in the feature matrix we will always end up with the same missing value representation, but since that doesn't happen at the primitive function level, this difference in output will be present when calling it like we do in the doctest examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Let's put a note in the docstring about this OR comment in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment: b8e5cfe
Various changes needed for compatibility with pandas 1.5.0.