Updates for pandas 1.5.0 compatibility#2291
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2291 +/- ##
==========================================
- Coverage 99.32% 99.31% -0.02%
==========================================
Files 148 148
Lines 17706 17706
==========================================
- Hits 17586 17584 -2
- Misses 120 122 +2
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
|
From what I can tell, the change in coverage is due a change in an error raised by pandas. What was previously an |
| >>> lag_periods = Lag(periods=3) | ||
| >>> lag_periods(["hello", "world", "test", "foo", "bar"], pd.Series(pd.date_range(start="2020-01-01", periods=5, freq='D'))).tolist() | ||
| [nan, nan, nan, 'hello', 'world'] | ||
| >>> lag_periods([True, False, False, True, True], pd.Series(pd.date_range(start="2020-01-01", periods=5, freq='D'))).tolist() |
There was a problem hiding this comment.
Can we have a string lag and a boolean lag in the example?
There was a problem hiding this comment.
We could, but I'm not sure it's so simple: #2291 (comment)
If we added a string lag, the doctest wouldn't pass for all of the pandas version we currently support because of the None vs nan output difference. Unless we added some extra conversion to the output to make sure we always get the same missing value representation.
There was a problem hiding this comment.
Just want to point out, that once Woodwork gets initialized in the feature matrix we will always end up with the same missing value representation, but since that doesn't happen at the primitive function level, this difference in output will be present when calling it like we do in the doctest examples.
There was a problem hiding this comment.
Ah I see. Let's put a note in the docstring about this OR comment in the code
Various changes needed for compatibility with pandas 1.5.0.