-
Notifications
You must be signed in to change notification settings - Fork 909
Adding IsYearEnd and IsYearStart primitives #2124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2124 +/- ##
=======================================
Coverage 99.21% 99.21%
=======================================
Files 143 143
Lines 16837 16869 +32
=======================================
+ Hits 16705 16737 +32
Misses 132 132
Continue to review full report at Codecov.
|
|
|
||
| def test_is_year_end(): | ||
| is_year_end = IsYearEnd() | ||
| dates = pd.Series([datetime(2020, 12, 31), datetime(2020, 1, 1)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if there is a NaN?
| >>> from datetime import datetime | ||
| >>> dates = [datetime(2019, 12, 31), | ||
| ... datetime(2019, 1, 1), | ||
| ... datetime(2019, 11, 30)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we include an example with NaN?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure we did this with the previous Datetime primitives we merged in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question regarding the behavior of primitives with NaN:
If I use the Month primitive on the following data:
pd.Series([datetime(2020, 3, 3), np.nan])
I get [3.0, nan]
If I pass pd.Series([np.nan]) in to the Month primitive, I get an error.
I get analogous behavior for Year and the like. I just wanted to confirm that this is indeed the correct expected behavior before writing the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you may have found a bug :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should always expect the series dtype for Datetime columns to be datetime64[ns] based on the behavior of Woodwork, which enforces that dtype for the series.
If you don't otherwise specify the dtype a series of pd.Series([np.nan]) will have a dtype of float64 and you can't perform datetime operations on a float64 column.
I haven't tested this out, but I think if you convert pd.Series([np.nan]) to a datetime column first, the Month primitive will no longer error. If you still get an error at that point, then I agree with @dvreed77 there is a bug we need to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I confirmed this doesn't error
s = pd.Series([np.nan]).astype('datetime64[ns]')
ft.primitives.Month()(s)…eryx/featuretools into more-year-and-day-primitives
| Examples: | ||
| >>> from datetime import datetime | ||
| >>> dates = [datetime(2019, 12, 31), | ||
| ... datetime(2019, 1, 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbadithe can you put a NaN example in the doc string?
dvreed77
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Adding the following primitives: IsYearEnd, IsYearStart
Fixes #2061
Fixes #2060