Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add isocalendar accessor to DatetimeIndex and Series.dt #33220

Merged
merged 17 commits into from
Apr 12, 2020

Conversation

mgmarino
Copy link
Contributor

@mgmarino mgmarino commented Apr 1, 2020

This PR adds the the isocalendar property to DatetimeIndex and the corresponding Series.dt accessor. The property returns a DataFrame, e.g.:

>>> idx = pd.date_range(start='2019-12-29', freq='D', periods=4)
>>> idx.isocalendar
   year  week  day
0  2019    52    7
1  2020     1    1
2  2020     1    2
3  2020     1    3

and

>>> pandas.to_datetime(pandas.Series(["2020-01-01"])).dt.isocalendar
   year  week  day
0  2020     1    
>>> pandas.to_datetime(pandas.Series(["2019-12-31"])).dt.isocalendar.week
0    1
Name: week, dtype: int32

The behavior is consistent with Timestamp.isocalendar and datetime.date.isocalendar.

For more information about ISO 8601 calendar, see e.g. https://en.wikipedia.org/wiki/ISO_week_date.

Address GH33206

Note that I am very happy to rename the field, but I wanted to go ahead and open up the PR.

@mgmarino mgmarino force-pushed the add-year-for-week-of-year branch 2 times, most recently from 6463c51 to 18b7fef Compare April 1, 2020 21:11
@WillAyd
Copy link
Member

WillAyd commented Apr 1, 2020

Thanks for the PR. I agree the naming is a little weird, though do we need a dedicated property for this? Might be more natural to just convert to the appropriate period and still use the .year attribute therefrom

@jbrockmendel @mroeschke

@WillAyd WillAyd added Needs Discussion Requires discussion from core team before further action Datetime Datetime data dtype labels Apr 1, 2020
@mroeschke
Copy link
Member

Agreed with @WillAyd.

Since Timestamp has an isocalendar method from the standard library, I wouldn't be opposed to a DatetimeIndex.isocalendar or Series.dt.isocalendar method that converts the dates to returns a 3-tuple, (ISO year, ISO week number, ISO weekday).

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 2, 2020

@WillAyd @mroeschke Thanks for the comments!

I agree that the isocalendar method approach is cleaner and more symmetric with Timestamp, this was slightly easier for me to get my feet wet with the code. An additional advantage of this is that it makes it clearer that all values are calculated using ISO (8601).

Regarding whether this functionality is needed at all, I think if weekofyear is provided (which is currently the ISO week), then there absolutely should to be a way to get the year that corresponds to it.

@mroeschke
Copy link
Member

I'd say if we include isocalendar (and think of a nice API to easily get the .year, .week, etc), we could deprecate weekofyear (which datetime.datetime doesn't have).

Generally, I believe we should align all .dt attributes/methods to what exists in datetime.datetime)

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 2, 2020

Yes, I think a series of tuples is not completely user friendly. What about like what's done in .dt.components here, i.e. it returns a frame?

isocalendar could be a property returning a frame (with three columns year, week, day), and then the respective series could simply be accessed like:

.dt.isocalendar.year
.dt.isocalendar.week
.dt.isocalendar.day

What do you think?

Re the deprecation of weekofyear, I think that would be a good idea. I get the impression that there's confusion around it by looking through old issues. By access week through isocalendar, it would I think reduce confusion.

@mroeschke
Copy link
Member

+1 to that API @mgmarino. Interested in working on both the isocalendar addition and deprecation of weekofyear (and if there's additional attributes like this)?

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 2, 2020

Definitely! I'll need a bit of a hand with the Deprecation process, but I think I'm pretty clear how to implement isocalendar. The deprecation should probably be done in a separate PR anyways.

For the isocalendar changes, shall I just keep this PR (to have the thread), with an appropriate rename?

@mroeschke
Copy link
Member

The deprecation should probably be done in a separate PR anyways.

Sure sounds good.

For the isocalendar changes, shall I just keep this PR (to have the thread), with an appropriate rename?

Definitely. Feel free to retool this PR (and original issue) about adding isocalendar

@mgmarino mgmarino changed the title ENH: Add year for week of year ENH: Add isocalendar accessor to DatetimeIndex and Series.dt Apr 3, 2020
@mgmarino mgmarino force-pushed the add-year-for-week-of-year branch 2 times, most recently from bd86caf to 90cb3c1 Compare April 3, 2020 21:48
@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 4, 2020

FYI, the failing tests seem to be due to visual studio not liking the ctuple return. I’ll still need some time to investigate a fix and/or rewrite to avoid the ctuple.

@mroeschke mroeschke added Enhancement and removed Needs Discussion Requires discussion from core team before further action labels Apr 4, 2020
@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 4, 2020

@mroeschke

Ok, I'm pretty confident that the build will now succeed (it succeeded on my local Azure pipelines), but here's the news:

So, I would understand if this feature is not important enough to do this, so I'm happy to have your feedback, if I should:

  • leave as is,
  • bump the dependency version upgrade out to a separate PR.
  • rework the internal code to avoid using ctuple.
  • other?

Thanks for your help on this!

@mroeschke
Copy link
Member

I think it's okay to bump Cython since it's a build dependency cc @jbrockmendel.

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 6, 2020

Current comments addressed, failing doc build seems to be handled with #33309.

@mgmarino mgmarino force-pushed the add-year-for-week-of-year branch 2 times, most recently from 6f41853 to 60d7088 Compare April 6, 2020 14:42
@jbrockmendel
Copy link
Member

i think we're OK to bump cython to the newest bugfix release


Examples
--------
>>> pd.to_datetime(pd.Series(["2020-01-01"])).dt.isocalendar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just create one datetime Series in the beginning and just reference that in the other examples:

>>> ser = pd.Series(pd.to_datetime(["2020-01-01", "NaT"]))
>>> ser
...

>>>  ser.dt.isocalendar
>>>  ser.dt.isocalendar.week

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 8, 2020

Rebased, happy to hear any further comments!

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments. can you add some prose in timeseries.rst about this (basically just showing it, add a versionadded 1.1 tag). ping on green.


sarray = fields.build_isocalendar_sarray(self.asi8)
iso_calendar_df = DataFrame(
sarray, columns=["year", "week", "day"], dtype="Int64"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use UInt32 (as that's the default for all calendar component ops)

with nogil:
for i in range(count):
if dtindex[i] == NPY_NAT:
ret_val = -1, -1, -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make these return 0

>>> ser.dt.isocalendar.week
0 53
1 <NA>
Name: week, dtype: Int64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will need to update this as per the dtype change

@mgmarino
Copy link
Contributor Author

mgmarino commented Apr 8, 2020

@jreback Thanks for the comments, I finally understood it now.

Dtype -> UInt32, Doc added, Tests green.

@jreback jreback added this to the 1.1 milestone Apr 8, 2020
@jreback
Copy link
Contributor

jreback commented Apr 8, 2020

lgtm. cc @TomAugspurger ok here?

@jreback
Copy link
Contributor

jreback commented Apr 10, 2020

this lgtm. let's merge on green.

@mroeschke mroeschke merged commit 03b510c into pandas-dev:master Apr 12, 2020
@mroeschke
Copy link
Member

Thanks @mgmarino! Very nice. Happy to have a follow up to deprecate weekofyear.

@jorisvandenbossche
Copy link
Member

Since isocalendaris a method and not an attribute on Timestamp, should we make dt.isocalendar a method instead of a property as well for consistency?

@mroeschke
Copy link
Member

Good point @jorisvandenbossche.

@mgmarino mind creating a follow up PR that makes isocalendar a method instead of an attribute?

@mgmarino mgmarino deleted the add-year-for-week-of-year branch April 13, 2020 18:48
mgmarino added a commit to mgmarino/pandas that referenced this pull request Apr 13, 2020
For consistency with `Timestamp.isocalendar`, this should rather be a
method.

Followup of pandas-dev#33220, see the discussions following the merge
@mgmarino
Copy link
Contributor Author

mind creating a follow up PR that makes isocalendar a method instead of an attribute?

@mroeschke Sure no problem, seems reasonable. The PR is open, I'll ping on green.

Happy to have a follow up to deprecate weekofyear.

Will do soon!

mgmarino added a commit to mgmarino/pandas that referenced this pull request Apr 14, 2020
For consistency with `Timestamp.isocalendar`, this should rather be a
method.

Followup of pandas-dev#33220, see the discussions following the merge
mroeschke pushed a commit that referenced this pull request Apr 14, 2020
For consistency with `Timestamp.isocalendar`, this should rather be a
method.

Followup of #33220, see the discussions following the merge
CloseChoice pushed a commit to CloseChoice/pandas that referenced this pull request Apr 20, 2020
For consistency with `Timestamp.isocalendar`, this should rather be a
method.

Followup of pandas-dev#33220, see the discussions following the merge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Add accessor for isocalendar returning iso year, week, and day
6 participants