Add "semester" as a time/date component to DatetimeIndex #22362
Comments
I believe this is intended, as nothing requires a "2 quarter period" to start in January and July, respectively.
@WillAyd, it could be intended, but it is confusing indeed. Grouping by "2Q":
This behaviour is dangerous because it changes according to the dataset. As soon as your first date is in May, the 2Q grouping no longer starts on January 1st. This is what happened to me: I was using "2Q" as a synonym for semester, and after merely changing the demo dataset the groups went haywire. Adding a further alias such as Z for semester would allow it to be used as a single unit and avoid this behaviour.
I think something like this would be useful. @Nemecsek does the use case depend on "semester" corresponding to e.g. Sep-Dec/Jan-May, or would it be sufficient to have something like
Aside from that, the design question that comes to mind is whether we should modify Quarter offsets to be customizable, or implement new Half/Semester/Season offset classes.
PRs welcome. You'll want to look at pandas.tseries.offsets.
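Until such an offset exists, one possible workaround is to derive the half-year directly from the index and group on it. The following is only a sketch: `groupby_semester` is a hypothetical helper (not a pandas API), and it assumes "semester" means the calendar halves January-June and July-December rather than academic terms.

```python
import pandas as pd


def groupby_semester(s: pd.Series) -> pd.Series:
    """Sum a datetime-indexed Series per (year, calendar half-year).

    Hypothetical helper for illustration; assumes halves are
    Jan-Jun (1) and Jul-Dec (2).
    """
    idx = s.index
    # Months 1-6 map to half 1, months 7-12 map to half 2.
    half = (idx.month - 1) // 6 + 1
    # Grouping on year and half does not depend on the first date in the data.
    return s.groupby([idx.year, half]).sum()


# Daily data that starts in May, the case that shifts the "2Q" bins.
daily = pd.Series(1, index=pd.date_range("2018-05-01", "2019-04-30", freq="D"))
per_semester = groupby_semester(daily)
print(per_semester)
```

Because the keys are computed from each timestamp's own month, the groups stay aligned to January and July regardless of where the dataset begins.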
@jbrockmendel, I would implement the "Half" to keep the logic.
I agree with @Nemecsek. In addition:
import pandas as pd
this_month = pd.Timestamp('2019-9')
sem = pd.offsets.QuarterBegin(n=2, startingMonth=1)
print(this_month - sem)
gives
This is expected (by the definition of QuarterBegin). Hopefully the "semester" offset will return the second option.
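To make the offset arithmetic above concrete, here is a small sketch that compares stepping back by two quarters with snapping to the start of the current calendar half-year. `current_half_start` is a hypothetical helper, not part of pandas, and it assumes halves that begin on January 1st and July 1st.

```python
import pandas as pd

this_month = pd.Timestamp("2019-09-01")
sem = pd.offsets.QuarterBegin(n=2, startingMonth=1)

# Subtracting the offset steps back over two quarter starts (Jul 1, then Apr 1);
# it does not snap to a half-year boundary.
stepped = this_month - sem


def current_half_start(ts: pd.Timestamp) -> pd.Timestamp:
    """Hypothetical helper: first day of the calendar half-year containing ts."""
    start_month = 1 if ts.month <= 6 else 7
    return pd.Timestamp(year=ts.year, month=start_month, day=1)


print(stepped)
print(current_half_start(this_month))
```

The two results differ because QuarterBegin counts quarter boundaries, whereas a dedicated half-year offset would presumably count only the January and July anchors.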
Groupby is missing "semester"
returns groups based on the first datetime index value of the dataset, not on the calendar semesters that begin on January 1st and July 1st:
while I would expect:
This issue is difficult to spot, as the behaviour changes according to the dataset, while it should be consistent. I didn't spot it with my first dataset (starting on January).
The same problem will show up when grouping by 6MS (six months, start).
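The dataset dependence described here can be reproduced with a short sketch. The two Series below are made-up demo data; the only difference between them is the first date.

```python
import pandas as pd

# Same daily data, one series starting in January and one starting in May.
from_jan = pd.Series(1, index=pd.date_range("2018-01-01", "2018-12-31", freq="D"))
from_may = pd.Series(1, index=pd.date_range("2018-05-01", "2019-04-30", freq="D"))

# "6MS" bins are anchored to the first month present in the data,
# not to January 1st and July 1st.
jan_bins = from_jan.resample("6MS").sum()
may_bins = from_may.resample("6MS").sum()

print(jan_bins.index[:2].tolist())
print(may_bins.index[:2].tolist())
```

With the January dataset the bins happen to coincide with calendar semesters, which is why the problem is easy to miss; with the May dataset they begin in May and November.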
Semester frequency is missing from pandas' offset aliases.