[C++] Support Temporal Extraction Functions for duration types #33962
Comments
Note that there is not necessarily a clear definition of what the extracted "second" (or other component) of a duration is. In pandas, there are two variants: for example |
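The ambiguity can be illustrated with the Python standard library alone. This is a minimal sketch, assuming the two readings in question are (a) the seconds in the sub-day remainder of the duration and (b) the seconds field left over once hours and minutes are split off. The helper names `seconds_within_day` and `seconds_component` are hypothetical and are not part of pandas or Arrow.

```python
import datetime


def seconds_within_day(td):
    """Seconds in the part of the duration that is >= 0 and < 1 day."""
    return td.seconds


def seconds_component(td):
    """Seconds field (0-59) once whole hours and minutes are split off."""
    return td.seconds % 60


td = datetime.timedelta(days=2, hours=3, seconds=4, milliseconds=5)
print(seconds_within_day(td))  # 10804
print(seconds_component(td))   # 4
```

Both values are a defensible "second" of the same duration, which is why the kernel semantics need to be pinned down explicitly.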
Yeah I think that would make sense. The context of the request was actually implementing |
I want to take this but I'm not really much of a Python guy. So could anyone kindly help me understand more about the Python counterpart of this task? My question is that, according to the pandas doc (https://pandas.pydata.org/docs/dev/user_guide/timedeltas.html#timedelta-limitations), the internal representation of pandas Thanks. |
@zanmato1984 you're welcome to tackle this!
Yes, C++ kernel logic would probably fit here - you might be able to reuse other kernels and just add a |
Thank you @rok for the kind help. I'm making some progress locally and will come up with a PR later. |
take |
While implementing the kernels, I ran into some unexpected complexity, though the code does work. I want to make sure that this complexity is necessary and that I'm doing it the proper way. So I drafted PR #39267 and hope someone familiar with compute kernels can help confirm it's essentially correct. If there is a better way, I'm all ears. Thanks. |
Thanks for doing this @zanmato1984 ! |
We might need some more discussion about what we actually want here. The current PR adds "day", "second", "milli/micro/nanosecond" and "subsecond" kernels. I think this is mostly modelled after Python.

For example, the "second" kernel in the PR would return the number of seconds in the part of the duration that is >= 0 and < 1 day. Equivalent Python example:

```python
>>> import datetime
>>> td = datetime.timedelta(days=2, hours=3, seconds=4, milliseconds=5)
>>> td.seconds
10804
# which is 3 hours (60*60 seconds each) + 4 seconds
>>> 3*3600+4
10804
```

But one reason Python has those attributes is that this is how it is implemented under the hood: it stores separate numbers of days, seconds and microseconds (https://docs.python.org/3/library/datetime.html#timedelta-objects).

Checking with some other software about what kind of operations are supported for Duration types:
|
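To make the point about Python's internal representation concrete, here is a small sketch using only the standard library. It shows the three stored fields for a positive and a negative duration. The helper `fields` is a hypothetical name introduced for illustration only.

```python
import datetime


def fields(td):
    """Return the (days, seconds, microseconds) fields that timedelta stores."""
    return td.days, td.seconds, td.microseconds


positive = datetime.timedelta(days=2, hours=3, seconds=4, milliseconds=5)
negative = datetime.timedelta(seconds=-1)

print(fields(positive))          # (2, 10804, 5000)
print(fields(negative))          # (-1, 86399, 0)
print(negative.total_seconds())  # -1.0
```

Because only `days` may be negative after normalization, a duration of minus one second reports `seconds == 86399`, which can look surprising when these attributes are read as standalone "components".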
Thanks @jorisvandenbossche for the informative comment! Though I'm neutral to adding or not adding these kernels, I think the following
makes a great point. I felt the same (unintuitive) way when I was adding the Python test case in my PR:
|
It seems it's still unclear what we really want from this request. Though I may still want to help if the discussion reaches a conclusion at some point in the future, I'll unassign myself and close the PR for now. |
Describe the enhancement requested
Would be great to support extracting the day, second, microseconds, nanoseconds components of a timedelta
Component(s)
Python