-
Notifications
You must be signed in to change notification settings - Fork 786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(compute): Support week0 (PostgreSQL behaviour) for temporal #2052
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some research and implementations seem to be wildly inconsistent on whether the start of the week is Sunday or Monday, and whether the returned value start at 0 or 1.
postgres extract(dow ...)
and datepart is 0-6 starting on Sunday
mysql weekday is 1-7 starting on Monday
SQL server lets you set the start date to anything, although its datepart starts at 1
I don't know if @alamb has any source on what the "correct" behaviour is. The ISO standard is 1-7 starting on Monday, but I understand America has its own date standards...
|
||
fn quarter0(&self) -> u32 { | ||
self.month0() / 3 | ||
/// Extracts the day of week of a given temporal array as an array of integers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Extracts the day of week of a given temporal array as an array of integers | |
/// Extracts the day of week of a given temporal array as an array of integers | |
/// with each integer representing the number of days from sunday. | |
/// | |
/// For example, Sunday 1st 2010, would return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn quarter0(&self) -> u32 { | ||
self.month0() / 3 | ||
/// Extracts the day of week of a given temporal array as an array of integers | ||
pub fn dow<T>(array: &PrimitiveArray<T>) -> Result<Int32Array> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how this is different than weekday
: https://docs.rs/arrow/18.0.0/arrow/compute/kernels/temporal/fn.weekday.html (defined on line 304 of this same file)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I didn't saw this new method, because we use old DF & old Arrow.
My opinion is that DataFusion should follow the postgres behavior, but arrow is not thus constrained. If it would be more convenient to have kernels that started on 0 and one that started on 1 (e.g. |
👍 If it's okey, I can update this PR to reflect similar way. |
@@ -654,6 +620,26 @@ mod tests { | |||
assert_eq!(2, b.value(2)); | |||
} | |||
|
|||
#[test] | |||
fn test_temporal_array_date64_weekday0() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codecov Report
@@ Coverage Diff @@
## master #2052 +/- ##
=======================================
Coverage 83.55% 83.56%
=======================================
Files 222 222
Lines 58216 58234 +18
=======================================
+ Hits 48645 48662 +17
- Misses 9571 9572 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn quarter0(&self) -> u32 { | ||
self.month0() / 3 | ||
fn weekday0(&self) -> i32 { | ||
self.weekday().num_days_from_sunday() as i32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 now I see that chrono uses the names num_days_from_sunday
and num_days_from_monday
https://docs.rs/chrono/0.4.19/chrono/enum.Weekday.html#method.num_days_from_sunday
https://docs.rs/chrono/0.4.19/chrono/enum.Weekday.html#method.num_days_from_monday
What would you think about using those names instead of weekday0
and weekday
@ovr (I would be happy to rename things in a follow on PR prior to release)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Thanks @ovr |
Benchmark runs are scheduled for baseline = 50e285f and contender = 4444cb7. 4444cb7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Rationale for this change
DataFusion uses temporal to operate with date like structures. I want to implement
EXTRACT(DOW from NOW())
.What changes are included in this PR?
Implementation of
temporal::dow
Are there any user-facing changes?
No
Thanks