-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: error instead of panic when date_bin interval is 0 #6522
Conversation
(TIMESTAMP '2021-06-10 17:05:00Z', 0.5), | ||
(TIMESTAMP '2021-06-10 17:19:10Z', 0.3) | ||
) as t (time, val) | ||
group by time; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember @alamb want just right number of tests, I think these 3 tests are good to capture 2 different forms of stride on constant (second --> nanosecond in the code, and month -> month) and an aggregate on data.
Clippy error on things I did not touch. I am happy to change them but I am not sure if I am supposed to do so. Please let me know. |
Please try to update rust and clippy
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw in https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-BIN
The stride interval must be greater than zero and cannot contain units of month or larger.
This is not part of this PR, but we may want consider this as well
Clippy errors are related to #6529 -- I'll have a fix shortly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @NGA-TRAN
@@ -522,6 +522,13 @@ fn date_bin_impl( | |||
|
|||
let (stride, stride_fn) = stride.bin_fn(); | |||
|
|||
// Return error if stride is 0 | |||
if stride == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error says the stride must be positive and non zero, but this check is only for 0
It seems like the main branch supports negative intervals:
❯ SELECT DATE_BIN(INTERVAL '-15 minutes', TIMESTAMP '2022-08-03 14:38:50Z', TIMESTAMP '1970-01-01T00:00:00Z');
+-----------------------------------------------------------------------------------------------------------------+
| datebin(IntervalMonthDayNano("18446743173709551616"),Utf8("2022-08-03 14:38:50Z"),Utf8("1970-01-01T00:00:00Z")) |
+-----------------------------------------------------------------------------------------------------------------+
| 2022-08-03T14:30:00 |
+-----------------------------------------------------------------------------------------------------------------+
1 row in set. Query took 0.001 seconds.
Thus, given what @comphead found in #6522 (review)
I think we should either:
- Update the error to say that the DATE_BIN stride must be non zero
- Update the code to match the error (prevent negative strides) and then add test coverage for negative intervals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I do not know if the negative stride is therefor a specific purpose and someone may use it, I chose not to change its behavior. I go with the option 1 to only fix the panic and have change the error message to "DATE_BIN stride must be non-zero"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -522,6 +522,13 @@ fn date_bin_impl( | |||
|
|||
let (stride, stride_fn) = stride.bin_fn(); | |||
|
|||
// Return error if stride is 0 | |||
if stride == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I do not know if the negative stride is therefor a specific purpose and someone may use it, I chose not to change its behavior. I go with the option 1 to only fix the panic and have change the error message to "DATE_BIN stride must be non-zero"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the liberty of merging |
The doc CI failure appears unrelated (it is an infrastructure looking thins): https://github.com/apache/arrow-datafusion/actions/runs/5158371209/jobs/9292110237?pr=6522 |
Which issue does this PR close?
Closes ##6517
Rationale for this change
Instead of panic when the stride of date_bin is 0, let us just make it an error message
What changes are included in this PR?
Check if the stride is 0 and error out
Are these changes tested?
Yes
Are there any user-facing changes?
User will see a proper error message
Execution error: DATE_BIN stride must be non-zero and positive
instead ofthread 'main' panicked at 'attempt to calculate the remainder with a divisor of zero', /arrow-datafusion/datafusion/physical-expr/src/datetime_expressions.rs:354:34 note: run with
RUST_BACKTRACE=1environment variable to display a backtrace