Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

date_trunc(null) results in a panic #6701

Closed
Tracked by #3148
alamb opened this issue Jun 16, 2023 · 3 comments · Fixed by #6723
Closed
Tracked by #3148

date_trunc(null) results in a panic #6701

alamb opened this issue Jun 16, 2023 · 3 comments · Fixed by #6723
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@alamb
Copy link
Contributor

alamb commented Jun 16, 2023

Describe the bug

Calling the date_trunc(null) function results in a panic

To Reproduce

DataFusion CLI v26.0.0
❯ select date_trunc('2021-02-01T12:01:02', 'hour');
Optimizer rule 'simplify_expressions' failed
caused by
Arrow error: Parser error: Error parsing timestamp from 'hour': timestamp must contain at least 10 characters
❯ select date_trunc('hour', '2021-02-01T12:01:02');
+------------------------------------------------------+
| date_trunc(Utf8("hour"),Utf8("2021-02-01T12:01:02")) |
+------------------------------------------------------+
| 2021-02-01T12:00:00                                  |
+------------------------------------------------------+
1 row in set. Query took 0.074 seconds.
❯ select date_trunc('hour', null);
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /Users/alamb/Software/arrow-datafusion/datafusion/physical-expr/src/datetime_expressions.rs:314:35
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Expected behavior

null should be returned

Additional context

Found while reviewing #6654 from @Weijun-H

The issue is the use of unwrap() in the date_trunc implementation

I think this is a good first issue for someone as it is localized (involves removing unwrap()) and writing some tests using .slt

@BryanEmond
Copy link
Contributor

I'll look at it.

@BryanEmond
Copy link
Contributor

BryanEmond commented Jun 16, 2023

I've tried to remove the unwrap() like specify in the issue, but nano is an Option, so removing the unwrap just creates errors. So is it better to catch the error and return null or replacing unwrap() by unwrap_or_default() which return "1970-01-01T00:00:00"?

I got this timestamp by running this test in timestamps.slt

query P

SELECT DATE_TRUNC('minute', TIMESTAMP '2022-08-03 14:38:50Z');

----
2022-01-01T00:00:00




error:
[SQL] SELECT DATE_TRUNC('minute', NULL);
[Diff] (-expected|+actual)

  • NULL
  • 1970-01-01T00:00:00
    at tests/sqllogictests/test_files/timestamps.slt:910

@alamb
Copy link
Contributor Author

alamb commented Jun 18, 2023

I've tried to remove the unwrap() like specify in the issue, but nano is an Option, so removing the unwrap just creates errors. So is it better to catch the error and return null or replacing unwrap() by unwrap_or_default() which return "1970-01-01T00:00:00"?

I think what is needed is to ignore nulls (so that null comes out)

something like nano.map(|nano| truncate(nano))

So rather than unwrap use map to only do the calculation of nano is non null

If you make a PR with some example code I can probably offer some more specific advice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants