Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can not use extract <part> on the value of now() #3980

Closed
Tracked by #3148
alamb opened this issue Oct 27, 2022 · 10 comments
Closed
Tracked by #3148

Can not use extract <part> on the value of now() #3980

alamb opened this issue Oct 27, 2022 · 10 comments
Labels
bug Something isn't working

Comments

@alamb
Copy link
Contributor

alamb commented Oct 27, 2022

Describe the bug
I would like to be able to find the current date, or the current day of week, etc

DataFusion supports now() and extract (hour from <timestamp>) syntax ✅

However, you can't use extract with the value of now() 😢

To Reproduce

❯ select extract (day  from now());
Plan("Coercion from [Utf8, Timestamp(Nanosecond, Some(\"UTC\"))] to the signature OneOf([Exact([Utf8, Date32]), Exact([Utf8, Date64]), Exact([Utf8, Timestamp(Second, None)]), Exact([Utf8, Timestamp(Microsecond, None)]), Exact([Utf8, Timestamp(Millisecond, None)]), Exact([Utf8, Timestamp(Nanosecond, None)])]) failed.")

Expected behavior
I expect the day part of now() to be extracted as in postgres:

postgres=# select extract(day from now());
 extract 
---------
      27
(1 row)

Note that the error seems to be that DataFusion can't coerce a Timestamp(Nanosecond, "UTC") to Timestamp(Nanosecond, None) -- aka it is related to timezones

You can work around the problem by explicitly casting the output of now() to timestamp (not timestamptz):

select extract (day  from cast(now() as timestamp));
+-----------------------------+
| datepart(Utf8("DAY"),now()) |
+-----------------------------+
| 27                          |
+-----------------------------+

Additional context
This was reported by early (internal) users of IOx

@alamb alamb added the bug Something isn't working label Oct 27, 2022
@waitingkuo
Copy link
Contributor

i guess this is similar as #3096

@alamb
Copy link
Contributor Author

alamb commented Oct 27, 2022

I'll plan to try and improve things here myself in the next week or two if no one else has a chance

@waitingkuo
Copy link
Contributor

waitingkuo commented Oct 27, 2022

i'm working on this apache/arrow-rs#1380 which is probably one of the root case for extract and date_part
https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/temporal.rs#L694-L712

in postgres extract returns numeric/decimal while date_part returns double precision. Should we follow it?
https://www.postgresql.org/docs/15/functions-datetime.html

btw, postgres doc recommend use extract

For historical reasons, the date_part function returns values of type double precision. 
This can result in a loss of precision in certain uses. Using extract is recommended instead.

@alamb
Copy link
Contributor Author

alamb commented Oct 27, 2022

Should we follow it?

I am not quite sure what you are asking. Are you asking if we should follow postgres so that date_part returns double?

If so I don't have any strong opinion 🤷

@waitingkuo
Copy link
Contributor

waitingkuo commented Oct 28, 2022

@alamb

datafusion currently returns i32 which might lose some information, e.g.

select date_part('second', timestamp '2000-01-01T00:00:00.1');
+--------------------------------------------------------+
| datepart(Utf8("second"),Utf8("2000-01-01T00:00:00.1")) |
+--------------------------------------------------------+
| 0                                                      |
+--------------------------------------------------------+
1 row in set. Query took 0.000 seconds.

and

select extract(second from timestamp '2000-01-01T00:00:00.1');
+--------------------------------------------------------+
| datepart(Utf8("SECOND"),Utf8("2000-01-01T00:00:00.1")) |
+--------------------------------------------------------+
| 0                                                      |
+--------------------------------------------------------+
1 row in set. Query took 0.000 seconds.

while postgresql has

willy=# select date_part('second', timestamp '2000-01-01T00:00:00.1');
 date_part 
-----------
       0.1
(1 row)

which returns double precision

and

willy=# select extract(second from timestamp '2000-01-01T00:00:00.1');
 extract  
----------
 0.100000
(1 row)

which returns decimal

is it recommended to follow postgresql's return type?
if it is preferred, perhaps we could do a separate pr to deal with this

@alamb
Copy link
Contributor Author

alamb commented Oct 28, 2022

is it recommended to follow postgresql's return type?

That probably makes the most sense

if it is preferred, perhaps we could do a separate pr to deal with this

Agreed -- can you file a ticket? (basically copy/paste #3980 (comment) 😆 )

@waitingkuo
Copy link
Contributor

@alamb
i fired 2 instead 😆 #3996 #3997

@alamb
Copy link
Contributor Author

alamb commented Oct 29, 2022

Thanks @waitingkuo -- We'll get there some day!

@comphead
Copy link
Contributor

comphead commented Dec 9, 2022

This should be fixed by #4548

@alamb
Copy link
Contributor Author

alamb commented Dec 9, 2022

I confirmed that on master the query works now:

(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion$ datafusion-cli
DataFusion CLI v15.0.0
❯ select extract (day  from now());
+-----------------------------+
| datepart(Utf8("DAY"),now()) |
+-----------------------------+
| 9                           |
+-----------------------------+
1 row in set. Query took 0.035 seconds.

Thank you @comphead ❤️

@alamb alamb closed this as completed Dec 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants