
Support for nanosecond/microsecond precision in TIMESTAMP and TIMESTAMP WITH TIME ZONE #1284

Closed
6 of 13 tasks
hocanint-amzn opened this issue Aug 13, 2019 · 17 comments
Labels
enhancement New feature or request roadmap Top level issues for major efforts in the project

Comments

hocanint-amzn commented Aug 13, 2019

Add nanosecond support to TIMESTAMP and TIMESTAMP WITH TIME ZONE

Introduction

We wish to support nanosecond-precision TIMESTAMP and TIMESTAMP WITH TIME ZONE within Presto for companies that collect data at that granularity; finance is one industry that deals with nanosecond timestamps.

Within this project, we will introduce fractional-second support for TIMESTAMP and TIMESTAMP WITH TIME ZONE at precisions greater than 3 (milliseconds). For example:

CREATE EXTERNAL TABLE test (
	timestamp_microseconds TIMESTAMP(6),
	timestamp_nano_with_tz TIMESTAMP(9) WITH TIME ZONE
) STORED AS TEXT LOCATION 'XYZ';

Tasks

Design Decisions:

Encoding

The current timestamp data types are encoded as a long at millisecond resolution [1][2][3] when packed into blocks during shuffling and movement of data. The original idea for this project was to always encode the timestamp at nanosecond resolution within the existing long. With this method, we could store timestamps between the years 1678 and 2262 [4]. In the future, if we needed a wider range, we could add a new int storing the nanoseconds from midnight, similar to how other implementations store timestamps. This approach minimizes the number of code changes while keeping the ability to widen the time range later if needed.
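The quoted year range can be sanity-checked with quick arithmetic (an illustrative sketch, not Presto code):

```python
# Range of a signed 64-bit long interpreted as nanoseconds since the Unix epoch.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

max_ns = 2**63 - 1                       # largest value a signed long can hold
years = max_ns / 1_000_000_000 / SECONDS_PER_YEAR

print(round(years))         # 292 years in each direction
print(1970 - round(years))  # 1678
print(1970 + round(years))  # 2262
```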

However, after some research, this approach may not work. For timestamps that carry time zone information, the time zone key is packed into the low 12 bits of the long, and the milliseconds value is shifted left by those 12 bits and stored in the remaining bits [6]. At nanosecond resolution this reduces the range of representable dates to a few weeks around Jan 1, 1970 [5], which is not sufficient. Thus, we will be forced to pack the information needed into a buffer larger than 8 bytes. The components that we would need to store are:

  1. the timestamp in milliseconds (minimum 50 bits to remain compatible with today's range [0 - 1,125,899,906,842,623])
  2. the nanoseconds portion (minimum of 20 bits [0 - 999,999])
  3. the time zone (12 bits if implemented in the same fashion as today)
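Summing the minimum bit widths above shows why a single 8-byte long is no longer enough (illustrative arithmetic only):

```python
# Back-of-the-envelope bit budget for the three components listed above.
millis_bits = (2**50 - 1).bit_length()   # 50: covers today's millisecond range
nanos_bits  = (999_999).bit_length()     # 20: nanoseconds within a millisecond
tz_bits     = 12                         # time zone key, as encoded today

total = millis_bits + nanos_bits + tz_bits
print(total)  # 82 -> more than the 64 bits of a single long
```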

I believe precision does not need to be stored alongside the other information, as we will treat everything at nanosecond resolution.

Thus, I am proposing the following:

  1. For time-only data types (TIME, TIME WITH TIME ZONE) we change the resolution to nanoseconds and keep the data type as a long, which is sufficient to store the data.
  2. For TIMESTAMP and TIMESTAMP WITH TIME ZONE (data types that contain both date and time), we add an extra int (4 bytes) to store the nanoseconds portion.
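A rough sketch of what the proposed 12-byte layout could look like, assuming the time zone key stays in the low 12 bits of the long as today; all names here are illustrative, not the actual SPI:

```python
import struct

TZ_BITS = 12
TZ_MASK = (1 << TZ_BITS) - 1

def pack_ts_with_tz(epoch_millis: int, nanos_of_milli: int, tz_key: int) -> bytes:
    """Pack an 8-byte long (millis + tz key) followed by a 4-byte nanos int."""
    packed_long = (epoch_millis << TZ_BITS) | (tz_key & TZ_MASK)
    return struct.pack(">qi", packed_long, nanos_of_milli)

def unpack_ts_with_tz(buf: bytes):
    packed_long, nanos_of_milli = struct.unpack(">qi", buf)
    return packed_long >> TZ_BITS, packed_long & TZ_MASK, nanos_of_milli

buf = pack_ts_with_tz(1_565_654_400_000, 123_456, tz_key=42)
print(len(buf))                # 12 bytes instead of today's 8
print(unpack_ts_with_tz(buf))  # (1565654400000, 42, 123456)
```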

Impact:
The impact of adding the extra 4 bytes (int) will be the following:

  1. When users upgrade, they may see certain queries start to fail with OOM errors, since we are making the timestamp bigger.
  2. Functions that originally returned long will need to be changed to something else; we still need to figure out what. An example is currentTimestamp (https://github.com/trinodb/trino/blob/1c1108fb460ae85a1c993afbac389d8487336052/presto-main/src/main/java/io/prestosql/operator/scalar/DateTimeFunctions.java#L143). However, these functions do not appear to be called from anywhere.

Mitigation:
There are two mitigation strategies we can employ:

  1. We can use a config parameter to determine whether we pack and unpack the 4 extra bytes for nanoseconds. If we do not pack them, the behavior stays exactly the same.
  2. We can pack a single bit that indicates whether a timestamp carries the 4 nanosecond bytes. If the bit is set, we unpack an additional int from the block.
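Mitigation 2 could look roughly like the following sketch, where one bit of the packed long flags whether the 4 nanosecond bytes follow (names and layout are illustrative, not the actual block format):

```python
HAS_NANOS_FLAG = 1

def encode(epoch_millis: int, nanos_of_milli: int = 0) -> bytes:
    """Emit 8 bytes for legacy values, 12 bytes when nanoseconds are present."""
    if nanos_of_milli:
        header = (epoch_millis << 1) | HAS_NANOS_FLAG
        return header.to_bytes(8, "big") + nanos_of_milli.to_bytes(4, "big")
    return (epoch_millis << 1).to_bytes(8, "big")

def decode(buf: bytes):
    header = int.from_bytes(buf[:8], "big")
    millis = header >> 1
    nanos = int.from_bytes(buf[8:12], "big") if header & HAS_NANOS_FLAG else 0
    return millis, nanos

print(len(encode(1_565_654_400_000)))          # 8: legacy layout unchanged
print(decode(encode(1_565_654_400_000, 987)))  # (1565654400000, 987)
```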

Effects on Precision when comparing two timestamps with different precisions:

Any operation on two timestamps will produce a timestamp of the higher precision. The fractional digits of the lower-precision timestamp are assumed to be 0 where they do not exist. This is the behavior of DB2, and it appears to be specified in the SQL spec (see below for details).
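The rule can be illustrated with a small sketch that zero-extends the shorter fraction before comparing (illustrative only, operating on fractional-second digit strings):

```python
def compare_ts(frac_a: str, frac_b: str) -> int:
    """Compare two fractional-second digit strings of possibly different precision."""
    width = max(len(frac_a), len(frac_b))
    a = int(frac_a.ljust(width, "0"))   # missing digits assumed to be zero
    b = int(frac_b.ljust(width, "0"))
    return (a > b) - (a < b)

# TIMESTAMP(3) '...12:00:00.500' vs TIMESTAMP(6) '...12:00:00.500000'
print(compare_ts("500", "500000"))  # 0: equal once zero-extended
print(compare_ts("5", "499999"))    # 1: 0.5s > 0.499999s
```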

Justification:
As per SQL Spec (https://standards.iso.org/ittf/PubliclyAvailableStandards/c060394_ISO_IEC_TR_19075-2_2015.zip)
"Year-month intervals are comparable only with other year-month intervals. If two year-month intervals have different interval precision, they are, for the purpose of any operations between them, converted to the same precision by appending new datetime fields to either one of the ends of one interval, or to both ends. New datetime fields are assigned a value of 0 (zero)."

Similarly with "Day-time intervals are comparable only with other day-time intervals. If two day-time intervals have different interval precision, they are, for the purpose of any operations between them, converted to the same precision by appending new datetime field to either one of the ends of one interval, or to both ends. New datetime fields are assigned a value of 0 (zero)."

From DB2’s documentation, ( https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/sqlref/src/tpc/db2z_datetimecomparisions.html)
"When comparing timestamp values with different precision, the higher precision is used for the comparison and any missing digits for fractional seconds are assumed to be zero."

Displaying Timestamps with Nanosecond granularity:

Today, I believe we always display timestamps in the "uuuu-MM-dd HH:mm:ss.SSS" format. I believe this should continue, and we should provide functions that can output different formats (date_format()).
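A sketch of how variable-precision rendering could work, assuming the timestamp is stored as epoch milliseconds plus a nanoseconds-of-millisecond field (as proposed above); the function name is hypothetical:

```python
from datetime import datetime, timezone

def format_ts(epoch_millis: int, nanos_of_milli: int, precision: int) -> str:
    """Render a timestamp with `precision` fractional-second digits."""
    dt = datetime.fromtimestamp(epoch_millis // 1000, tz=timezone.utc)
    # Combine the millisecond remainder and the extra nanos into 9 digits.
    total_nanos = (epoch_millis % 1000) * 1_000_000 + nanos_of_milli
    frac = f"{total_nanos:09d}"[:precision]
    base = dt.strftime("%Y-%m-%d %H:%M:%S")
    return f"{base}.{frac}" if precision > 0 else base

print(format_ts(1_565_654_400_123, 456_789, precision=9))
# 2019-08-13 00:00:00.123456789
```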

What changes are being made?

(THIS IS NOT YET EXHAUSTIVE)

Grammar Changes:
SqlBase.g4 -> Add specification for precision in grammar (

TIME_WITH_TIME_ZONE
: 'TIME' WS 'WITH' WS 'TIME' WS 'ZONE'
;
TIMESTAMP_WITH_TIME_ZONE
: 'TIMESTAMP' WS 'WITH' WS 'TIME' WS 'ZONE'
;
)

Change the SPI to replace longs with a 96-bit (int96) representation for time/timestamp types.

https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/DateTimeEncoding.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTime.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTimestamp.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTimeWithTimeZone.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTimestampWithTimeZone.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/TimeWithTimeZoneType.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/TimeType.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/TimeZoneKey.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/TimestampType.java
https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/TimestampWithTimeZoneType.java

Functions:

JDBC:

Parquet Changes:

ORC Changes:

RCFile Changes:

Further changes will depend on acceptance of the design.

Endnotes
[1] SqlTime - https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTime.java#L29-L30

[2] SqlTimestamp - https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTimestamp.java#L32-L33

[3] SqlTimeWithTimeZone - https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/SqlTimeWithTimeZone.java#L33-L34

[4] 9,223,372,036,854,775,807 (max long) / 1,000,000,000 (ns -> s) / 60 (s/min) / 60 (min/h) / 24 (h/day) / 365 (days/year) ≈ 292 years. 1970 + 292 = 2262, 1970 - 292 = 1678.

[5] 2^(64-12) (bits available after the time zone key) / 1,000,000,000 (ns -> s) / 60 (s/min) / 60 (min/h) / 24 (h/day) ≈ 52 days (roughly ±26 days around the epoch).

[6] DateTimeEncoding.java - https://github.com/trinodb/trino/blob/master/presto-spi/src/main/java/io/prestosql/spi/type/DateTimeEncoding.java#L26


martint commented Aug 13, 2019

This is a worthy goal. We've talked about it in the past, but it's never been high enough priority for anyone to work on it. In case you're not aware, there are other efforts related to fixing timestamps that we should consider to see if there are potential dependencies that might affect the sequencing of tasks (#37).

Some other things to consider:

  • implementing the parametric datetime types like we did with DECIMAL, which can switch between more efficient and less efficient representations and functions when the required precision allows for it.
  • backward compatibility issues with the existing types, which assume precision = 3 by default. This also affects functions like current_time, current_timestamp, localtime and localtimestamp.
  • mapping between connector-specific datetime types with Presto's (e.g., Hive's TIMESTAMP type is effectively TIMESTAMP(9), but we want to provide an escape hatch to map it to lower-precision types if performance is an issue).
  • conversion rules between various types and precisions.


sopel39 commented Aug 14, 2019

mapping between connector-specific datetime types with Presto's (e.g., Hive's TIMESTAMP type is effectively TIMESTAMP(9), but we want to provide an escape hatch to map it to lower-precision types if performance is an issue).

Do you think this might be a real issue? I think this case might be similar as for DECIMAL support when we switched from DOUBLE mapping. Even though timestamp data processing might perform slower, it could be negligible when looking at total query times. The additional complexity of escape hatch might not be worth it then.


dain commented Aug 14, 2019

mapping between connector-specific datetime types with Presto's (e.g., Hive's TIMESTAMP type is effectively TIMESTAMP(9), but we want to provide an escape hatch to map it to lower-precision types if performance is an issue).

Do you think this might be a real issue? I think this case might be similar as for DECIMAL support when we switched from DOUBLE mapping. Even though timestamp data processing might perform slower, it could be negligible when looking at total query times. The additional complexity of escape hatch might not be worth it then.

I think it will be a real issue. When decimal was added, most people did not use decimal types, so the impact was low. Also, decimal in Hive actually has bounds, and you only hit slowdowns at large numbers. Timestamp is widely used, and FWIU the only precision Hive supports is nanoseconds, which will be slow.

As for the complexity of the escape hatch, I'm not sure. There are only 3 readers, so I would expect it isn't too much work.

One additional problem: I believe Iceberg reuses some of the Hive connector, and they unfortunately chose microsecond precision.

@electrum

I think the first step is to parameterize the types with the fractional seconds precision. For compatibility with existing code, the default value should be 3 (we likely need to change this in the future, as the SQL specification requires 6 as the default). This should have zero user visible impact at the SQL layer, with hopefully minimal compatibility changes at the SPI layer.

The special functions current_time, localtime, current_timestamp, and localtimestamp will also need to be parameterized.

The big parts of this first step are all of the usages of the types and functions:

  • Anything comparing the types with identity (==) will need instanceof, similar to decimal or varchar.
  • Connectors that accept the type in DDL (create table or add column) will need checks to reject any precision other than 3.
  • All functions and operators that accept values of these types will need to be modified to use parameterized variants of the type, with checks to disallow unsupported precision.
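The identity-versus-instanceof point can be illustrated with a small Python analogue (not the actual SPI types): once the type is parameterized, each precision is a distinct instance, so identity checks silently miss non-default precisions:

```python
class TimestampType:
    """Toy stand-in for a parametric timestamp type."""
    def __init__(self, precision: int):
        self.precision = precision

TIMESTAMP_MILLIS = TimestampType(3)  # the old singleton, precision 3

def is_timestamp_identity(t) -> bool:
    return t is TIMESTAMP_MILLIS          # breaks for TIMESTAMP(6)

def is_timestamp_instanceof(t) -> bool:
    return isinstance(t, TimestampType)   # works for any precision

t6 = TimestampType(6)
print(is_timestamp_identity(t6))    # False: identity check misses it
print(is_timestamp_instanceof(t6))  # True
```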


martint commented Aug 19, 2019

the SQL specification requires 6 as the default

It's 0 for time and 6 for timestamp:

If <time precision> is not specified, then 0 (zero) is implicit. If <timestamp precision> is not specified,
then 6 is implicit.

For literals, it's 0 if the time/timestamp doesn't contain any decimal digits:

26) The declared type of a <time literal> that does not specify <time zone interval> is 
TIME(P) WITHOUT TIME ZONE, where P is the number of digits in <seconds fraction>,
 if specified, and 0 (zero) otherwise. The declared type of a <time literal> that specifies 
<time zone interval> is TIME(P) WITH TIME ZONE, where P is the number of digits in
 <seconds fraction>, if specified, and 0 (zero) otherwise.
27) The declared type of a <timestamp literal> that does not specify <time zone interval> 
is TIMESTAMP(P) WITHOUT TIME ZONE, where P is the number of digits in <seconds fraction>, 
if specified, and 0 (zero) otherwise. The declared type of a <timestamp literal> that specifies 
<time zone interval> is TIMESTAMP(P) WITH TIME ZONE, where P is the number of digits in 
<seconds fraction>, if specified, and 0 (zero) otherwise.
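A small sketch of the literal rule quoted above, where the declared precision is simply the digit count of the seconds fraction (the regex is illustrative, not Presto's parser):

```python
import re

def literal_precision(literal: str) -> int:
    """Declared precision P of a time/timestamp literal: digits in the
    seconds fraction, or 0 if there is none."""
    match = re.search(r"\.(\d+)", literal)
    return len(match.group(1)) if match else 0

print(literal_precision("TIMESTAMP '2020-05-15 12:00:00'"))         # 0
print(literal_precision("TIMESTAMP '2020-05-15 12:00:00.123'"))     # 3
print(literal_precision("TIMESTAMP '2020-05-15 12:00:00.123456'"))  # 6
```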

@electrum

The grammar for the special functions is as follows (today they do not support parameterization):

<datetime value function> ::=
    <current date value function>
  | <current time value function>
  | <current timestamp value function>
  | <current local time value function>
  | <current local timestamp value function>

<current date value function> ::=
  CURRENT_DATE

<current time value function> ::=
  CURRENT_TIME [ <left paren> <time precision> <right paren> ]

<current local time value function> ::=
  LOCALTIME [ <left paren> <time precision> <right paren> ]

<current timestamp value function> ::=
  CURRENT_TIMESTAMP [ <left paren> <timestamp precision> <right paren> ]

<current local timestamp value function> ::=
  LOCALTIMESTAMP [ <left paren> <timestamp precision> <right paren> ]

@electrum

<datetime type> ::=
    DATE
  | TIME [ <left paren> <time precision> <right paren> ] [ <with or without time zone> ]
  | TIMESTAMP [ <left paren> <timestamp precision> <right paren> ]
      [ <with or without time zone> ]

<with or without time zone> ::=
    WITH TIME ZONE
  | WITHOUT TIME ZONE

<time precision> ::=
  <time fractional seconds precision>

<timestamp precision> ::=
  <time fractional seconds precision>

<time fractional seconds precision> ::=
  <unsigned integer>


martint commented Aug 21, 2019

@hocanint-amzn, I'm happy to help move this forward. @electrum's suggestions about tackling the language and data type first while maintaining current semantics seem like a good approach. Please join the #timestamps channel on Slack (https://prestosql.io/community.html) and we can coordinate the tactical aspects there.

@hocanint-amzn (author)

Thank you everyone for the very good discussion and follow-up. I was not expecting to see so much activity around this issue. I'm going to be on vacation for a week. I agree with the first steps and will be more engaged when I get back. I have joined the #timestamps channel on Slack; I'll talk to you all there for now.

@findepi findepi changed the title Support for nano-second/microsecond timestamps Support for nanosecond/microsecond precision in TIMESTAMP and TIMESTAMP WITH TIME ZONE Aug 23, 2019
@martint martint self-assigned this May 14, 2020

martint commented May 15, 2020

I've started working on this. I'm tracking additional research and approach here: https://github.com/prestosql/presto/wiki/Variable-precision-datetime-types


martint commented May 27, 2020

Here's the PR for adding timestamp(p) type. It's almost ready to go -- just need to finish adding a bunch of tests: #3783


martint commented Jun 6, 2020

@hocanint-amzn, the PR to add support for variable precision timestamp type is merged and will be available in the next release. I’m now working on timestamp with timezone.

@findepi findepi added enhancement New feature or request roadmap Top level issues for major efforts in the project labels Jun 6, 2020

martint commented Jun 18, 2020

Support for variable-precision timestamp with time zone is merged and will be in the next release.

@srinivascreddy
Fantastic, much needed for my company. A couple of quick questions:

  1. When is next release?
  2. Does this mean it will be available in AWS Athena? Still trying to understand distinctions between Presto variants.

@martint depending on answer to #2, we may be able to help with some tests.


findepi commented Sep 6, 2020

@srinivascreddy can't speak about Athena plans, but please note it is not implemented for Hive connector yet.
This is something @aalbu is currently working on. You can follow #3977 for this.

@tisonkun
It seems this feature is now supported, as we have TimestampType.TIMESTAMP_MICROS and TimestampType.TIMESTAMP_NANOS?


martint commented Jul 19, 2022

Yes, closing this as done. All the remaining tasks are minor follow ups or belong to other projects/repositories.
