New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix query failure of predicate pushdown to BigQuery DATETIME type #9005
Conversation
@@ -147,7 +147,7 @@ static String dateToStringConverter(Object value) | |||
|
|||
static String datetimeToStringConverter(Object value) | |||
{ | |||
return formatTimestamp(((Long) value).longValue(), systemDefault()); | |||
return formatTimestamp(((Long) value).longValue() / NANOSECONDS_PER_MICROSECOND, UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does BigQuery connector only read TIMESTAMP(3)? Otherwise the Object can be a long-timestamp type which is not backed by a long
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this method is called only when building filter and not used for reading BigQuery data. If we specify long-timestamp in condition, this method won't be called because the precision is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, indeed since we only have type-mapping for timestamp/tz millis we'll always get a long
here.
Can you add a checkState/verify
here + a test that tries to use TIMESTAMP(6)
so that once we support higher precisions the test will fail and we'll be reminded to correct this code?
Otherwise we may forget about this.
(If you are already working on higher precision timestamp support then feel free to ignore since there's a lower chance of forgetting this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative, you can com.google.common.math.LongMath#divide(value, 1000, RoundingMode.UNNECESSARY)
so that the code doesn't do silent data conversion for precision >3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NANOSECONDS_PER_MICROSECOND -> MICROSECONDS_PER_MILLI, i guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -147,7 +147,7 @@ static String dateToStringConverter(Object value) | |||
|
|||
static String datetimeToStringConverter(Object value) | |||
{ | |||
return formatTimestamp(((Long) value).longValue(), systemDefault()); | |||
return formatTimestamp(((Long) value).longValue() / NANOSECONDS_PER_MICROSECOND, UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative, you can com.google.common.math.LongMath#divide(value, 1000, RoundingMode.UNNECESSARY)
so that the code doesn't do silent data conversion for precision >3.
@@ -147,7 +147,7 @@ static String dateToStringConverter(Object value) | |||
|
|||
static String datetimeToStringConverter(Object value) | |||
{ | |||
return formatTimestamp(((Long) value).longValue(), systemDefault()); | |||
return formatTimestamp(((Long) value).longValue() / NANOSECONDS_PER_MICROSECOND, UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NANOSECONDS_PER_MICROSECOND -> MICROSECONDS_PER_MILLI, i guess
{ | ||
String tableName = "test.test_datetime_type"; | ||
|
||
onBigQuery("DROP TABLE IF EXISTS " + tableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can run on a shared resources, so table name should be randomized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember there was opposite comment (should be fixed) by other people in the past. While I can understand both opinion, BigQuery has expiration_timestamp
table option. I think we can use this option to cleanup automatically. Of course we can improve by try-with-resources, but it's not perfect solution.
I will simply change to randomized names anyway in this PR.
|
||
assertQuery( | ||
"SELECT a FROM " + tableName, | ||
"VALUES (CAST('2021-08-27 12:34:56.123' AS TIMESTAMP(3))), (CAST('2022-09-28 01:02:03.987' AS TIMESTAMP(3)))"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CAST('2022-09-28 01:02:03.987' AS TIMESTAMP(3))
is very verbose way of writing TIMESTAMP '2022-09-28 01:02:03.987'
for this, instead of h2 , use
assertThat(query( trino query using big query )
.matches ( trino query using values )
especially when working with date/time things.
"SELECT a FROM " + tableName + " WHERE a = CAST('2021-08-27 12:34:56.123' AS TIMESTAMP(3))", | ||
"VALUES (CAST('2021-08-27 12:34:56.123' AS TIMESTAMP(3)))"); | ||
assertQuery( | ||
"SELECT a FROM " + tableName + " WHERE rand() = 42 OR a = CAST('2021-08-27 12:34:56.123' AS TIMESTAMP(3))", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the case with rand()
tests the engine, not the connector. i feel like you can remove it.
bffbc1a
to
69a683c
Compare
ImmutableList.of("'2021-08-27 12:34:56.123'", "'2022-09-28 01:02:03.987'"))) { | ||
assertThat(query("SELECT a FROM " + table.getName())) | ||
.matches("VALUES TIMESTAMP '2021-08-27 12:34:56.123', TIMESTAMP '2022-09-28 01:02:03.987'"); | ||
assertThat(query("SELECT a FROM " + table.getName() + " WHERE a = TIMESTAMP '2021-08-27 12:34:56.123'")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any chance to have io.trino.testing.AbstractTestDistributedQueries#testDataMappingSmokeTest
run for BQ?
it would probably cover this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BQ connector doesn't support writes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it support writes for test purposes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storage write API is in the beta preview right now
Let me merge PR though of course I agree adding type mapping tests eventually. |
The query having filter on BigQuery
DATETIME
type caused below error before this change: