-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Respect custom time zone from client in CAST #76638
Conversation
da174cd
to
1741835
Compare
Pinging @elastic/es-ql (Team:QL) |
* Converts `value` to the target type considering `zoneId` if needed. | ||
* `zoneId` can be null and it is ignored for all non-timezone aware conversions. | ||
*/ | ||
Object convert(Object value, ZoneId zoneId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also investigated whether the zoneId
could be stored in the converter but this would be a lot more invasive change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, adding a specific method for one specific data type doesn't feel ideal, but still better than instantiating the converter per session.
1741835
to
6f001ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lg, only left minor observations.
* Parses the given string into a DateTime using `defaultZone` as a default timezone. | ||
*/ | ||
public static ZonedDateTime asDateTime(String dateFormat) { | ||
public static ZonedDateTime asDateTime(String dateFormat, ZoneId defaultZone) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why keep the "default" in defaultZone
? This would be the user-specified TZ (or null
/default), right?
@@ -1173,52 +1170,6 @@ protected LogicalPlan rule(OrderBy ob) { | |||
} | |||
} | |||
|
|||
private static class ImplicitCasting extends AnalyzerRule<LogicalPlan> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a commented instantiation of this rule (L124). Not sure why this was kept, but I think commented/dead code isn't typically kept around, so maybe remove that line too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I didn't thin about searching for the class. Thanks!
* Converts `value` to the target type considering `zoneId` if needed. | ||
* `zoneId` can be null and it is ignored for all non-timezone aware conversions. | ||
*/ | ||
Object convert(Object value, ZoneId zoneId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, adding a specific method for one specific data type doesn't feel ideal, but still better than instantiating the converter per session.
@@ -52,6 +54,8 @@ | |||
|
|||
public class SqlDataTypeConverterTests extends ESTestCase { | |||
|
|||
private static final ZoneId MINUS_5 = ZoneId.of("-05:00"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
private static final ZoneId MINUS_5 = ZoneId.of("-05:00"); | |
private static final ZoneId MINUS_5H = ZoneId.of("-05:00"); |
; | ||
|
||
castStringWithoutTZToDatetimeInScript-TZ[Etc/GMT-1] | ||
SELECT hire_date::string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supernit: the rest of the tests use capitalised ::STRING
, but I was actually wondering if the conversion is [always] necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for the capitalization.
But yes, as soon as you get timestamps in another timezone as UTC, the JDBC issue mentioned in #40779 surfaces.
@@ -43,9 +43,7 @@ public CsvSpecTestCase(String fileName, String groupName, String testName, Integ | |||
|
|||
@Override | |||
protected final void doTest() throws Throwable { | |||
// Run the time tests always in UTC | |||
// TODO: https://github.com/elastic/elasticsearch/issues/40779 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't (yet) randomise the timezones for the QA tests ref'd in this comment (which I think is the case), maybe we should keep the comment. (BTW, the file ref'd in the issue contains a similar comment.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I got this wrong but when the comment was added, time.csv-spec
has been special cased (see https://github.com/elastic/elasticsearch/pull/40776/files#diff-387c1ea94dd5d111032f384770a8244f7d4942ac1fd91a68b397614212d59f04R44). But the special case has been removed in the meantime and I think the comment is no longer accurate. Or the reality is now more complicated as described in https://github.com/elastic/elasticsearch/pull/76638/files#diff-aceee053cb68e1bb219cfe4256635c7f77d86b46eb943e5ad42bc7bb360ca945R125
@@ -33,6 +35,8 @@ | |||
|
|||
public class DataTypeConversionTests extends ESTestCase { | |||
|
|||
public static final ZoneId MINUS_5 = ZoneId.of("-05:00"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: MINUS_5H
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, that's nicer indeed 👍
return new ScriptTemplate( | ||
formatTemplate(format("{sql}.", "cast({},{})", fieldAsScript.template())), | ||
|
||
if (DataTypes.isDateTime(dataType)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure what's better here: Only passing the timezone for datetime conversions results in a smaller diff of the changes and the resulting scripts are also not unnecessarily large. But it breaks a bit with consistency and might become a problem when more conversions start depending on the timezone.
Not sure whether anyone has a stronger opinion on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, but to me it's unclear what exactly the original bug report covers and what this PR is addressing.
For example, what happens with CASTs applied on fields. All examples and tests I see refer to date literals only. Also, any more complex queries (a group by on a CASTed string) - SELECT CAST(CAST(birth_date AS DATETIME) AS STRING) t FROM test_emp WHERE YEAR(birth_date)=2021 GROUP BY t
- are not covered.
@@ -51,10 +52,16 @@ public static ZonedDateTime asDateTime(long millis) { | |||
return ZonedDateTime.ofInstant(Instant.ofEpochMilli(millis), UTC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can simply call the new asDateTime
method below.
} | ||
|
||
Converter converter() { | ||
return conversion; | ||
} | ||
|
||
ZoneId zoneId() { return zoneId; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, follow the style of the other methods: a multi-line method, rather than a single-liner.
f348f35
to
1d03cb0
Compare
Note to myself: Consider reusing |
Resolves #40692
Makes sure that CASTs use the time zone specified in the according REST/JDBC options to convert strings to datetime if no timezone is specified in the string itself. Other conversions to datetime (e.g. from long) still use UTC.
This change breaks some queries using conversions from string to datetime if the string does not include a timezone. E.g. the following query will retrieve various amount of records depending on the timezone specified in the connection:
SELECT hire_date FROM test_emp WHERE hire_date >= '1997-05-19T01:00:00.000'::DATETIME
The following queries are not affected: