New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32272][SQL] Add SQL standard command SET TIME ZONE #29064
Conversation
cc @cloud-fan @maropu @dongjoon-hyun thanks for reviewing |
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ListTimeZonesCommand.scala
Outdated
Show resolved
Hide resolved
|
OK, I will add fully support then. |
Test build #125602 has finished for PR 29064 at commit
|
Test build #125604 has finished for PR 29064 at commit
|
the test case for listing all timezones is removed because it varies from different JDKs. The interval is supported and also some extensions were made too
cc @cloud-fan |
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
Outdated
Show resolved
Hide resolved
Test build #125749 has finished for PR 29064 at commit
|
val interval = parseIntervalLiteral(ctx.interval) | ||
if (interval.months != 0 || interval.days != 0 || | ||
math.abs(interval.microseconds) > 18 * DateTimeConstants.MICROS_PER_HOUR) { | ||
throw new ParseException("The interval value must be in the range of [-18, +18] hours", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we also fail for things like INTERVAL 1 MICROSECOND
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to fail this like this completely, we need a new interval parser, otherwise, we can only forbid it based on the interval result. e.g. INTERVAL 1 MICROSECOND -1 MICROSECOND 1 HOUR
will work...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INTERVAL 1 MICROSECOND -1 MICROSECOND 1 HOUR
is OK. We should forbid truncation like INTERVAL 1 MICROSECOND
, which you have to ignore the 1 microsecond.
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
Show resolved
Hide resolved
| SET TIME ZONE ALL #listTimeZones | ||
| SET TIME ZONE interval #setTimeZone | ||
| SET TIME ZONE timezone=(STRING | LOCAL) #setTimeZone | ||
| SET TIME ZONE .*? #setTimeZone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we add this only for better parser message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is used to fail invalid set time zone syntax explicitly, cuz' now we support
spark-sql (default)> set time zone abcd;
key value
time zone abcd <undefined>
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
Outdated
Show resolved
Hide resolved
-- !query schema | ||
struct<key:string,value:string> | ||
-- !query output | ||
spark.sql.session.timeZone Asia/Hong_Kong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a weird output for SET TIME ZONE 'Asia/Hong_Kong'
. Shall we add a Project
node on the SetCommand
to give better output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we expect as output? only keep the value Asia/Hong_Kong
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
postgres=# SET TIME ZONE 'Asia/Hong_kong';
SET
Test build #125759 has finished for PR 29064 at commit
|
Test build #125763 has finished for PR 29064 at commit
|
Test build #125773 has finished for PR 29064 at commit
|
retest this please |
Test build #125791 has finished for PR 29064 at commit
|
We should also add a document page in SQL reference for it. |
cc @maropu @cloud-fan @huaxingao. Please check the reference doc for set tz command. |
|
||
### Description | ||
|
||
The `SET TIME ZONE` command sets the current default time zone(`spark.sql.session.timeZone`) displacement for the `SparkSession`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SET TIME ZONE
command sets the time zone of the current session.
|
||
* **LOCAL** | ||
|
||
Respectively set the time zone the one specified in environment variable `TZ`, or to the operating system time zone if `TZ` is undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set the time zone "to" the one specified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
operating system
-> system
? We actually use the JVM timezone which can be different from the operating system,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it prefers user.timezone
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
Show resolved
Hide resolved
Test build #125891 has finished for PR 29064 at commit
|
|
||
* **LOCAL** | ||
|
||
Respectively set the time zone the one specified in environment variable `TZ`, or to the operating system time zone if `TZ` is undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Respectively set -> Sets?
|
||
* **interval_literal** | ||
|
||
The [interval literal](sql-ref-literals.html#interval-literal) represents the displacement of time zone to the 'UTC'. It must be in the range of [-18, 18] hours and max to second precision, e.g. `INTERVAL 2 HOURS 30 MINITUES` or `INTERVAL '15:40:32' HOUR TO SECOND`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess something like this?
The interval literal represents the difference between the (system?) time zone and UTC
.
SET TIME ZONE LOCAL; | ||
|
||
-- Set time zone to the region-based zone ID. | ||
SET TIME ZONE 'America/Los_Angeles' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ;
in the end?
@yaooqinn Please also change menu-sql.yaml to add SET TIME ZONE to the side menu. |
Test build #125940 has finished for PR 29064 at commit
|
|
||
* **LOCAL** | ||
|
||
Set the time zone to the one specified in the java `user.timezone` property or environment variable `TZ`, or to the system time zone if they are undefined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we clarify what happens if both the java user.timezone
property or environment variable TZ
are specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
Set the time zone to the one specified in the java `user.timezone` property,
or to the environment variable `TZ` if `user.timezone` is undefined,
or to the system time zone if both of them are undefined.
Test build #125959 has finished for PR 29064 at commit
|
retest this please |
All github action checks passed, I think we are good to go. Thanks, merging to master! |
Test build #125963 has finished for PR 29064 at commit
|
Test build #125974 has finished for PR 29064 at commit
|
@@ -240,6 +240,9 @@ statement | |||
| MSCK REPAIR TABLE multipartIdentifier #repairTable | |||
| op=(ADD | LIST) identifier (STRING | .*?) #manageResource | |||
| SET ROLE .*? #failNativeCommand | |||
| SET TIME ZONE interval #setTimeZone | |||
| SET TIME ZONE timezone=(STRING | LOCAL) #setTimeZone | |||
| SET TIME ZONE .*? #setTimeZone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like none of the systems are following ANSI SQL syntax completely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are close to the DB2 syntax, except that we support interval and LOCAL
, and we don't allow the optional SESSION
keyword.
What changes were proposed in this pull request?
This PR adds the SQL standard command -
SET TIME ZONE
to the current default time zone displacement for the current SQL-session, which is the same as the existing `set spark.sql.session.timeZone=xxx'.All in all, this PR adds syntax as following,
Why are the changes needed?
ANSI compliance and supply pure SQL users a way to retrieve all supported TimeZones
Does this PR introduce any user-facing change?
yes, add new syntax.
How was this patch tested?
add unit tests.
and locally verified reference doc