
[SPARK-20557][SQL] Support for db column type TIMESTAMP WITH TIME ZONE #17832

Closed
wants to merge 1 commit into from

Conversation

JannikArndt
Contributor

What changes were proposed in this pull request?

Spark SQL can now read from a database table with a column of type TIMESTAMP WITH TIME ZONE.

How was this patch tested?

Tested against an Oracle database.

@JoshRosen, you seem to know this class; would you take a look? Thanks!
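For context, a minimal user-facing sketch of what this enables (connection URL, credentials, and table name are hypothetical, and the Oracle JDBC driver must be on the classpath): reading an Oracle table whose column is declared TIMESTAMP WITH TIME ZONE so that it is inferred as Catalyst's TimestampType rather than being rejected as an unsupported type.

```scala
import org.apache.spark.sql.SparkSession

object ReadOracleTimestampTz {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("tz-read").getOrCreate()

    // Hypothetical connection URL, credentials, and table name.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/XEPDB1")
      .option("user", "scott")
      .option("password", "tiger")
      .option("dbtable", "orders_with_tz") // has a TIMESTAMP WITH TIME ZONE column
      .load()

    // With this patch, that column shows up as TimestampType in the schema.
    df.printSchema()
    spark.stop()
  }
}
```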

@@ -223,6 +223,9 @@ object JdbcUtils extends Logging {
case java.sql.Types.STRUCT => StringType
case java.sql.Types.TIME => TimestampType
case java.sql.Types.TIMESTAMP => TimestampType
case java.sql.Types.TIMESTAMP_WITH_TIMEZONE
=> TimestampType
case -101 => TimestampType
Member

What is -101?

Contributor Author

-101 is the code Oracle databases use for TIMESTAMP WITH TIME ZONE. Unfortunately there is no equivalent constant in java.sql.Types (https://docs.oracle.com/javase/8/docs/api/java/sql/Types.html).
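For illustration, a small sketch of where that code surfaces (connection URL, credentials, and table name are hypothetical): the Oracle JDBC driver reports the vendor-specific value through ResultSetMetaData.getColumnType, since java.sql.Types has no matching constant (the driver also appears to expose it as oracle.jdbc.OracleTypes.TIMESTAMPTZ).

```scala
import java.sql.DriverManager

// Minimal sketch; URL, credentials, and table name are hypothetical.
object InspectOracleTypeCode {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection(
      "jdbc:oracle:thin:@//dbhost:1521/XEPDB1", "scott", "tiger")
    try {
      val rs = conn.createStatement().executeQuery(
        "SELECT created_at FROM orders_with_tz") // TIMESTAMP WITH TIME ZONE column
      val md = rs.getMetaData
      // Expected to print -101, the code the JdbcUtils mapping above matches on.
      println(s"JDBC type code: ${md.getColumnType(1)}")
    } finally {
      conn.close()
    }
  }
}
```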

Member

Could you add a test case to OracleIntegrationSuite?
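Not the exact OracleIntegrationSuite scaffolding (the real suite provisions Oracle via docker and supplies the JDBC URL), but a standalone sketch of the check such a test would make; the URL, table, and column are hypothetical:

```scala
import java.util.Properties

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.TimestampType

object TimestampTzSchemaCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("tz-schema-check").getOrCreate()

    // Hypothetical URL; in OracleIntegrationSuite this comes from the docker container.
    val jdbcUrl = "jdbc:oracle:thin:system/oracle@//localhost:1521/xe"

    // Table assumed to contain a column `t TIMESTAMP WITH TIME ZONE`.
    val df = spark.read.jdbc(jdbcUrl, "ts_with_timezone", new Properties)

    // Core assertion: the column is read as Catalyst TimestampType.
    // Oracle returns unquoted identifiers in upper case, hence "T".
    assert(df.schema("T").dataType == TimestampType)
    spark.stop()
  }
}
```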

Member

I gave it a try. The fix works in my local docker test environment. Below is how we run the docker integration tests.

build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0  -Phive-thriftserver -Phive -DskipTests  install

Before running the docker tests, you need to set the docker env variables. The following does the magic:
eval $(docker-machine env default)

build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11  compile test

Feel free to ping me if you hit any issues.

Member

Hi, @JannikArndt
Could you add a comment describing -101 here?
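For example, the requested comment could be as simple as something along these lines (wording is just a suggestion):

```scala
case -101 => TimestampType // Oracle-specific code for TIMESTAMP WITH TIME ZONE
```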

Contributor Author

Write fix: 5 min ✅
Write test: 5 min ✅
Run test: 3 h. ✅


Why was this -101 thing put here instead of in the Oracle dialect?
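A sketch of the alternative being suggested here, roughly what moving the mapping into an Oracle-specific dialect could look like (illustrative only, not the actual Spark OracleDialect, which lives in org.apache.spark.sql.jdbc):

```scala
import org.apache.spark.sql.jdbc.JdbcDialect
import org.apache.spark.sql.types.{DataType, MetadataBuilder, TimestampType}

// Illustrative sketch only, not Spark's built-in OracleDialect.
object OracleDialectSketch extends JdbcDialect {
  // Oracle's vendor-specific JDBC code for TIMESTAMP WITH TIME ZONE.
  private val TIMESTAMP_TZ = -101

  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:oracle")

  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
    if (sqlType == TIMESTAMP_TZ) Some(TimestampType) else None
  }
}
```

A custom dialect along these lines could be registered with JdbcDialects.registerDialect; in Spark itself the built-in OracleDialect plays this role, which is roughly the direction the follow-up change quoted further down eventually took.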

@felixcheung
Member

Can you please rebase your PR? It's including a lot of changes from other people.

@JannikArndt
Contributor Author

@felixcheung Done

@gatorsmile
Member

ok to test

@dongjoon-hyun
Member

+1. LGTM. I also tested with the docker integration tests, with and without this patch.

@gatorsmile
Member

Yes. I also did it. Thanks! @dongjoon-hyun @JannikArndt

LGTM

@SparkQA

SparkQA commented May 5, 2017

Test build #76498 has finished for PR 17832 at commit 113fd53.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

Thanks! Merging to master.

@gatorsmile
Member

@JannikArndt Could you check whether we can do the same thing for TIME WITH TIME ZONE?

@asfgit asfgit closed this in b31648c May 5, 2017
@gatorsmile
Member

NVM, I will do it in my PR. Thanks!

ghost pushed a commit to dbtsai/spark that referenced this pull request Dec 12, 2017
…ialect

## What changes were proposed in this pull request?
In the previous PRs, apache#17832 and apache#17835, we converted `TIMESTAMP WITH TIME ZONE` and `TIME WITH TIME ZONE` to `TIMESTAMP` for all the JDBC sources. However, this conversion could be risky since it does not respect our SQL configuration `spark.sql.session.timeZone`.

In addition, each vendor might have different semantics for these two types. For example, Postgres simply returns `TIMESTAMP` for `TIMESTAMP WITH TIME ZONE`. Such support should be added case by case. This PR reverts the general support of `TIMESTAMP WITH TIME ZONE` and `TIME WITH TIME ZONE` for JDBC sources, except for the Oracle dialect.

When supporting Oracle's `TIMESTAMP WITH TIME ZONE`, we only support it when the JVM default timezone is the same as the user-specified configuration `spark.sql.session.timeZone` (whose default is the JVM default timezone). Currently, we still treat `TIMESTAMP WITH TIME ZONE` as `TIMESTAMP` when fetching the values via the Oracle JDBC connector, whose client converts timestamp values with time zone to timestamp values using the local JVM default timezone (a test case is added to `OracleIntegrationSuite.scala` in this PR to show the behavior). Thus, to avoid any future behavior change, we do not support it if the JVM default timezone is different from `spark.sql.session.timeZone`.

No regression, because the previous two PRs were only merged to the unreleased master branch.

## How was this patch tested?
Added the test cases

Author: gatorsmile <gatorsmile@gmail.com>

Closes apache#19939 from gatorsmile/timezoneUpdate.
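As an aside, a minimal sketch (not the exact code from apache#19939) of the timezone-consistency condition the commit message above describes: the Oracle `TIMESTAMP WITH TIME ZONE` mapping is only kept when `spark.sql.session.timeZone` matches the JVM default timezone.

```scala
import java.util.TimeZone

import org.apache.spark.sql.SparkSession

object SessionTimeZoneCheck {
  // True when spark.sql.session.timeZone matches the JVM default timezone, the
  // precondition described in the commit message for keeping the Oracle mapping.
  def sessionTzMatchesJvm(spark: SparkSession): Boolean = {
    val sessionTz = spark.conf.get("spark.sql.session.timeZone", TimeZone.getDefault.getID)
    TimeZone.getTimeZone(sessionTz).getID == TimeZone.getDefault.getID
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("tz-check").getOrCreate()
    println(s"session timezone consistent with JVM default: ${sessionTzMatchesJvm(spark)}")
    spark.stop()
  }
}
```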