
[SPARK-10855] [SQL] Add a JDBC dialect for Apache Derby #8982

Closed
wants to merge 6 commits

Conversation

rick-ibm
Contributor

@rick-ibm rick-ibm commented Oct 5, 2015

@marmbrus
@rxin

This patch adds a JdbcDialect class, which customizes the datatype mappings for Derby backends. The patch also adds unit tests for the new dialect, corresponding to the existing tests for other JDBC dialects.

JDBCSuite runs cleanly for me with this patch. So does JDBCWriteSuite, although it produces noise as described here: https://issues.apache.org/jira/browse/SPARK-10890

This patch is my original work, which I license to the ASF. I am a Derby contributor, so my ICLA is on file under SVN id "rhillegas": http://people.apache.org/committer-index.html

Touches the following files:


org.apache.spark.sql.jdbc.JdbcDialects

Adds a DerbyDialect.


org.apache.spark.sql.jdbc.JDBCSuite

Adds unit tests for the new DerbyDialect.
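For readers unfamiliar with Spark's dialect hooks, the behavior the patch adds can be sketched in plain Scala. This is a simplified, Spark-free model for illustration only: `DerbySketch` and the string type names stand in for the real `JdbcDialect` API, which returns Catalyst `DataType` values.

```scala
import java.sql.Types

// Hypothetical sketch, not the real Spark API: models the dialect's
// two hooks -- URL recognition and the read-side type mapping.
object DerbySketch {
  // A dialect claims a connection by inspecting its JDBC URL prefix.
  def canHandle(url: String): Boolean = url.startsWith("jdbc:derby")

  // Derby reports REAL columns; the dialect maps them to Spark's
  // FloatType (here represented by a string) and defers everything
  // else to the default mapping by returning None.
  def getCatalystType(sqlType: Int): Option[String] =
    if (sqlType == Types.REAL) Some("FloatType") else None
}
```

Returning `None` rather than a fallback type is what lets the default JDBC mappings apply to every other column type.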

@JoshRosen
Contributor

Jenkins, this is ok to test

@SparkQA

SparkQA commented Oct 5, 2015

Test build #43250 has finished for PR 8982 at commit 11fdb04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rick-ibm
Contributor Author

rick-ibm commented Oct 6, 2015

Tests seem to have passed cleanly. Any comments? Suggestions for improvements? Thanks.

case object DerbyDialect extends JdbcDialect {
override def canHandle(url: String): Boolean = url.startsWith("jdbc:derby")
override def getCatalystType(
sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
Contributor

indent 4 spaces here

@SparkQA

SparkQA commented Oct 9, 2015

Test build #43436 has finished for PR 8982 at commit f83d86c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

override def canHandle(url: String): Boolean = url.startsWith("jdbc:derby")
override def getCatalystType(
sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
if (sqlType == Types.REAL) Some(FloatType) else None
Contributor Author

I have changed the original code so that it matches the pattern followed by the other getCatalystType() overloads: The code returns a Some(FloatType) rather than an Option(FloatType). I am new to programming in Scala, but the former expression is the pattern which I see all over the code. However, Reynold recommended the latter expression. If Option(FloatType) is the correct value to return, then we should probably change all of the other getCatalystType() overloads to return Option(...) rather than Some(...). I would appreciate your guidance on this point. Thanks.

Contributor Author

Thanks for the review comments, Reynold and Michael. Hopefully, my last commit (e6022ed) addresses your concerns:

o I adjusted the indentation of a method signature, per Reynold's advice.

o I simplified an "if" expression, also per Reynold's advice. However, I have a question about this change (see my comment above on the line in question).

Thanks,
-Rick

Contributor

In this case (where you are clearly not passing null) they are equivalent.
Option will return None in the case of null, which is why I prefer using
that method of construction.
On Oct 9, 2015 9:28 AM, "Rick Hillegas" notifications@github.com wrote:

In sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
#8982 (comment):

@@ -287,3 +288,30 @@ case object MsSqlServerDialect extends JdbcDialect {
     case _ => None
   }
 }
+
+/**
+ * :: DeveloperApi ::
+ * Default Apache Derby dialect, mapping real on read
+ * and string/byte/short/boolean/decimal on write.
+ */
+@DeveloperApi
+case object DerbyDialect extends JdbcDialect {
+  override def canHandle(url: String): Boolean = url.startsWith("jdbc:derby")
+  override def getCatalystType(
+      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
+
+    if (sqlType == Types.REAL) Some(FloatType) else None

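Michael's distinction can be demonstrated in a standalone Scala snippet (the `fromJava` variable is illustrative): `Some(x)` always wraps its argument, even when it is `null`, while the `Option(x)` factory normalizes `null` to `None`.

```scala
// A null reference such as a Java API might hand back.
val fromJava: String = null

val a = Some(fromJava)   // Some(null): a populated Option wrapping null
val b = Option(fromJava) // None: the null is normalized away

assert(a.isDefined && a.get == null)
assert(b.isEmpty)

// For a non-null value, the two constructions are equivalent.
assert(Option("FloatType") == Some("FloatType"))
```

This is why `Option(...)` is the safer default when the wrapped value could conceivably be `null`, and why the two are interchangeable in the dialect code, where the value is a literal.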

Contributor Author

Thanks for the quick response, Michael. Should I adjust the other getCatalystType() overloads to follow the Option(...) pattern? I could do that as part of this pull-request or I could open another JIRA if you think that is worth doing. Thanks.

@SparkQA

SparkQA commented Oct 9, 2015

Test build #43471 has finished for PR 8982 at commit 989ce9b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rick-ibm
Contributor Author

rick-ibm commented Oct 9, 2015

My latest commit (d56897e) adjusts the Option usage per Reynold and Michael's advice. I believe that I have addressed all issues with this pull request. We can file a new JIRA if we want to harmonize the usage of Option in JdbcDialects.scala. Thanks.

@SparkQA

SparkQA commented Oct 9, 2015

Test build #43478 has finished for PR 8982 at commit e6022ed.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 9, 2015

Test build #43481 has finished for PR 8982 at commit d56897e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Oct 9, 2015

Thanks - I've merged this.

@asfgit asfgit closed this in 12b7191 Oct 9, 2015