
[SPARK-12966][SQL] Support ArrayType(DecimalType) in Postgres JDBC #10898

Closed · wants to merge 3 commits

Conversation

@maropu (Member) commented Jan 25, 2016

The current master throws the exception below:

org.postgresql.util.PSQLException: Unable to find server array type for provided name decimal(38,18)

This PR fixes the issue.
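
For context, one way to hit this (a sketch, not from the PR: the JDBC URL and table name are hypothetical, and a pre-2.0 SQLContext named sqlContext is assumed): writing a DataFrame with an ArrayType(DecimalType) column asks the Postgres driver to create an array whose element type name still carries precision and scale.

import java.util.Properties

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = StructType(Seq(StructField("vals", ArrayType(DecimalType(38, 18)))))
val rdd = sqlContext.sparkContext.parallelize(Seq(Row(Seq(BigDecimal("1.5")))))
val df = sqlContext.createDataFrame(rdd, schema)

// The JDBC write path calls Connection#createArrayOf("decimal(38,18)", ...):
df.write.jdbc("jdbc:postgresql://localhost:5432/testdb", "test_table", new Properties())
// => org.postgresql.util.PSQLException: Unable to find server array type
//    for provided name decimal(38,18)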

@SparkQA commented Jan 25, 2016

Test build #49989 has finished for PR 10898 at commit 304d3d0.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu (Member, Author) commented Jan 25, 2016

retest this please

@blbradley (Contributor):

You should not be converting to doubles when testing BigDecimal or DecimalType.
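
A quick standalone illustration of the concern (not part of the patch): constructing a BigDecimal from a double captures the exact binary value of the double, not the decimal literal you wrote.

println(new java.math.BigDecimal(0.11))   // long binary-fraction expansion, not 0.11
println(new java.math.BigDecimal("0.11")) // exactly 0.11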

@blbradley (Contributor):

Also, we should be handling the precision and scale returned from Postgres. I've looked into this deeply enough to see that it is possible.

@maropu (Member, Author) commented Jan 25, 2016

It seems to me the precision and scale returned by Postgres are filled in via ResultSetMetaData.getPrecision and ResultSetMetaData.getScale in JDBCRDD.
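
As an aside, a minimal plain-JDBC sketch of where those values come from (conn, the table, and the column are hypothetical):

val stmt = conn.createStatement()
val rs = stmt.executeQuery("SELECT c13 FROM test_table") // hypothetical table
val md = rs.getMetaData
val precision = md.getPrecision(1) // e.g. 38 for a decimal(38,18) column
val scale = md.getScale(1)         // e.g. 18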

Review thread on the new test:

val c13Expected = Seq(1.5, 3.25).map(new java.math.BigDecimal(_))
assert(rows(0).getSeq[java.math.BigDecimal](13).zipWithIndex.forall { case (v, idx) =>
  v.compareTo(c13Expected(idx)) == 0
})
Contributor commented:
Why not follow the style of the test?

Contributor commented:

Something like:

assert(rows(0).getSeq(13) == Seq[BigDecimal](0.11, 0.22))

@maropu (Member, Author) commented:

That test fails because BigDecimal#equals and BigDecimal#compareTo behave differently: equals also compares scale, while compareTo compares only the numeric value.
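
For reference, a standalone illustration of the difference:

val a = new java.math.BigDecimal("1.5")
val b = new java.math.BigDecimal("1.50")
assert(!a.equals(b))        // same value, different scale => not equal
assert(a.compareTo(b) == 0) // numerically equal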

@SparkQA commented Jan 25, 2016

Test build #49993 has finished for PR 10898 at commit 304d3d0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@blbradley (Contributor):

@maropu Indeed, but they are not available in the metadata passed to dialect.getCatalystType. They probably need to be added to the metadata, and logic added to PostgresDialect to handle that.
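
A sketch of that direction (assumed names, not the code that eventually merged): record the scale in the column metadata during schema resolution, then read it back in PostgresDialect#getCatalystType.

import org.apache.spark.sql.types._

// In the schema-resolution path (sketch; rsmd is the ResultSetMetaData,
// metadata the MetadataBuilder for column i):
metadata.putLong("scale", rsmd.getScale(i + 1))

// In PostgresDialect (sketch; "_numeric" is the driver's name for an array
// of numeric):
override def getCatalystType(
    sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
  if (sqlType == java.sql.Types.ARRAY && typeName == "_numeric") {
    val scale = md.build().getLong("scale").toInt
    Some(ArrayType(DecimalType(size, scale)))
  } else {
    None
  }
}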

@maropu (Member, Author) commented Jan 25, 2016

Jenkins, retest this please.

@SparkQA commented Jan 25, 2016

Test build #49995 has finished for PR 10898 at commit 3626132.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jan 25, 2016

Test build #49998 has finished for PR 10898 at commit 3626132.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu (Member, Author) commented Jan 28, 2016

I found that it is not easy to support this type for PostgreSQL through the current JdbcDialect interface. This is because the postgresql-jdbc implementation cannot handle a decimal type name that carries precision and scale, such as DECIMAL(xx, xx), in Connection#createArrayOf.
To fix this, we could add a specific entry in PostgresDialect#getJDBCType (https://github.com/apache/spark/pull/10898/files#diff-23da60722bc6bfc160cbb59bd99f5925R64).
However, that fix makes things worse: JdbcUtils#schemaString can then no longer pass precision and scale for Decimal when a relation is defined in PostgreSQL through DataFrameWriter.

I'm not sure how to fix this; for the time being it seems okay to just add TODO comments and throw an unsupported-type exception.
cc: @marmbrus
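
To make the createArrayOf limitation concrete, a minimal plain-JDBC sketch (hypothetical connection settings):

import java.sql.DriverManager

val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost:5432/testdb", "user", "pass") // hypothetical
val elems: Array[AnyRef] = Array(new java.math.BigDecimal("1.5"))

conn.createArrayOf("numeric", elems)        // fine: plain server type name
conn.createArrayOf("decimal(38,18)", elems) // throws PSQLException: Unable to find
                                            // server array type for provided name
                                            // decimal(38,18)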

@SparkQA commented Jan 28, 2016

Test build #50251 has finished for PR 10898 at commit 52eaebe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@blbradley (Contributor):

@maropu I have submitted an implementation of this in #10928 and am not getting the error you describe.

@blbradley (Contributor):

@maropu I see a corner case in schemaString. I believe I can get the right behavior now.

@maropu (Member, Author) commented Jan 29, 2016

I'll close this PR and discuss this in #10928.

@maropu deleted the SPARK-12747 branch on July 5, 2017.