Conversation

@gatorsmile (Member) commented Jan 23, 2017

What changes were proposed in this pull request?

This PR reverts some of the changes made in #15292.

@darabos reported that Spark 2.1.0 throws a "No suitable driver" exception the first time a JDBC data source is read, but simply re-executing the same command a second time "fixes" the error. This only happens when Hive support is enabled.

Based on my understanding, the problem is that the java.sql.DriverManager class cannot access drivers loaded by Spark's ClassLoader. The changes made in this PR may not be a real solution for the reported issue; the root cause could be other code changes in 2.1 that change the current ClassLoader.
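
To make the suspicion concrete, below is a minimal Scala sketch (illustration only, not code from this PR or from #15292) of the DriverManager behavior in question: getDriver and getConnection only return drivers whose class is visible from the ClassLoader of the calling class, so a driver registered from a class loaded by Spark's URL ClassLoader can be filtered out even though it sits in DriverManager's global registry.

// Sketch of the suspected mechanism; the helper names are made up.
import java.sql.DriverManager
import scala.collection.JavaConverters._

// Drivers DriverManager will hand to this caller (it filters by the
// caller's ClassLoader, so the list can differ between call sites).
def listDrivers(): Seq[String] =
  DriverManager.getDrivers.asScala.map(_.getClass.getName).toSeq

// True only if getDriver can resolve the URL from this caller's ClassLoader.
def canResolve(url: String): Boolean =
  try { DriverManager.getDriver(url); true }
  catch { case _: java.sql.SQLException => false }

One plausible reading of the "second run works" symptom is that the first failing attempt loads and registers the driver class as a side effect, so the retry finds it.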

@darabos Could you please help us try it in your local environment? Thanks!

Below is the error @darabos got in his environment:

$ ~/spark-2.1.0/bin/spark-shell --jars org.xerial.sqlite-jdbc-3.8.11.2.jar --driver-class-path org.xerial.sqlite-jdbc-3.8.11.2.jar
[...]
scala> spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("dbtable", "x").load
java.sql.SQLException: No suitable driver
  at java.sql.DriverManager.getDriver(DriverManager.java:315)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
  ... 48 elided

scala> spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("dbtable", "x").load
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (no such table: x)
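
To narrow this down, one diagnostic (illustrative only, not part of this patch) is to list the drivers DriverManager will hand to the shell before and after the failing call:

scala> import scala.collection.JavaConverters._
scala> java.sql.DriverManager.getDrivers.asScala.foreach(d => println(d.getClass.getName))

If org.sqlite.JDBC only shows up after the first failed load, that would support the ClassLoader theory.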

How was this patch tested?

@darabos Could you run a manual test and see whether these changes resolve your issue?

@srowen (Member) left a comment

Change seems reasonable in any event

DriverManager.getDriver(url).getClass.getCanonicalName
}
}
val driverClass = parameters.get(JDBC_DRIVER_CLASS)

Might be too late, but should these be private?
Does it solve the issue to make this a def instead?

@gatorsmile (Member, Author) replied:

spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("dbtable", "x").load

The call above does not pass JDBC_DRIVER_CLASS, so making it a def does not help.

Actually, if @darabos passes JDBC_DRIVER_CLASS as a JDBC option, the problem is solved. Thus, I suspect the issue is caused by the ClassLoader.
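
For reference, the workaround looks like this (JDBC_DRIVER_CLASS is the "driver" option; org.sqlite.JDBC is the driver class shipped in the sqlite-jdbc jar):

scala> spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("driver", "org.sqlite.JDBC").option("dbtable", "x").load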

@SparkQA commented Jan 23, 2017

Test build #71836 has finished for PR 16678 at commit 9f5b11b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@darabos (Contributor) commented Jan 23, 2017

Thanks for the quick pull request!

@darabos Could you run a manual test and see whether these changes resolve your issue?

Unfortunately it does not. I built the code with this command:

JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ \
  ./dev/make-distribution.sh \
    -Phadoop-2.7 -Phive -Phive-thriftserver 

I ran the commands from the pull request description in a fresh spark-shell and got the same errors. The stack trace reflects your changes, though:

scala> spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("dbtable", "x").load
java.sql.SQLException: No suitable driver
  at java.sql.DriverManager.getDriver(DriverManager.java:315)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:55)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:55)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:54)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:58)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:47)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:320)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:158)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:126)
  ... 48 elided

scala> spark.read.format("jdbc").option("url", "jdbc:sqlite:").option("dbtable", "x").load
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (no such table: x)

Have you been unable to reproduce this on your machine? Do you think something's wrong with my environment?

@darabos (Contributor) commented Jan 23, 2017

(I used make-distribution.sh because when I built with build/mvn -DskipTests clean package I could not reproduce the issue. I think -Phive is probably the culprit, but I have not experimented enough to know for sure.)

@gatorsmile (Member, Author) commented:

@darabos Thank you! It confirms my guess. Let me think about it. Thanks!

That fits: Hive support is only enabled when you build with -Phive, which is why you could not reproduce the issue without it.

@gatorsmile gatorsmile closed this May 23, 2017