[KYUUBI #3487] Provide Hive JDBC Dialect support for Spark/PySpark to connect Kyuubi via JDBC Source

…and register to JdbcDialects

### _Why are the changes needed?_

close #3487.

1. add kyuubi-extension-spark-client_2.12 module, and introduce KyuubiSparkClientExtension
2. implement HiveDialect and register to JdbcDialects

### _How was this patch tested?_

- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #3489 from bowenliang123/3487-hive-jdbc-dialect.

Closes #3487

3ed8be7 [Bowen Liang] nit
47be0ba [Bowen Liang] update docs for hive jdbc dialect
84623a3 [Bowen Liang] update pom in minor details
b7edc6c [Bowen Liang] add ut
968bb72 [Bowen Liang] move to package org.apache.spark.sql.dialect
03eab32 [Bowen Liang] renamed to kyuubi-extension-spark-jdbc-dialect module and moved to extensions/spark
9a4eaf4 [Bowen Liang] add kyuubi-extension-spark-client_2.12 module, implement HiveDialect and register to JdbcDialects

Authored-by: Bowen Liang <liangbowen@gf.com.cn>
Signed-off-by: Cheng Pan <chengpan@apache.org>
1 parent c3c7707, commit 1a9bf93
Showing 9 changed files with 284 additions and 0 deletions.
@@ -25,3 +25,4 @@ Extensions for Spark
functions
../../../connector/spark/index
lineage
jdbc-dialect
@@ -0,0 +1,42 @@
<!--
 - Licensed to the Apache Software Foundation (ASF) under one or more
 - contributor license agreements. See the NOTICE file distributed with
 - this work for additional information regarding copyright ownership.
 - The ASF licenses this file to You under the Apache License, Version 2.0
 - (the "License"); you may not use this file except in compliance with
 - the License. You may obtain a copy of the License at
 -
 - http://www.apache.org/licenses/LICENSE-2.0
 -
 - Unless required by applicable law or agreed to in writing, software
 - distributed under the License is distributed on an "AS IS" BASIS,
 - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 - See the License for the specific language governing permissions and
 - limitations under the License.
 -->


# Hive Dialect Support

The Hive Dialect plugin provides Hive dialect support for Spark's JDBC source.
Once the extension is enabled, it is registered with Spark automatically and applied to JDBC sources whose URL starts with `jdbc:hive2://` or `jdbc:kyuubi://`.

The Hive dialect fixes failures and unexpected results when Spark queries Kyuubi as a JDBC source through the Hive JDBC Driver or the Kyuubi Hive JDBC Driver. Spark JDBC provides no Hive dialect support out of the box, so it quotes columns and other identifiers in ANSI style as "table.column" rather than in HiveSQL style as \`table\`.\`column\`.


## Features

- Quote identifiers in Hive SQL style

  e.g. quote `table.column` as \`table\`.\`column\`
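
The difference between the two quoting styles can be sketched in plain Scala without any Spark dependency. This is only an illustration: `HiveQuotingSketch`, `quoteAnsi`, and `quoteHive` are hypothetical names introduced here, and `quoteHive` simply mirrors the split-and-backtick logic of `KyuubiHiveDialect.quoteIdentifier` from this commit.

```scala
// Hypothetical, standalone sketch; not part of the Kyuubi code base.
object HiveQuotingSketch {

  // ANSI-style quoting as described above: the whole name is wrapped
  // in one pair of double quotes, dots included.
  def quoteAnsi(name: String): String = "\"" + name + "\""

  // Hive-style quoting: every dot-separated part gets its own backticks,
  // mirroring KyuubiHiveDialect.quoteIdentifier in this commit.
  def quoteHive(name: String): String =
    name.split('.').map(part => s"`$part`").mkString(".")

  def main(args: Array[String]): Unit = {
    val identifier = "table.column"
    println(quoteAnsi(identifier))
    println(quoteHive(identifier))
  }
}
```

With ANSI quoting the dot ends up inside a single quoted name, whereas the Hive form keeps the table and column parts separate, which is the behavior the feature above describes.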

## Usage

1. Get the Kyuubi Hive Dialect Extension jar
   1. Compile the extension by executing `build/mvn clean package -pl :kyuubi-extension-spark-jdbc-dialect_2.12 -DskipTests`
   2. Get the extension jar under `extensions/spark/kyuubi-extension-spark-jdbc-dialect/target`
   3. If you like, you can compile the extension jar with the corresponding Maven profile in your compile command, e.g. you can get the extension jar for Spark 3.1 by compiling with `-Pspark-3.1`
2. Put the Kyuubi Hive Dialect Extension jar `kyuubi-extension-spark-jdbc-dialect_-*.jar` into `$SPARK_HOME/jars`
3. Enable `KyuubiSparkJdbcDialectExtension` by setting `spark.sql.extensions=org.apache.spark.sql.dialect.KyuubiSparkJdbcDialectExtension`, i.e.
   - add the config to `$SPARK_HOME/conf/spark-defaults.conf`
   - or set the config in the SparkSession builder
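
A minimal sketch of the SparkSession builder variant of step 3 might look as follows. It assumes the extension jar and a Hive JDBC driver are already on the classpath and that a Kyuubi server is reachable; the host `kyuubi-host`, port `10009`, table `some_db.some_table`, driver class, and the object name `KyuubiJdbcSourceSketch` are illustrative assumptions rather than values taken from this document.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example application; adjust names and addresses to your environment.
object KyuubiJdbcSourceSketch {
  def main(args: Array[String]): Unit = {
    // Enable the extension from this commit through spark.sql.extensions.
    val spark = SparkSession.builder()
      .appName("kyuubi-jdbc-source-sketch")
      .master("local[*]")
      .config(
        "spark.sql.extensions",
        "org.apache.spark.sql.dialect.KyuubiSparkJdbcDialectExtension")
      .getOrCreate()

    // Read a table through Spark's JDBC source. The URL prefix jdbc:hive2://
    // is one of the two prefixes the Hive dialect is applied to.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:hive2://kyuubi-host:10009/default") // assumed address
      .option("driver", "org.apache.hive.jdbc.HiveDriver")     // assumed driver class
      .option("dbtable", "some_db.some_table")                 // assumed table
      .load()

    df.show()
    spark.stop()
  }
}
```

The same setting can instead be placed in `$SPARK_HOME/conf/spark-defaults.conf`, in which case the `.config(...)` call above is unnecessary.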
55 changes: 55 additions & 0 deletions
extensions/spark/kyuubi-extension-spark-jdbc-dialect/pom.xml
@@ -0,0 +1,55 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~ http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>kyuubi-parent</artifactId>
        <groupId>org.apache.kyuubi</groupId>
        <version>1.7.0-SNAPSHOT</version>
        <relativePath>../../../pom.xml</relativePath>

    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>kyuubi-extension-spark-jdbc-dialect_2.12</artifactId>
    <name>Kyuubi Spark JDBC Dialect plugin</name>
    <packaging>jar</packaging>
    <url>https://kyuubi.apache.org/</url>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.binary.version}</artifactId>
            <scope>provided</scope>
        </dependency>
    </dependencies>

    <build>
        <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
        <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>

        <testResources>
            <testResource>
                <directory>${project.basedir}/src/test/resources</directory>
            </testResource>
        </testResources>
    </build>

</project>
37 changes: 37 additions & 0 deletions
...on-spark-jdbc-dialect/src/main/scala/org/apache/spark/sql/dialect/KyuubiHiveDialect.scala
@@ -0,0 +1,37 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.dialect

import java.util.Locale

import org.apache.spark.sql.jdbc.JdbcDialect

object KyuubiHiveDialect extends JdbcDialect {

  override def canHandle(url: String): Boolean = {
    val urlLowered = url.toLowerCase(Locale.ROOT)

    urlLowered.startsWith("jdbc:hive2://") ||
    urlLowered.startsWith("jdbc:kyuubi://")
  }

  override def quoteIdentifier(colName: String): String = {
    colName.split('.').map(part => s"`$part`").mkString(".")
  }

}
28 changes: 28 additions & 0 deletions
...dialect/src/main/scala/org/apache/spark/sql/dialect/KyuubiSparkJdbcDialectExtension.scala
@@ -0,0 +1,28 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.dialect

import org.apache.spark.sql.SparkSessionExtensions
import org.apache.spark.sql.jdbc.JdbcDialects

class KyuubiSparkJdbcDialectExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    // register hive jdbc dialect
    JdbcDialects.registerDialect(KyuubiHiveDialect)
  }
}
35 changes: 35 additions & 0 deletions
...ark-jdbc-dialect/src/test/scala/org/apache/spark/sql/dialect/KyuubiHiveDialectSuite.scala
@@ -0,0 +1,35 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.dialect

// scalastyle:off
import org.scalatest.funsuite.AnyFunSuite

class KyuubiHiveDialectSuite extends AnyFunSuite {
// scalastyle:on

  test("[KYUUBI #3489] Kyuubi Hive dialect: can handle jdbc url") {
    assert(KyuubiHiveDialect.canHandle("jdbc:hive2://"))
    assert(KyuubiHiveDialect.canHandle("jdbc:kyuubi://"))
  }

  test("[KYUUBI #3489] Kyuubi Hive dialect: quoteIdentifier") {
    assertResult("`id`")(KyuubiHiveDialect.quoteIdentifier("id"))
    assertResult("`table`.`id`")(KyuubiHiveDialect.quoteIdentifier("table.id"))
  }
}
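
The suite above checks only the two bare prefixes. Because `canHandle` lowercases the URL before matching, mixed-case URLs should also match, while URLs of other databases should not. The following standalone sketch replicates the prefix check in plain Scala so it runs without Spark; `UrlPrefixSketch` and the sample URLs are hypothetical and serve only to illustrate those two cases.

```scala
import java.util.Locale

// Hypothetical, standalone replica of the URL prefix check in KyuubiHiveDialect.
object UrlPrefixSketch {

  // The two URL prefixes the dialect in this commit is applied to.
  private val prefixes = Seq("jdbc:hive2://", "jdbc:kyuubi://")

  // Lowercase with a fixed locale, then test each prefix.
  def canHandle(url: String): Boolean = {
    val lowered = url.toLowerCase(Locale.ROOT)
    prefixes.exists(prefix => lowered.startsWith(prefix))
  }

  def main(args: Array[String]): Unit = {
    val samples = Seq(
      "jdbc:hive2://host:10009/default", // matching prefix
      "JDBC:KYUUBI://host:10009/",       // matches after lowercasing
      "jdbc:mysql://host:3306/db"        // different database, no match
    )
    samples.foreach(url => println(s"$url -> ${canHandle(url)}"))
  }
}
```

Equivalent assertions could be added to `KyuubiHiveDialectSuite` against `KyuubiHiveDialect.canHandle` itself to cover the negative and case-insensitive paths.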
42 changes: 42 additions & 0 deletions
...sions/spark/kyuubi-extension-spark-jdbc-dialect/src/test/scala/resources/log4j.properties
@@ -0,0 +1,42 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Set everything to be logged to the file target/unit-tests.log
log4j.rootLogger=INFO, CA, FA

# Console Appender
log4j.appender.CA=org.apache.log4j.ConsoleAppender
log4j.appender.CA.layout=org.apache.log4j.PatternLayout
log4j.appender.CA.layout.ConversionPattern=%d{HH:mm:ss.SSS} %p %c: %m%n
log4j.appender.CA.Threshold = FATAL

# File Appender
log4j.appender.FA=org.apache.log4j.FileAppender
log4j.appender.FA.append=false
log4j.appender.FA.file=target/unit-tests.log
log4j.appender.FA.layout=org.apache.log4j.PatternLayout
log4j.appender.FA.layout.ConversionPattern=%d{HH:mm:ss.SSS} %t %p %c{2}: %m%n
log4j.appender.FA.Threshold = DEBUG

# SPARK-34128: Suppress undesirable TTransportException warnings involved in THRIFT-4805
log4j.appender.CA.filter.1=org.apache.log4j.varia.StringMatchFilter
log4j.appender.CA.filter.1.StringToMatch=Thrift error occurred during processing of message
log4j.appender.CA.filter.1.AcceptOnMatch=false

log4j.appender.FA.filter.1=org.apache.log4j.varia.StringMatchFilter
log4j.appender.FA.filter.1.StringToMatch=Thrift error occurred during processing of message
log4j.appender.FA.filter.1.AcceptOnMatch=false
43 changes: 43 additions & 0 deletions
...nsions/spark/kyuubi-extension-spark-jdbc-dialect/src/test/scala/resources/log4j2-test.xml
@@ -0,0 +1,43 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~ http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<!-- Extra logging related to initialization of Log4j.
 Set to debug or trace if log4j initialization is failing. -->
<Configuration status="WARN">
    <Appenders>
        <Console name="stdout" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} %p %c: %m%n"/>
            <Filters>
                <ThresholdFilter level="FATAL"/>
                <RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/>
            </Filters>
        </Console>
        <File name="file" fileName="target/unit-tests.log">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} %t %p %c{1}: %m%n"/>
            <Filters>
                <RegexFilter regex=".*Thrift error occurred during processing of message.*" onMatch="DENY" onMismatch="NEUTRAL"/>
            </Filters>
        </File>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="stdout"/>
            <AppenderRef ref="file"/>
        </Root>
    </Loggers>
</Configuration>