[SPARK-30994][BUILD][FOLLOW-UP] Change scope of xml-apis to include it and add xerces in SBT as dependency override

### What changes were proposed in this pull request?

This PR proposes to:

1. Explicitly include xml-apis. xml-apis is already a declared dependency of xerces 2.12.0 (https://repo1.maven.org/maven2/xerces/xercesImpl/2.12.0/xercesImpl-2.12.0.pom). However, we're excluding it by setting its `scope` to `test`. This seems to cause `spark-shell`, when built with Maven, to fail.

    It seems that xml-apis was previously not reached for some reason, but after the upgrade it appears to be required. Therefore, this PR proposes to include it.

2. Pin the `xerces` version in SBT as well. This dependency seems to be resolved differently in SBT than in Maven.

Note that Hadoop 3 does not seem to require this, as it replaced xerces as of [HDFS-12221](https://issues.apache.org/jira/browse/HDFS-12221).
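
As a rough way to confirm item 1 on a given classpath, the sketch below probes whether a class can be loaded and reports where it comes from. `ClasspathProbe`, `isLoadable`, and `origin` are hypothetical names introduced for illustration and are not part of Spark. Note that newer JDKs may ship `org.w3c.dom.ElementTraversal` themselves, so the result depends on the Java version as well as on the jars present.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Spark: probes the current classpath.
final class ClasspathProbe {

    // Returns true if the named class can be loaded without initializing it.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the jar or directory a class was loaded from,
    // "JDK runtime" when no code source is reported, or "missing".
    static String origin(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            return source == null ? "JDK runtime" : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // The class named in the NoClassDefFoundError, and a xerces class that needs it.
        String[] names = {
            "org.w3c.dom.ElementTraversal",
            "org.apache.xerces.parsers.DOMParser"
        };
        for (String name : names) {
            System.out.println(name + " -> " + origin(name));
        }
    }
}
```

Running it with the jars of a Maven-built distribution on the classpath should show whether `org.w3c.dom.ElementTraversal` resolves to the xml-apis jar or is missing.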

### Why are the changes needed?

To make `spark-shell` work when built with Maven, and to use the same xerces version in the Maven and SBT builds.

### Does this PR introduce any user-facing change?

No, it's master only.

### How was this patch tested?

**1.**

```bash
./build/mvn -DskipTests -Psparkr -Phive clean package
./bin/spark-shell
```

Before:

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/w3c/dom/ElementTraversal
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.xerces.parsers.AbstractDOMParser.startDocument(Unknown Source)
	at org.apache.xerces.xinclude.XIncludeHandler.startDocument(Unknown Source)
	at org.apache.xerces.impl.dtd.XMLDTDValidator.startDocument(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentScannerImpl.startEntity(Unknown Source)
	at org.apache.xerces.impl.XMLVersionDetector.startDocumentParsing(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
	at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2482)
	at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2470)
	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2541)
	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2494)
	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2407)
	at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
	at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
	at org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
	at org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
	at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.w3c.dom.ElementTraversal
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 42 more
```

After:

```
...
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.0-SNAPSHOT
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_202)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
```
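
The failing frames in the trace above are a plain JAXP `DocumentBuilder.parse` call made from Hadoop's `Configuration`. The following is a minimal sketch, under the assumption that the same jars are on the classpath, which exercises that path without starting Spark. `XmlParseSmokeTest`, its methods, and the sample document are hypothetical and only illustrate the idea.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical smoke test, not part of Spark or Hadoop.
final class XmlParseSmokeTest {

    // Made-up document in the shape of a Hadoop configuration file.
    static final String SAMPLE =
        "<configuration><property><name>k</name><value>v</value></property></configuration>";

    // Name of the JAXP factory implementation chosen on this classpath.
    static String factoryImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Parses the XML and returns the root element name.
    // A missing org.w3c.dom.ElementTraversal would surface here as a
    // NoClassDefFoundError, which is an Error and is not caught below.
    static String parseRootName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("factory: " + factoryImplementation());
        System.out.println("root:    " + parseRootName(SAMPLE));
    }
}
```

If xerces is on the classpath, the reported factory is expected to be a xerces class; otherwise the JDK's built-in parser is used and the check says nothing about xml-apis.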

**2.**

```bash
./build/sbt dependencyTree -Phadoop-2.7 -Phive-2.3 -Phive-thriftserver -Phive
./build/sbt dependencyTree -Phadoop-3.2 -Phive-2.3 -Phive-thriftserver -Phive
```

Closes #27808 from HyukjinKwon/SPARK-30994.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
HyukjinKwon committed Mar 6, 2020
1 parent fe126a6 commit 5b3277f
Showing 4 changed files with 3 additions and 1 deletion.
1 change: 1 addition & 0 deletions dev/deps/spark-deps-hadoop-2.7-hive-1.2
@@ -202,6 +202,7 @@ threeten-extra/1.5.0//threeten-extra-1.5.0.jar
 univocity-parsers/2.8.3//univocity-parsers-2.8.3.jar
 xbean-asm7-shaded/4.15//xbean-asm7-shaded-4.15.jar
 xercesImpl/2.12.0//xercesImpl-2.12.0.jar
+xml-apis/1.4.01//xml-apis-1.4.01.jar
 xmlenc/0.52//xmlenc-0.52.jar
 xz/1.5//xz-1.5.jar
 zjsonpatch/0.3.0//zjsonpatch-0.3.0.jar
1 change: 1 addition & 0 deletions dev/deps/spark-deps-hadoop-2.7-hive-2.3
@@ -216,6 +216,7 @@ univocity-parsers/2.8.3//univocity-parsers-2.8.3.jar
 velocity/1.5//velocity-1.5.jar
 xbean-asm7-shaded/4.15//xbean-asm7-shaded-4.15.jar
 xercesImpl/2.12.0//xercesImpl-2.12.0.jar
+xml-apis/1.4.01//xml-apis-1.4.01.jar
 xmlenc/0.52//xmlenc-0.52.jar
 xz/1.5//xz-1.5.jar
 zjsonpatch/0.3.0//zjsonpatch-0.3.0.jar
1 change: 0 additions & 1 deletion pom.xml
@@ -612,7 +612,6 @@
 <groupId>xml-apis</groupId>
 <artifactId>xml-apis</artifactId>
 <version>1.4.01</version>
-<scope>test</scope>
 </dependency>
 <dependency>
 <groupId>org.slf4j</groupId>
1 change: 1 addition & 0 deletions project/SparkBuild.scala
@@ -621,6 +621,7 @@ object KubernetesIntegrationTests {
 object DependencyOverrides {
 lazy val settings = Seq(
 dependencyOverrides += "com.google.guava" % "guava" % "14.0.1",
+dependencyOverrides += "xerces" % "xercesImpl" % "2.12.0",
 dependencyOverrides += "jline" % "jline" % "2.14.6")
 }
