[SPARK-12807] [YARN] Spark External Shuffle not working in Hadoop clusters with Jackson 2.2.3 #10780

Closed

49 changes: 48 additions & 1 deletion network/yarn/pom.xml
@@ -35,6 +35,8 @@
<sbt.project.name>network-yarn</sbt.project.name>
<!-- Make sure all Hadoop dependencies are provided to avoid repackaging. -->
<hadoop.deps.scope>provided</hadoop.deps.scope>
<shuffle.jar>${project.build.directory}/scala-${scala.binary.version}/spark-${project.version}-yarn-shuffle.jar</shuffle.jar>
<shade>org/spark-project/</shade>
</properties>

<dependencies>
@@ -70,7 +72,7 @@
<artifactId>maven-shade-plugin</artifactId>
<configuration>
<shadedArtifactAttached>false</shadedArtifactAttached>
<outputFile>${project.build.directory}/scala-${scala.binary.version}/spark-${project.version}-yarn-shuffle.jar</outputFile>
<outputFile>${shuffle.jar}</outputFile>
<artifactSet>
<includes>
<include>*:*</include>
@@ -86,6 +88,15 @@
</excludes>
</filter>
</filters>
<relocations>
<relocation>
<pattern>com.fasterxml.jackson</pattern>
<shadedPattern>org.spark-project.com.fasterxml.jackson</shadedPattern>
Contributor

Why not use the shade property you created here too?

Contributor Author

One is `org/spark-project`, the other `org.spark-project`; you'd have to do a search-and-replace rename to derive one from the other automatically. With hindsight, it's a common enough use case that we should have added it to Ant 15 years ago, but it's too late now.
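
For illustration only, and not part of this change: the search-and-replace the author describes can be approximated inside the antrun target with Ant's `loadresource` task and a `replacestring` filter. The property names `shade.package` and `shade.path` below are hypothetical; the sketch assumes a Maven property `shade.package` set to `org.spark-project` is visible to Ant, and it has not been tested against this build.

```xml
<!-- Hypothetical sketch: derive the path form from the dotted package form. -->
<!-- Assumes a property named shade.package holding "org.spark-project".     -->
<loadresource property="shade.path">
  <propertyresource name="shade.package"/>
  <filterchain>
    <tokenfilter>
      <!-- Treat the whole property value as a single token. -->
      <filetokenizer/>
      <!-- Turn the package separators into path separators. -->
      <replacestring from="." to="/"/>
    </tokenfilter>
  </filterchain>
</loadresource>
<!-- shade.path should now hold "org/spark-project" (no trailing slash). -->
```

The derived value lacks the trailing slash that the `shade` property in this diff carries, so a probe using it would need to write `${shade.path}/@{resource}`. Whether the extra indirection is worth it for two short strings is a judgment call.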

<includes>
Contributor

super nit: I wonder if this is needed; we have it for Jetty but not for Guava, and both end up being relocated the same way.

Contributor Author

Maven shading is a dark mystery to me. I don't understand what it does; I don't trust it. What you see there is cargo-cult cut and paste, backed by a bit of Ant to verify that the final outcome of the cut and paste matches what I want.

<include>com.fasterxml.jackson.**</include>
</includes>
</relocation>
</relocations>
</configuration>
<executions>
<execution>
@@ -96,6 +107,42 @@
</execution>
</executions>
</plugin>

<!-- Probes to validate that those dependencies which must be shaded are actually shaded. -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<executions>
<execution>
<phase>verify</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<target>
<macrodef name="shaded">
<attribute name="resource"/>
<sequential>
<fail message="Not found ${shade}@{resource}">
<condition>
<not>
<resourceexists>
<zipentry zipfile="${shuffle.jar}" name="${shade}@{resource}"/>
</resourceexists>
</not>
</condition>
</fail>
</sequential>
</macrodef>
<echo>Verifying dependency shading</echo>
<shaded resource="com/fasterxml/jackson/core/JsonParser.class" />
<shaded resource="com/fasterxml/jackson/annotation/JacksonAnnotation.class" />
<shaded resource="com/fasterxml/jackson/databind/JsonSerializer.class" />
</target>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
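
The probes above confirm that the relocated classes exist under `${shade}`. A complementary check, sketched here as an assumption and not taken from this change, would fail the build if the original, unrelocated class is still present in the shuffle JAR. The macro name `unshaded-absent` is hypothetical; it reuses the `shuffle.jar` property from this diff and would sit inside the same `<target>`.

```xml
<!-- Hypothetical sketch: fail if a class still exists at its original path. -->
<macrodef name="unshaded-absent">
  <attribute name="resource"/>
  <sequential>
    <fail message="Unshaded copy still present: @{resource}">
      <condition>
        <!-- True when the entry exists at the unrelocated location. -->
        <resourceexists>
          <zipentry zipfile="${shuffle.jar}" name="@{resource}"/>
        </resourceexists>
      </condition>
    </fail>
  </sequential>
</macrodef>
<unshaded-absent resource="com/fasterxml/jackson/core/JsonParser.class"/>
```

This is the same `fail`/`resourceexists`/`zipentry` combination as the `shaded` macro, with the `not` removed and the `${shade}` prefix dropped, so it checks the opposite condition at the original path.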