-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix shading class relocation: net.snowflake.ingest.shade #492
Conversation
- Relocate all dependency classes under net.snowflake.ingest.shade As of release 1.1.3 the shaded jar contains a large number of unrelocated classes. This is problematic for library consumers that include any of the same dependencies in their projects that snowflake-ingest-sdk uses in addition to snowflake-ingest-sdk. This can result in duplicate classes in a project and confusing errors for users depending on dependency ordering of their project. When duplicate classes are present on the classpath the first class found wins. This means that for consumers of snowflake-ingest-sdk if it's a dependency listed high in a pom.xml's dependency section, its duplicate classes take precedence, overriding any other dependency definitions in their project. Here's a short example of the problem. If I have a project using the ingest SDK with a pom.xml as follows: ``` <dependency> <groupId>net.snowflake</groupId> <artifactId>snowflake-ingest-sdk</artifactId> <version>1.1.3</version> </dependency> <dependency> <groupId>com.google.protobuf</groupId> <artifactId>protobuf-java</artifactId> <version>3.22.3</version> </dependency> ``` snowflake-ingest-sdk includes class files from com.google.protobuf:protobuf-java:2.5.0 which is pulled in transitively from hadoop-common. For any class files that are duplicates of protobuf-java 3.22.3 the class files of 2.5.0 will take precedence at runtime due to snowflake-ingest-sdk being first in the pom.xml and therefore first on the classpath. Conversely if snowflake-ingest-sdk is last then protobuf-java 3.22.3 will be active which may cause runtime issues with snowflake-ingest-sdk if it depends on APIs only present in 2.5.0. Fixes #356
@@ -6,7 +6,7 @@ | |||
<!-- Arifact name and version information --> | |||
<groupId>net.snowflake</groupId> | |||
<artifactId>snowflake-ingest-sdk</artifactId> | |||
<version>1.1.3</version> | |||
<version>1.1.4-SNAPSHOT</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think once a version has been published this should be bumped to the next version + -SNAPSHOT
@@ -57,7 +57,7 @@ | |||
<objenesis.version>3.1</objenesis.version> | |||
<parquet.version>1.12.3</parquet.version> | |||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | |||
<shadeBase>net.snowflake.ingest.internal</shadeBase> | |||
<shadeBase>net.snowflake.ingest.shade</shadeBase> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think shade
makes the use more clear vs internal could be something else, ex. https://github.com/snowflakedb/snowflake-ingest-java/tree/master/src/main/java/net/snowflake/ingest/streaming/internal
@@ -384,7 +384,6 @@ | |||
<dependency> | |||
<groupId>org.slf4j</groupId> | |||
<artifactId>slf4j-api</artifactId> | |||
<scope>provided</scope> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linkage errors on the shade jar show provided is probably incorrect
pom.xml
Outdated
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-site-plugin</artifactId> | ||
<version>4.0.0-M8</version> | ||
</plugin> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated and can be removed, but adding this enables mvn site
to work
@@ -510,7 +514,7 @@ | |||
<sortDependencyExclusions>groupId,artifactId</sortDependencyExclusions> | |||
<sortExecutions>true</sortExecutions> | |||
<sortModules>true</sortModules> | |||
<sortPlugins>groupId,artifactId</sortPlugins> | |||
<sortPlugins>groupId</sortPlugins> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When two plugins have an execution in the same phase, the plugins run in order they appear in pom.xml. This was problematic for the antrun plugin and had to be changed
@@ -724,66 +728,48 @@ | |||
</activation> | |||
<build> | |||
<plugins> | |||
<!-- Relocate all dependencies to internal to solve dependency conflict problem --> | |||
<plugin> | |||
<groupId>org.apache.maven.plugins</groupId> | |||
<artifactId>maven-shade-plugin</artifactId> | |||
<version>3.4.1</version> | |||
<configuration> | |||
<relocations> | |||
<relocation> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All relocations were adjusted to cover the TLDs
<filters> | ||
<filter> | ||
<artifact>*:*</artifact> | ||
<excludes> | ||
<exclude>META-INF/LICENSE*</exclude> | ||
<exclude>META-INF/NOTICE*</exclude> | ||
<exclude>META-INF/DEPENDENCIES</exclude> | ||
<exclude>META-INF/maven/**</exclude> | ||
<exclude>META-INF/services/com.fasterxml.*</exclude> | ||
<exclude>META-INF/*.xml</exclude> | ||
<exclude>META-INF/*.SF</exclude> | ||
<exclude>META-INF/*.DSA</exclude> | ||
<exclude>META-INF/*.RSA</exclude> | ||
</excludes> | ||
</filter> | ||
<filter> | ||
<artifact>commons-logging:commons-logging</artifact> | ||
<excludes> | ||
<exclude>org/apache/commons/logging/impl/AvalonLogger.class</exclude> | ||
</excludes> | ||
</filter> | ||
<filter> | ||
<artifact>org.slf4j:slf4j-simple</artifact> | ||
<excludes> | ||
<exclude>org/slf4j/**</exclude> | ||
</excludes> | ||
</filter> | ||
</filters> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious if any of this needs to stay but it seemed like it might be able to go.
<plugin> | ||
<!-- hacky META-INF/versions shade relocation --> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-antrun-plugin</artifactId> | ||
<version>3.1.0</version> | ||
<executions> | ||
<execution> | ||
<goals> | ||
<goal>run</goal> | ||
</goals> | ||
<phase>package</phase> | ||
<configuration> | ||
<target> | ||
<unzip dest="${project.build.directory}/tmp" src="${project.build.directory}/${project.build.finalName}.jar"/> | ||
<mkdir dir="${project.build.directory}/tmp/META-INF/versions/9/net/snowflake/ingest/shade/net/snowflake/client"/> | ||
<mkdir dir="${project.build.directory}/tmp/META-INF/versions/11/net/snowflake/ingest/shade/net/snowflake/client"/> | ||
<mkdir dir="${project.build.directory}/tmp/META-INF/versions/15/net/snowflake/ingest/shade/net/snowflake/client"/> | ||
<move file="${project.build.directory}/tmp/META-INF/versions/9/net/snowflake/client" todir="${project.build.directory}/tmp/META-INF/versions/9/net/snowflake/ingest/shade/net/snowflake/"/> | ||
<move file="${project.build.directory}/tmp/META-INF/versions/11/net/snowflake/client" todir="${project.build.directory}/tmp/META-INF/versions/9/net/snowflake/ingest/shade/net/snowflake/"/> | ||
<move file="${project.build.directory}/tmp/META-INF/versions/15/net/snowflake/client" todir="${project.build.directory}/tmp/META-INF/versions/9/net/snowflake/ingest/shade/net/snowflake/"/> | ||
<zip basedir="${project.build.directory}/tmp" destfile="${project.build.directory}/${project.build.finalName}.jar"/> | ||
<delete dir="${project.build.directory}/tmp"/> | ||
</target> | ||
</configuration> | ||
</execution> | ||
</executions> | ||
</plugin> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relocation plugin doesn't handle multi-release jars, but https://stackoverflow.com/questions/74119618/maven-shade-plugin-relocate-classes-under-meta-inf-versions-in-multi-release-jar provides a hacky workaround
scripts/check_content.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the original intent of this script was to catch the problems fixed by this PR, however I get the impression that people didn't fully understand the errors this script would report and kept adding exclusions. The modifications attempt to return the script to its apparent intent: report on any classes outside package net.snowflake.ingest after shading.
#490 addresses this. |
As of release 1.1.3 the shaded jar contains a large number of unrelocated classes. This is problematic for library consumers that include any of the same dependencies in their projects that snowflake-ingest-sdk uses in addition to snowflake-ingest-sdk. This can result in duplicate classes in a project and confusing errors for users depending on dependency ordering of their project. When duplicate classes are present on the classpath the first class found wins.
This means that for consumers of snowflake-ingest-sdk if it's a dependency listed high in a pom.xml's dependency section, its duplicate classes take precedence, overriding any other dependency definitions in their project.
Here's a short example of the problem. If I have a project using the ingest SDK with a pom.xml as follows:
snowflake-ingest-sdk includes class files from
com.google.protobuf:protobuf-java:2.5.0 which is pulled in transitively from hadoop-common. For any class files that are duplicates of protobuf-java 3.22.3 the class files of 2.5.0 will take precedence at runtime due to snowflake-ingest-sdk being first in the pom.xml and therefore first on the classpath.
Conversely if snowflake-ingest-sdk is last then protobuf-java 3.22.3 will be active which may cause runtime issues with snowflake-ingest-sdk if it depends on APIs only present in 2.5.0.
Fixes #356