[BEAM-1092] Shade commonly used libraries (e.g. Guava) to avoid class conflicts #2096
aviemzur wants to merge 3 commits into apache:master from
|
Refer to this link for build results (access rights to CI server needed): |
|
We looked into this exact approach in the past, and rolled it back because |
|
(#744) |
|
I can replace the |
|
Refer to this link for build results (access rights to CI server needed):
Build result: ABORTED
[...truncated 833.54 KB...]
... 31 more
Caused by: java.lang.RuntimeException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
Command was /bin/sh -c cd /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/sdks/java/io/hbase && /usr/local/asfpackages/java/jdk1.8.0_121/jre/bin/java org.apache.maven.surefire.booter.ForkedBooter /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/sdks/java/io/hbase/target/surefire/surefire7450086805414423461tmp /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall@2/sdks/java/io/hbase/target/surefire/surefire_83942865920716290877tmp
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:590)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:460)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:229)
at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:201)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1026)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:862)
at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:755)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
... 32 more
2017-02-24T22:25:19.043 [ERROR]
2017-02-24T22:25:19.043 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
2017-02-24T22:25:19.043 [ERROR]
2017-02-24T22:25:19.043 [ERROR] For more information about the errors and possible solutions, please read the following articles:
2017-02-24T22:25:19.043 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
2017-02-24T22:25:19.043 [ERROR]
2017-02-24T22:25:19.043 [ERROR] After correcting the problems, you can resume the build with the command
2017-02-24T22:25:19.043 [ERROR] mvn -rf :beam-sdks-java-io-hbase
Build was aborted
channel stopped
Setting status of af0c88f to FAILURE with url https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/7831/ and message: 'Build finished. '
Using context: Jenkins: Maven clean install
--none-- |
|
Retest this please. |
|
@dhalperi Addressed your comment in my latest commit. PTAL? |
| <pattern>com.google.common</pattern> | ||
| <excludes> | ||
| <!-- com.google.common is too generic, need to exclude guava-testlib --> | ||
| <exclude>com.google.common.**.testing.*</exclude> |
This looks like it's being lost. Is this right?
All the tests pass without this, so I'm not sure why this is needed?
This was added in PR #832 because the rules only shade com.google.guava:guava and not com.google.guava:guava-testlib. If we relocate symbols from guava-testlib without shading them, we can get runtime ClassNotFoundExceptions on projects which depend on our artifacts.
IIRC, this issue did not manifest within the Beam project structure because each project had equivalent repackaging rules and could source the repackaged guava-testlib symbols from its own local dependencies. I believe I ran into the issue when creating my own project that didn't have shading and consumed the shaded Beam artifacts.
I've been away from Beam development for some months so unfortunately I don't have recent context on whether this is still needed.
Thanks @swegner for your comment!
All dependencies on guava-testlib in the project are currently in test scope.
My change removes shading of test jars in modules that matched the shading rules in the root pom in my branch (they no longer shade their test jars).
The only test jar that remains with shading is the one of Dataflow runner, which still has this exclusion in its shade-plugin configuration: google-cloud-dataflow-java/pom.xml
Is this sufficient?
Sounds good to me; thanks for the thorough investigation!
(Please continue working with @dhalperi for the rest of the PR)
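To make the guava-testlib point in this thread concrete, here is a minimal sketch of a shade configuration that bundles only com.google.guava:guava and relocates com.google.common while leaving the testing packages untouched. The pattern and the exclude are the ones quoted in the diff, and the execution id appears elsewhere in this PR. The shadedPattern prefix org.example.repackaged is a placeholder rather than the value this PR uses, and the phase and goal binding are assumed.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <id>bundle-and-repackage</id>
      <!-- Assumed binding; the PR's actual phase is not shown in this thread. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Only guava is bundled into the jar; guava-testlib is not. -->
        <artifactSet>
          <includes>
            <include>com.google.guava:guava</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <excludes>
              <!-- guava-testlib classes are not bundled, so their names must not be rewritten. -->
              <exclude>com.google.common.**.testing.*</exclude>
            </excludes>
            <!-- Placeholder prefix, for illustration only. -->
            <shadedPattern>org.example.repackaged.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Without the exclude, references to the testing packages would be rewritten to the relocated name even though those classes are never bundled, which is the ClassNotFoundException scenario described above.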
| <artifactId>build-helper-maven-plugin</artifactId> | ||
| </plugin> | ||
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> |
How do we ensure this only executes in the modules that use Guava?
What happens in the modules that need to shade more than guava?
This will execute in every module to ensure forward compatibility: when new modules are introduced, they will not need to know that they should shade Guava and add this shading to their pom; it will happen automatically. (If they don't depend on Guava, no extra classes will be shaded into the resulting jar.)
For modules that need to shade more than guava we should ask the following:
Should this artifact be shaded in all modules that will use it?
If so: add it to the root pom
If not: override shading for that specific module.
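As a rough illustration of the second option, a module that needs to relocate something beyond Guava could extend the inherited execution in its own pom. This sketch assumes the root pom defines an execution with id bundle-and-repackage, and it relies on Maven's combine.children="append" attribute to add to the inherited relocation list rather than replace it. The com.google.protobuf pattern and the org.example.repackaged prefix are hypothetical placeholders.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Same id as the inherited execution, so the two configurations are merged. -->
      <id>bundle-and-repackage</id>
      <configuration>
        <!-- Append to the relocations inherited from the root pom. -->
        <relocations combine.children="append">
          <relocation>
            <!-- Hypothetical extra library, for illustration only. -->
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.example.repackaged.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

If the root configuration also restricts which artifacts are bundled (for example through an artifactSet), the module would need to extend that list in the same way. Using combine.self="override" instead would discard the inherited relocations entirely, including the Guava one.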
| <pattern>com.google.common</pattern> | ||
| <excludes> | ||
| <!-- com.google.common is too generic, need to exclude guava-testlib --> | ||
| <exclude>com.google.common.**.testing.*</exclude> |
|
Re: testlib -- I did some GitHub blame/history diving and found JIRA https://issues.apache.org/jira/browse/BEAM-557 and #832 . Maybe @swegner has more context about which specific ways the Guava testlib is a public dependency. |
|
Re: general approach. I see two improvements:
I see two downsides:
I would probably:
In other words, I would probably take improvement 1, but not 2. 1 is a clear win, but 2 has some drawbacks and doesn't quite guarantee success to justify those drawbacks IMO. |
|
Another good opinion to get would be @davorbonaci . |
|
@dhalperi I understand the concerns you raise. I'll try to address them from my point of view.
Many of the modules already shade, so I'm not sure the concern about slowing down the build is that pertinent if the new way of shading is an improvement.
I agree that human error could cause Guava to be forgotten from shading configuration added to child modules, but is this different than today? Can this human error not happen now? |
|
I'm not totally sold on the premise that we should shade Guava in every single module. A couple of reasons:
Separately, the amount of "magic" worries me too. This will work perfectly in Maven, but what happens when someone imports it into Eclipse, IntelliJ or elsewhere? At least in some IDEs, it will fail. On the other hand, I'm really strongly in favor of one common configuration. There's literally zero chance that this will work if this configuration is offloaded to every single module. This one outweighs them all for me. (And, I was bitten by Guava 19 vs. 20 just today -- so, I know the pain.) In summary, I'm leaning towards +1 on this. However, I'd be more comfortable if we were explicit in saying:
|
Build time (MacBook Pro, skipping tests)
Before: 02:13 min
Jar size (sum of all jars generated)
Before: 93.93 MB |
|
Build time -- less than I would have forecasted. SGTM. Jar size -- more than I would have forecasted -- it is 50% more. I think minimization is in order. |
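For reference, minimization in maven-shade-plugin is a single configuration switch. The fragment below is only a sketch, assuming it is added to the same bundle-and-repackage execution used elsewhere in this PR.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <!-- Assumed to match the execution id defined in the root pom. -->
      <id>bundle-and-repackage</id>
      <configuration>
        <!-- Drop bundled dependency classes that the module's own classes do not reference. -->
        <minimizeJar>true</minimizeJar>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Minimization works from static references, so classes that are only loaded reflectively can be removed by mistake. Whether it is safe to turn on for every module would need to be checked module by module.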
af0c88f to 7eb6434
|
If there are modules which do not need to shade and relocate Guava (for example, if they actively need to avoid it, or if we feel that their file size has been inflated for no reason), we can add the following node for those modules:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <id>bundle-and-repackage</id>
      <phase>none</phase>
    </execution>
  </executions>
</plugin>
Another way to tackle large file sizes would be to configure minimization by default. I think we should start a minimization track separately from this change, as there are some issues with it:
|
|
@aviemzur, btw, I noticed the version of the shade plugin went down. Was this intentional? If yes, great. If not, perhaps push a separate PR to get it back up. |