
[SPARK-8819] Fix build for maven 3.3.x #7219

Closed
wants to merge 3 commits into apache:master from andrewor14:fix-maven-build

Conversation

andrewor14
Contributor

This is a workaround for MSHADE-148, which leads to an infinite loop when building Spark with maven 3.3.x. This was originally caused by #6441, which added a bunch of test dependencies on the spark-core test module. Recently, it was revealed by #7193.

This patch adds a -Prelease profile. If present, it will set createDependencyReducedPom to true. The consequences are:

  • If you are releasing Spark with this profile, you are fine as long as you use maven 3.2.x or before.
  • If you are releasing Spark without this profile, you will run into SPARK-8781.
  • If you are not releasing Spark but you are using this profile, you may run into SPARK-8819.
  • If you are not releasing Spark and you did not include this profile, you are fine.

This is all documented in pom.xml and tested locally with both versions of maven.
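For reference, the wiring described above can be sketched in pom.xml roughly like this. This is a sketch only: the property name `create.dependency.reduced.pom` is illustrative, and the actual pom.xml may wire the flag into the shade plugin differently.

```xml
<!-- Sketch: default the flag off so normal builds with maven 3.3.x
     do not trip MSHADE-148's infinite loop. -->
<properties>
  <create.dependency.reduced.pom>false</create.dependency.reduced.pom>
</properties>

<!-- The -Prelease profile flips the flag back on for release builds,
     which must then use maven 3.2.x or before. -->
<profiles>
  <profile>
    <id>release</id>
    <properties>
      <create.dependency.reduced.pom>true</create.dependency.reduced.pom>
    </properties>
  </profile>
</profiles>

<!-- The shade plugin reads the property. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <createDependencyReducedPom>${create.dependency.reduced.pom}</createDependencyReducedPom>
  </configuration>
</plugin>
```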

@andrewor14
Contributor Author

@pwendell @JoshRosen @srowen could you take a look?

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 4, 2015

Test build #36522 has started for PR 7219 at commit 77b43ec.

@SparkQA

SparkQA commented Jul 4, 2015

Test build #36522 has finished for PR 7219 at commit 77b43ec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • *

@AmplabJenkins

Merged build finished. Test PASSed.

@srowen
Member

srowen commented Jul 4, 2015

I think this should also be a solution, yes.
Why does scalatest become a compile-scope dependency instead of test? Because test deps aren't transitive? If so, the right-er fix would be to have all test modules depend on scalatest.

@@ -678,7 +682,13 @@
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>2.2.1</version>
<scope>test</scope>
<!--
Contributor

Does this have transitive dependencies that might change the overall dependency graph for users of spark-core?

@pwendell
Contributor

pwendell commented Jul 4, 2015

Hey @andrewor14 - I took a look at this. It seems to me a bit weird to have Spark core depend on scalatest. This will affect all of our downstream applications, since they will also inherit the scalatest dependency, and that seems risky given that it has its own transitive dependencies (most of those are marked as optional, but I have no idea how various build tools handle optional dependencies). I was wondering about some alternative solutions, such as:

  1. Just declare SparkFunSuite in each module - this is a bit clunky, but you could have a large warning in the class file that it's N-way replicated. It's a very small class so I don't anticipate large modifications in the future. Then we don't have to reason about any build tool behaviors.
  2. Do what I suggested in my email: disable the dependency-reduced pom by default, and only enable it for a certain profile.

I am pretty hesitant to change compile dependencies around in 1.4.1 vs 1.4.0 - since we've typically not done this in a release. I also think it would be good not to have test dependencies as compile, if we can avoid it.

@srowen
Member

srowen commented Jul 4, 2015

If this is just because test dependencies are not transitive in Maven, then yes indeed the alternative is to declare the scalatest test-scope dependency in each module. It's more boilerplate, but the right-er thing to do. So I like 1.
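Concretely, option 1's per-module declaration would just repeat a block like the one in the diff above in each module's pom (the version shown is the one from that diff):

```xml
<dependency>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest_${scala.binary.version}</artifactId>
  <version>2.2.1</version>
  <scope>test</scope>
</dependency>
```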

@pwendell
Contributor

pwendell commented Jul 4, 2015

Hey @srowen - there is more going on here. The issue is that there is actually a class called SparkFunSuite that is defined inside of the core module that the other modules need to depend on. It's not just that all of them need the scalatest dependency. So we'd have to copy/paste the source code for SparkFunSuite N times.
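For context, Maven's standard way for one module to see another module's test classes (such as SparkFunSuite in core) is a test-jar dependency. A sketch of the shape of the dependencies #6441 added follows; the exact coordinates and version property here are assumptions:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.binary.version}</artifactId>
  <version>${project.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
```

Per the PR description, it was test dependencies like these that triggered the MSHADE-148 loop once the shade plugin tried to build the dependency-reduced pom.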

@srowen
Member

srowen commented Jul 4, 2015

Oh I get it, darn. Well, I vote for 2 then. This amounts to dodging the shade plugin bug so its cause doesn't have to come up at all. The profile release-profile should be a special name in Maven that gets enabled automatically when you run the release plugin; is that usable?

@andrewor14
Contributor Author

Yeah, I completely agree it's a terrible idea to introduce scalatest as a core dependency. This patch goes against most of my own principles in software development and I would be very happy to change it.

@pwendell is the idea with (2) that we use a maven version before 3.3.x during releases, and additionally pass in a -Prelease flag or something that enables createDependencyReducedPom?

@andrewor14
Contributor Author

If so, sounds like the solution is to revert my revert at #7193 and add another profile that has its own maven shade plugin dependency.

@pwendell
Contributor

pwendell commented Jul 5, 2015

Hey @andrewor14 - sorry for the delay, I didn't see this comment yesterday. I see - I didn't realize one thing: the maven version being used on Jenkins is 3.1.X (that must be why it worked, rather than the profiles). This is a little unfortunate because it means that if we go with solution (2), we'll need to reason about the version installed on Jenkins, and if that gets upgraded things will fail.

What about just doing (1) here and throwing our hands up and seeing if we can convince the maven project to make this fix, so that we can use this feature in the future?

@srowen
Member

srowen commented Jul 5, 2015

(Why does option 2 not work with later Maven? I missed that part.)

@pwendell
Contributor

pwendell commented Jul 5, 2015

My understanding was this bug affects maven version 3.3 and later. I've never tried publishing the release with Maven 3.3, so I don't know whether we would still just hit this bug when the profile is enabled.

@srowen
Member

srowen commented Jul 5, 2015

From the discussion I thought it was observed with 3.2 and 3.3, though I wasn't clear it was actually related to the Maven version. But even if so, option 2 should dodge this bug in the shade plugin wherever it occurs right? or will it then somehow make things not-work in older Maven?

@pwendell
Contributor

pwendell commented Jul 5, 2015

Hey Sean - option 2 still enables the bug-causing option when we publish releases. So I think this means we may not be able to publish releases with maven 3.2 and 3.3. I'd prefer not to be tied to specific versions for publishing releases since it's a bit brittle. But maybe it would be okay for now.

@srowen
Member

srowen commented Jul 5, 2015

Got it, I had in my head that somehow the release incantation was not affected by this.

In the spirit of option 1 again then: is SparkFunSuite that essential anyway? It's a short class with one overridden method that logs the test name in the output. Worth considering whether it can just go away.

@andrewor14
Contributor Author

@srowen SparkFunSuite should stay because it makes the unit-tests.log actually useful, which is instrumental in debugging flaky tests. Unfortunately because many (hundreds) suites currently extend it, it will be difficult to revert the patch that added it in the first place, and I don't think we want to do that because ultimately we want to keep it in the long run.

I think the verdict for now is just to go with (2), which adds a special release profile, and if that doesn't work, we'll fall back to (1), which duplicates SparkFunSuite. I'll update this patch within the next few hours.

@srowen
Member

srowen commented Jul 5, 2015

OK sounds good. Good point about all of the things extending it. So you're saying option 2, plus, using some version of Maven that doesn't somehow trip on this issue?

@andrewor14
Contributor Author

Yes, I believe the plan is to restrict ourselves to use maven 3.2.x or before to do releases.
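If that restriction ever needs enforcing rather than just documenting, one option (an assumption, not something this patch does) would be the maven-enforcer-plugin's requireMavenVersion rule, placed inside the release profile so normal builds are unaffected:

```xml
<!-- Sketch: fail fast if someone runs a release build with maven 3.3+. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-maven-version</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <requireMavenVersion>
            <!-- (,3.3.0) = any maven version strictly below 3.3.0 -->
            <version>(,3.3.0)</version>
          </requireMavenVersion>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```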

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 5, 2015

Test build #36550 has started for PR 7219 at commit f4c00e0.

<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
Contributor Author

This is just reverting #7193. If we use this workaround it turns out we can just keep this dependency (like how we keep the other ones in GraphX, MLlib, Bagel...)

@andrewor14
Contributor Author

@pwendell Please try this out. I have tested this with maven 3.3.3 and 3.2.1 with and without the new -Prelease profile. The results are summarized in the PR description and I have verified this locally.

Andrew Or added 3 commits July 6, 2015 16:22
You should use this profile IFF you are making a release AND you are
using maven version 3.2.x or before. Otherwise, do not use this
flag.

For more detail see the inline comments.

Conflicts:
	pom.xml
@andrewor14
Contributor Author

I've removed the revert of #7193 from this patch. This will make this patch easier to revert in the future. After this patch is merged we will revert #7193 separately.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. Test FAILed.

@andrewor14
Contributor Author

retest this please

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 6, 2015

Test build #36622 has started for PR 7219 at commit 1d37e87.

@vanzin
Contributor

vanzin commented Jul 6, 2015

LGTM. Hopefully there'll be a release of the plugin soon.

@pwendell
Contributor

pwendell commented Jul 7, 2015

@vanzin thanks for helping with this

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36622 has finished for PR 7219 at commit 1d37e87.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@andrewor14
Contributor Author

retest this please

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36624 has started for PR 7219 at commit 1d37e87.

@SparkQA

SparkQA commented Jul 7, 2015

Test build #36624 has finished for PR 7219 at commit 1d37e87.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@andrewor14
Contributor Author

OK, merging into master, 1.4 and 1.3. Thanks everyone for the reviews.

asfgit pushed a commit that referenced this pull request Jul 7, 2015
This is a workaround for MSHADE-148, which leads to an infinite loop when building Spark with maven 3.3.x. This was originally caused by #6441, which added a bunch of test dependencies on the spark-core test module. Recently, it was revealed by #7193.

This patch adds a `-Prelease` profile. If present, it will set `createDependencyReducedPom` to true. The consequences are:
- If you are releasing Spark with this profile, you are fine as long as you use maven 3.2.x or before.
- If you are releasing Spark without this profile, you will run into SPARK-8781.
- If you are not releasing Spark but you are using this profile, you may run into SPARK-8819.
- If you are not releasing Spark and you did not include this profile, you are fine.

This is all documented in `pom.xml` and tested locally with both versions of maven.

Author: Andrew Or <andrew@databricks.com>

Closes #7219 from andrewor14/fix-maven-build and squashes the following commits:

1d37e87 [Andrew Or] Merge branch 'master' of github.com:apache/spark into fix-maven-build
3574ae4 [Andrew Or] Review comments
f39199c [Andrew Or] Create a -Prelease profile that flags `createDependencyReducedPom`

(cherry picked from commit 9eae5fa)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in 9eae5fa Jul 7, 2015
@andrewor14 andrewor14 deleted the fix-maven-build branch July 7, 2015 02:25
asfgit pushed a commit that referenced this pull request Jul 7, 2015

Closes #7219 from andrewor14/fix-maven-build.

Conflicts:
	dev/create-release/create-release.sh
	pom.xml
andrewor14 referenced this pull request Jul 7, 2015
@srowen
Member

srowen commented Jul 7, 2015

Agree that a separate module is a logically nice fix, but it probably does have the circular dependency problem, and it introduces a whole new module for a few lines of code.

I also saw @pwendell's message about release-profile having side effects. Darn. I had hoped the side effects of Maven's release setup would be helpful, like building javadoc, but I guess the javadoc doesn't build on its own. LGTM
