Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests #10461

Closed
wants to merge 37 commits into from

Conversation

JoshRosen
Copy link
Contributor

This patch adds a new build check which enumerates Spark's resolved runtime classpath and saves it to a file, then diffs against that file to detect whether pull requests have introduced dependency changes. The aim of this check is to make it simpler to reason about whether pull request which modify the build have introduced new dependencies or changed transitive dependencies in a way that affects the final classpath.

This supplants the checks added in SPARK-4123 / #5093, which are currently disabled due to bugs.

This patch is based on @pwendell's work in #8531.

Closes #8531.

@JoshRosen
Copy link
Contributor Author

@JoshRosen
Copy link
Contributor Author

Uh oh, I'd hold off on merging this pending investigation of issues with the dummy change.

@SparkQA
Copy link

SparkQA commented Dec 30, 2015

Test build #48452 has finished for PR 10461 at commit 9932c80.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen JoshRosen changed the title [SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests [SPARK-10359][DONOTMERGE] Enumerate dependencies in a file and diff against it for new pull requests Dec 30, 2015
@JoshRosen
Copy link
Contributor Author

Spotted the problem: the original logic for checking should_run_build_tests was wrong: it needs to loop over all test_modules (using any) and the root module needs to have the build tests enabled.

@SparkQA
Copy link

SparkQA commented Dec 30, 2015

Test build #48503 has finished for PR 10461 at commit 1454f5f.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Copy link
Contributor Author

Woohoo, that worked! However, I'm not a fan of how vanilla diff formats its output, so I'm going to quickly try and see if I can use git diff for this.

@SparkQA
Copy link

SparkQA commented Dec 30, 2015

Test build #48505 has finished for PR 10461 at commit 329bdf2.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen JoshRosen changed the title [SPARK-10359][DONOTMERGE] Enumerate dependencies in a file and diff against it for new pull requests [SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests Dec 30, 2015
@JoshRosen
Copy link
Contributor Author

Going to merge this into master and will email the dev list with a summary describing the changes + impact on process.

@asfgit asfgit closed this in 27a42c7 Dec 30, 2015
@JoshRosen JoshRosen deleted the SPARK-10359 branch December 30, 2015 20:50
@SparkQA
Copy link

SparkQA commented Dec 30, 2015

Test build #48507 has finished for PR 10461 at commit cc99de6.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jan 1, 2016
…sh script

This patch includes multiple fixes for the `dev/test-dependencies.sh` script (which was introduced in #10461):

- Use `build/mvn --force` instead of `mvn` in one additional place.
- Explicitly set a zero exit code on success.
- Set `LC_ALL=C` to make `sort` results agree across machines (see https://stackoverflow.com/questions/28881/).
- Set `should_run_build_tests=True` for `build` module (this somehow got lost).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10543 from JoshRosen/dep-script-fixes.
JoshRosen added a commit to JoshRosen/spark that referenced this pull request Jan 10, 2016
…r new pull requests

This patch adds a new build check which enumerates Spark's resolved runtime classpath and saves it to a file, then diffs against that file to detect whether pull requests have introduced dependency changes. The aim of this check is to make it simpler to reason about whether pull request which modify the build have introduced new dependencies or changed transitive dependencies in a way that affects the final classpath.

This supplants the checks added in SPARK-4123 / apache#5093, which are currently disabled due to bugs.

This patch is based on pwendell's work in apache#8531.

Closes apache#8531.

Author: Josh Rosen <joshrosen@databricks.com>
Author: Patrick Wendell <patrick@databricks.com>

Closes apache#10461 from JoshRosen/SPARK-10359.
JoshRosen added a commit to JoshRosen/spark that referenced this pull request Jan 10, 2016
…sh script

This patch includes multiple fixes for the `dev/test-dependencies.sh` script (which was introduced in apache#10461):

- Use `build/mvn --force` instead of `mvn` in one additional place.
- Explicitly set a zero exit code on success.
- Set `LC_ALL=C` to make `sort` results agree across machines (see https://stackoverflow.com/questions/28881/).
- Set `should_run_build_tests=True` for `build` module (this somehow got lost).

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#10543 from JoshRosen/dep-script-fixes.
JoshRosen added a commit to JoshRosen/spark that referenced this pull request Jan 10, 2016
…r new pull requests

This patch adds a new build check which enumerates Spark's resolved runtime classpath and saves it to a file, then diffs against that file to detect whether pull requests have introduced dependency changes. The aim of this check is to make it simpler to reason about whether pull request which modify the build have introduced new dependencies or changed transitive dependencies in a way that affects the final classpath.

This supplants the checks added in SPARK-4123 / apache#5093, which are currently disabled due to bugs.

This patch is based on pwendell's work in apache#8531.

Closes apache#8531.

Author: Josh Rosen <joshrosen@databricks.com>
Author: Patrick Wendell <patrick@databricks.com>

Closes apache#10461 from JoshRosen/SPARK-10359.
JoshRosen added a commit to JoshRosen/spark that referenced this pull request Jan 10, 2016
…sh script

This patch includes multiple fixes for the `dev/test-dependencies.sh` script (which was introduced in apache#10461):

- Use `build/mvn --force` instead of `mvn` in one additional place.
- Explicitly set a zero exit code on success.
- Set `LC_ALL=C` to make `sort` results agree across machines (see https://stackoverflow.com/questions/28881/).
- Set `should_run_build_tests=True` for `build` module (this somehow got lost).

Author: Josh Rosen <joshrosen@databricks.com>

Closes apache#10543 from JoshRosen/dep-script-fixes.
asfgit pushed a commit that referenced this pull request Jan 10, 2016
… branch-1.6

This patch backports the `dev/test-dependencies` script (from #10461) to branch-1.6.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10680 from JoshRosen/test-deps-16-backport.
asfgit pushed a commit that referenced this pull request Jan 11, 2016
…o branch-1.5

This patch backports the `dev/test-dependencies` script (from #10461) to branch-1.5.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10679 from JoshRosen/test-deps-15-backport and squashes the following commits:

7633793 [Josh Rosen] Fix merge conflicts.
1d60cb6 [Josh Rosen] [SPARK-10359][PROJECT-INFRA] Use more random number in dev/test-dependencies.sh; fix version switching
889c355 [Josh Rosen] [SPARK-10359][PROJECT-INFRA] Multiple fixes to dev/test-dependencies.sh script
4e48f14 [Josh Rosen] [SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants