[SPARK-39529][INFRA] Refactor and merge all related job selection logic into precondition #36940
Conversation
Force-pushed from 81e6206 to 79d92a2.
nice how this removes the need for that extra layer of logic.
Thank you for doing this. In general, the direction looks correct and good for the future.
BTW, how do we validate this PR?
The scheduled jobs are best tested by declaring this branch as the default branch, so that GitHub runs them as defined here. Besides that, it can only be tested from master, sadly.
I will test locally, and try to run the scheduled jobs in my fork with some manual changes (to allow running in my fork). That's not perfect but would still test reasonably enough before getting this in.
Force-pushed from bd3b11a to 4f1107b.
```shell
pyspark_modules=`cd dev && python -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
pyspark=`./dev/is-changed.py -m $pyspark_modules`
sparkr=`./dev/is-changed.py -m sparkr`
tpcds=`./dev/is-changed.py -m sql`
```
In fact, we don't need any change in the script. The script already checks the dependent modules.
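The point above is that the script already follows dependent modules, so passing only the directly changed module is enough. A minimal sketch of that idea (the function name and the dependency graph are illustrative, not Spark's actual `dev/is-changed.py` implementation):

```python
# Hypothetical sketch: decide which modules need testing by walking the
# transitive dependents of the changed modules, similar in spirit to
# the dependency check performed by dev/is-changed.py.
def affected_modules(changed, dependents):
    """Return the set of modules affected by a change, following dependent edges."""
    seen = set(changed)
    stack = list(changed)
    while stack:
        mod = stack.pop()
        for dep in dependents.get(mod, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

# Illustrative dependency graph: a change in "core" also affects "sql" and "pyspark".
dependents = {"core": ["sql", "pyspark"], "sql": ["pyspark"]}
print(affected_modules({"core"}, dependents))  # core, sql and pyspark are all affected
```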
I tested all the possible scenarios (as far as I can tell). Should be good to go.
LGTM!
Will merge this tomorrow morning in my time (KST), so I can monitor and fix it if something goes wrong.
Merged to master.
Thanks!
…once

### What changes were proposed in this pull request?

This PR proposes to skip the builds that have never passed in history. #36940 added more combinations to check whether they pass, and this PR disables the ones that do not. See also:

- SPARK-39681 (https://github.com/apache/spark/runs/7189972241?check_suite_focus=true)
- SPARK-39685 (https://github.com/apache/spark/runs/7189971643?check_suite_focus=true)
- SPARK-39682 (https://github.com/apache/spark/runs/7189971505?check_suite_focus=true)
- SPARK-39684 (https://github.com/apache/spark/runs/7054055518?check_suite_focus=true)

### Why are the changes needed?

To make it easy to check the test status of each combination of profile and branch.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

These are virtually all configuration changes; they will be verified after the merge.

Closes #37091 from HyukjinKwon/SPARK-39686.

Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?

This PR is a sort of followup of #36940 that fixes a mistake in replacing `inputs.type`. It changes the check to use `PYSPARK_CODECOV`, which is defined in https://github.com/apache/spark/blob/353df517f4d37aabbf807638385b0ab92dd95397/.github/workflows/build_coverage.yml#L39. A similar approach is used in https://github.com/apache/spark/blob/353df517f4d37aabbf807638385b0ab92dd95397/.github/workflows/build_ansi.yml#L39 together with https://github.com/apache/spark/blob/919e556d182f86b3e1e22cd05a546e0f2fc3e307/.github/workflows/build_and_test.yml#L797.

### Why are the changes needed?

To recover the Codecov site (https://app.codecov.io/gh/apache/spark), which has not worked for a few months.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Will monitor the test cases. I am sure it works :-)

Closes #39107 from HyukjinKwon/SPARK-41559.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
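The fix above replaces an `inputs.type` check with an environment variable carried through the reusable workflow call. A minimal sketch of that pattern (job names and the `envs` input are illustrative; the actual definitions are in the workflow files linked above):

```yaml
# Illustrative only: a scheduled coverage workflow invoking the shared
# build_and_test.yml and enabling coverage via an env var rather than inputs.type.
jobs:
  coverage:
    uses: ./.github/workflows/build_and_test.yml
    with:
      envs: '{"PYSPARK_CODECOV": "true"}'
```

Downstream steps can then branch on `env.PYSPARK_CODECOV` instead of a workflow input, which also works when the workflow is triggered on a schedule.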
What changes were proposed in this pull request?

This PR borrows the idea from #36928 but adds some more changes so that scheduled jobs share the `precondition` job, consolidating all conditional logic there. It also adds a new option to `is-changed.py` so that dependent modules can be checked together; this way, we don't have to change `build_and_test.yml` as often when adding a new module. In addition, this PR removes `type` because the `precondition` job now replaces it. Lastly, it enables the PySpark, SparkR, TPC-DS and Docker integration tests for scheduled jobs when applicable.

Closes #36928
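The consolidation described above can be sketched as follows (a hypothetical, simplified workflow, not Spark's actual `build_and_test.yml`; the module name, job names, and output shape are illustrative):

```yaml
# Hypothetical sketch: one "precondition" job computes which test groups are
# required, and every downstream job gates on its output instead of carrying
# its own selection logic.
jobs:
  precondition:
    runs-on: ubuntu-latest
    outputs:
      required: ${{ steps.check.outputs.required }}
    steps:
      - uses: actions/checkout@v3
      - name: Decide which jobs to run
        id: check
        run: |
          pyspark=$(./dev/is-changed.py -m pyspark-sql)   # module name illustrative
          echo "required={\"pyspark\": \"$pyspark\"}" >> "$GITHUB_OUTPUT"
  pyspark-tests:
    needs: precondition
    if: fromJson(needs.precondition.outputs.required).pyspark == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: echo "PySpark tests would run here"
```

Adding a new test group then only requires another keyed entry in the `required` JSON and one more gated job, rather than duplicating change-detection logic per job.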
Why are the changes needed?
To make it easier to read.
Does this PR introduce any user-facing change?
No, dev-only.
How was this patch tested?
Tested locally and in my fork (https://github.com/HyukjinKwon/spark/actions)