[SPARK-34907][TESTS] Add main class that detects and runs all benchmarks #32005
Thanks @srowen. I also just fixed some style nits.
Thank you, @HyukjinKwon . This looks helpful.
Please note that some benchmarks need extra care.
For example, ExternalAppendOnlyUnsafeRowArrayBenchmark additionally requires the spark.memory.debugFill=false option. You may want to keep an exclude-list for benchmarks like that.
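As a rough illustration of the exclude-list idea, the sketch below filters a list of candidate benchmark names against a fixed list. It is not the PR's implementation: the variable `EXCLUDED_BENCHMARKS`, the functions `is_excluded` and `filter_benchmarks`, and the placeholder names `FooBenchmark` and `BarBenchmark` are all hypothetical. Only ExternalAppendOnlyUnsafeRowArrayBenchmark comes from this thread.

```shell
# Hypothetical exclude-list: space-separated simple class names.
# Only this one entry is mentioned in the review thread.
EXCLUDED_BENCHMARKS="ExternalAppendOnlyUnsafeRowArrayBenchmark"

# Succeeds (exit status 0) when the given name is on the exclude-list.
is_excluded() {
  for excluded in $EXCLUDED_BENCHMARKS; do
    if [ "$1" = "$excluded" ]; then
      return 0
    fi
  done
  return 1
}

# Prints every candidate that is not excluded, one per line.
filter_benchmarks() {
  for name in "$@"; do
    if ! is_excluded "$name"; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage with placeholder names:
#   filter_benchmarks FooBenchmark ExternalAppendOnlyUnsafeRowArrayBenchmark BarBenchmark
```

A plain list keeps the special cases visible in one place, at the cost of having to update it by hand whenever a benchmark gains a new requirement.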
Since this will affect all benchmark results, it would be great if we can see the actual generated results to verify that they are reasonable.
dongjoon-hyun
left a comment
I realized that this PR is designed to use spark-submit only. +1.
MaxGekk
left a comment
I don't understand the case where you need to run all detected benchmarks, especially when a sequential run will take days.
I thought you were going to give others the opportunity to run specific benchmarks in a user's PR/branch.
Kubernetes integration test unable to build dist. exiting with code: 1
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #136731 has finished for PR 32005 at commit
Test build #136732 has finished for PR 32005 at commit
Max, this will be used to run a specific benchmark, benchmarks matching a wildcard, or all benchmarks. We should update all the benchmark results. FWIW, a GA build can run for up to 72 hours.
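To make the three modes concrete (one benchmark, a wildcard, or everything), here is a minimal sketch of glob-based selection over a list of class names. It only illustrates the matching idea and is not how the PR's main class is implemented. The function `select_benchmarks` and the `org.example.*` class names are hypothetical.

```shell
# Prints each candidate class name that matches the glob pattern.
# Usage: select_benchmarks PATTERN NAME...
select_benchmarks() {
  pattern="$1"
  shift
  for name in "$@"; do
    # $pattern is left unquoted on purpose so `case` treats it as a glob.
    case "$name" in
      $pattern) printf '%s\n' "$name" ;;
    esac
  done
}

# Usage with placeholder class names:
#   select_benchmarks 'org.example.sql.FooBenchmark' ...   # one benchmark
#   select_benchmarks 'org.example.sql.*' ...              # a wildcard
#   select_benchmarks '*' ...                              # all benchmarks
```

With this shape, "run everything" is just the pattern `*`, so a single code path covers all three cases.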
It looks fine, too.
Thanks guys. Let me merge this in first and proceed (it won't break or affect anything in our CI anyway). I am working on SPARK-34821 now. Let's see how it goes!
Merged to master.
What changes were proposed in this pull request?
This PR proposes to add a script that detects and runs all benchmarks.
Why are the changes needed?
To run the benchmarks easily. This is actually for SPARK-34821.
Does this PR introduce any user-facing change?
No, dev-only.
How was this patch tested?
Manually tested with the command below after building Spark:
    SPARK_GENERATE_BENCHMARK_FILES=1 bin/spark-submit --class \
      org.apache.spark.benchmark.Benchmarks --jars \
      "`find . -name "*3.2.0-SNAPSHOT-tests.jar" | paste -sd ',' -`" \
      ./core/target/scala-2.12/spark-core_2.12-3.2.0-SNAPSHOT-tests.jar

This is ongoing work. I will double-check it while working on SPARK-34821 and updating the results.
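For readers unfamiliar with the `find ... | paste -sd ',' -` idiom in the command above: it turns a list of jar paths, one per line, into the single comma-separated string that `--jars` expects. The sketch below isolates that step. The helper names `join_with_commas` and `collect_test_jars` and the sample file names are hypothetical.

```shell
# Joins the lines read from stdin into one comma-separated line.
join_with_commas() {
  paste -sd ',' -
}

# Finds files under directory $1 whose names end in "$2-tests.jar"
# and prints them as one comma-separated string.
collect_test_jars() {
  find "$1" -name "*$2-tests.jar" | join_with_commas
}

# Usage with placeholder file names:
printf '%s\n' core-tests.jar sql-tests.jar | join_with_commas
# prints: core-tests.jar,sql-tests.jar
```

The order of the joined paths follows whatever order `find` emits.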