-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-25665][SQL][TEST] Refactor ObjectHashAggregateExecBenchmark to… #22804
Conversation
sparkSession.conf.set(SQLConf.OBJECT_AGG_SORT_BASED_FALLBACK_THRESHOLD.key, "2") | ||
df.groupBy($"id" / (N / 4) cast LongType).agg(percentile_approx($"id", 0.5)).collect() | ||
/** | ||
* Benchmark to measure read performance with Filter pushdown. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read performance with Filter pushdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wangyum , fixed.
Change-Id: Ib9c6b80822013dce31c22baac2d6f6a5b9b730f2
ok to test |
val spark: SparkSession = TestHive.sparkSession | ||
|
||
override def runBenchmarkSuite(): Unit = { | ||
runBenchmark("Hive UDAF vs Spark AF") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @peter-toth . Thank you for making this PR.
Currently, runBenchmarkSuite
is too long. Could you make a separate function for each test case? For example, ignore("Hive UDAF vs Spark AF")
can be a single function. And runBenchmarkSuite
will call a series of those functions.
Test build #97940 has finished for PR 22804 at commit
|
Test build #97972 has finished for PR 22804 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @peter-toth .
Thank you for updating. I made a PR to you. Could you review and merge that?
Change minor stuff and update result
Thanks @dongjoon-hyun for the fixes. Merged. |
Test build #98012 has finished for PR 22804 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @peter-toth and @wangyum .
+1, LGTM.
Merged to master.
Thanks @dongjoon-hyun , @wangyum for the review. |
## What changes were proposed in this pull request? Refactor ObjectHashAggregateExecBenchmark to use main method ## How was this patch tested? Manually tested: ``` bin/spark-submit --class org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark --jars sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar,core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar,sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT.jar --packages org.spark-project.hive:hive-exec:1.2.1.spark2 sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT-tests.jar ``` Generated results with: ``` SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "hive/test:runMain org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark" ``` Closes apache#22804 from peter-toth/SPARK-25665. Lead-authored-by: Peter Toth <peter.toth@gmail.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Refactor ObjectHashAggregateExecBenchmark to use main method
How was this patch tested?
Manually tested:
Generated results with: