[SPARK-25665][SQL][TEST] Refactor ObjectHashAggregateExecBenchmark to… #22804

peter-toth · 2018-10-23T08:50:02Z

What changes were proposed in this pull request?

Refactor ObjectHashAggregateExecBenchmark to use main method

How was this patch tested?

Manually tested:

bin/spark-submit --class org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark --jars sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar,core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar,sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT.jar --packages org.spark-project.hive:hive-exec:1.2.1.spark2 sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT-tests.jar

Generated results with:

SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "hive/test:runMain org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark"

… use main method

wangyum · 2018-10-23T10:38:44Z

...c/test/scala/org/apache/spark/sql/execution/benchmark/ObjectHashAggregateExecBenchmark.scala

-      sparkSession.conf.set(SQLConf.OBJECT_AGG_SORT_BASED_FALLBACK_THRESHOLD.key, "2")
-      df.groupBy($"id" / (N / 4) cast LongType).agg(percentile_approx($"id", 0.5)).collect()
+/**
+ * Benchmark to measure read performance with Filter pushdown.


read performance with Filter pushdown?

Thanks @wangyum , fixed.

Change-Id: Ib9c6b80822013dce31c22baac2d6f6a5b9b730f2

dongjoon-hyun · 2018-10-23T21:52:46Z

ok to test

dongjoon-hyun · 2018-10-23T22:06:27Z

...c/test/scala/org/apache/spark/sql/execution/benchmark/ObjectHashAggregateExecBenchmark.scala

+  val spark: SparkSession = TestHive.sparkSession
+
+  override def runBenchmarkSuite(): Unit = {
+    runBenchmark("Hive UDAF vs Spark AF") {


Hi, @peter-toth. Thank you for making this PR.
Currently, runBenchmarkSuite is too long. Could you make a separate function for each test case? For example, ignore("Hive UDAF vs Spark AF") can become a single function, and runBenchmarkSuite can then call those functions in sequence.
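
For illustration only, the suggested shape could look roughly like the sketch below. It assumes a tiny stand-in base class (`MiniBenchmarkBase`, hypothetical) so that it runs without Spark on the classpath. The object name, the second case, and the method bodies are placeholders, not the code that was eventually merged.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the benchmark base class used in the PR.
// It only mimics the shape (runBenchmarkSuite + runBenchmark + main).
abstract class MiniBenchmarkBase {
  // Subclasses list the benchmark cases to run here.
  def runBenchmarkSuite(): Unit

  // Prints a header, then runs the body of one benchmark case.
  def runBenchmark(name: String)(body: => Unit): Unit = {
    println(s"=== $name ===")
    body
  }

  // Entry point, so the benchmark can be launched as a plain main class.
  def main(args: Array[String]): Unit = runBenchmarkSuite()
}

object SketchAggregateBenchmark extends MiniBenchmarkBase {
  // Records which cases ran; exists only to make the sketch easy to check.
  val executed: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // One private function per former test case.
  // The real measurement code would go where the placeholder line is.
  private def hiveUdafVsSparkAf(): Unit = {
    executed += "Hive UDAF vs Spark AF"
  }

  // Hypothetical second case, included only to show the pattern repeating.
  private def placeholderCase(): Unit = {
    executed += "placeholder case"
  }

  // The suite itself stays short: it just calls the cases in order.
  override def runBenchmarkSuite(): Unit = {
    runBenchmark("Hive UDAF vs Spark AF") { hiveUdafVsSparkAf() }
    runBenchmark("placeholder case") { placeholderCase() }
  }
}
```

Keeping each case in its own function means a single case can be commented out or reordered in runBenchmarkSuite without touching the measurement code.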

SparkQA · 2018-10-24T00:14:11Z

Test build #97940 has finished for PR 22804 at commit 2ed884b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-10-24T13:44:44Z

Test build #97972 has finished for PR 22804 at commit 37b40ae.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun

Hi, @peter-toth.
Thank you for the update. I opened a PR against your branch. Could you review and merge it?

Change minor stuff and update result

peter-toth · 2018-10-25T09:29:30Z

Thanks @dongjoon-hyun for the fixes. Merged.

SparkQA · 2018-10-25T11:47:18Z

Test build #98012 has finished for PR 22804 at commit 6849a87.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun

Thank you, @peter-toth and @wangyum .
+1, LGTM.

Merged to master.

peter-toth · 2018-10-25T20:12:09Z

Thanks @dongjoon-hyun , @wangyum for the review.

## What changes were proposed in this pull request?

Refactor ObjectHashAggregateExecBenchmark to use main method

## How was this patch tested?

Manually tested:
```
bin/spark-submit --class org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark --jars sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar,core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar,sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT.jar --packages org.spark-project.hive:hive-exec:1.2.1.spark2 sql/hive/target/spark-hive_2.11-3.0.0-SNAPSHOT-tests.jar
```
Generated results with:
```
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "hive/test:runMain org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark"
```

Closes apache#22804 from peter-toth/SPARK-25665.

Lead-authored-by: Peter Toth <peter.toth@gmail.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

[SPARK-25665][SQL][TEST] Refactor ObjectHashAggregateExecBenchmark to…

cf2bb2c

… use main method

wangyum reviewed Oct 23, 2018

[SPARK-25665][SQL][TEST] fix review findigs

2ed884b

Change-Id: Ib9c6b80822013dce31c22baac2d6f6a5b9b730f2

dongjoon-hyun reviewed Oct 23, 2018

[SPARK-25665][SQL][TEST] fix review findings 2

37b40ae

dongjoon-hyun added 3 commits October 24, 2018 16:18

Update minor stuff

3000c88

fix

ed6113c

Update result

a280219

dongjoon-hyun reviewed Oct 25, 2018

Merge pull request #2 from dongjoon-hyun/PR-22804

6849a87

Change minor stuff and update result

dongjoon-hyun approved these changes Oct 25, 2018

asfgit closed this in ccd07b7 Oct 25, 2018
