[SPARK-38354][SQL] Add hash probes metric for shuffled hash join#35686

Closed
c21 wants to merge 1 commit into apache:master from c21:probe-metrics

Conversation


@c21 c21 commented Feb 28, 2022

What changes were proposed in this pull request?

For hash aggregate, there's a SQL metric that tracks the number of hash probes per looked-up key. It would be better to add a similar metric for shuffled hash join as well, to get some idea of hash probing performance. Also renamed the existing SQL metric (and related method names) in hash aggregate from `avg hash probe bucket list iters` to `avg hash probes per key`, as the original name is quite obscure.

Why are the changes needed?

To surface shuffled hash join probing performance in the Spark web UI (and allow metrics collection). The closer the metric is to 1.0, the better the probing performance.

Does this PR introduce any user-facing change?

Yes, the added SQL metric. A screenshot will be attached later.

How was this patch tested?

The modified unit test in `SQLMetricsSuite.scala`.
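The metric described above is just the ratio of two counters. As a minimal sketch of the idea — the names `ProbeCounters`, `recordLookup`, and `avgHashProbesPerKey` are hypothetical, not Spark's actual identifiers — the value could be derived like this:

```scala
// Hypothetical sketch of an "avg hash probes per key" metric derived from
// two counters; this is an illustration, not Spark's implementation.
class ProbeCounters {
  private var numKeyLookups = 0L
  private var numProbes = 0L

  // Record one key lookup; `extraProbes` is the number of additional
  // buckets visited beyond the first because of hash collisions.
  def recordLookup(extraProbes: Long): Unit = {
    numKeyLookups += 1
    numProbes += 1 + extraProbes
  }

  // 1.0 means every key was found on its first probe (no collisions).
  def avgHashProbesPerKey: Double =
    if (numKeyLookups == 0) 1.0 else numProbes.toDouble / numKeyLookups
}
```

This makes the "closer to 1.0 is better" interpretation concrete: every collision-free lookup contributes exactly one probe, so only chained bucket walks push the average above 1.0.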

@c21 c21 force-pushed the probe-metrics branch from c491e9f to 3c56b98 Compare March 1, 2022 00:48

c21 commented Mar 1, 2022

cc @cloud-fan could you help take a look when you have time? Thanks.

```scala
private def updateIndex(key: Long, address: Long): Unit = {
  numKeyLookups += 1
  numProbes += 1
```
cloud-fan commented on the diff above:

hmm, do we need to track the probe time when building the hash relation?

c21 replied:

@cloud-fan - This matches the existing behavior of UnsafeHashedRelation, which also updates the lookup/probe metrics while building the hash relation. I guess it would be good to keep UnsafeHashedRelation and LongHashedRelation consistent here?

@cloud-fan

thanks, merging to master!

@cloud-fan cloud-fan closed this in 1584366 Mar 9, 2022

c21 commented Mar 9, 2022

Thank you @cloud-fan for review!

@c21 c21 deleted the probe-metrics branch March 9, 2022 20:53
LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request Mar 10, 2022
Closes apache#35686 from c21/probe-metrics.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

somani commented Apr 24, 2022

@c21 @cloud-fan This has caused a performance regression in our tests where broadcast hash join is 5x slower.
It can be reproduced easily on tpcds 3tb data with the following query:

```sql
select sum(ws_ext_sales_price) sum_sales, count(*)
from web_sales, date_dim
where ws_sold_date_sk = d_date_sk
```

I could not figure out why it caused a regression, but it is clear it goes away on reverting the commit.

@cloud-fan

Probably because adding a new metric in a critical code path has perf overhead. @c21 can you open a PR to revert it? We can take more time to think about how to add this metric without significant perf overhead in Spark 3.4.
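One common way to cut this kind of per-row overhead — sketched below as a general technique, not as what Spark ultimately did; `SQLMetricStub` and `probeBatch` are hypothetical stand-ins, not Spark's API — is to accumulate into a plain local variable in the hot loop and flush to the shared metric once per batch:

```scala
// Hypothetical stand-in for a shared, thread-safe metric.
final class SQLMetricStub {
  private var value = 0L
  def add(v: Long): Unit = synchronized { value += v } // cross-thread update
  def get: Long = synchronized { value }
}

// The hot loop touches only a local Long; the shared metric is updated
// once per batch, so the per-row cost is a plain register increment
// rather than a synchronized (or volatile) write.
def probeBatch(keys: Array[Long], probeMetric: SQLMetricStub): Unit = {
  var localProbes = 0L
  var i = 0
  while (i < keys.length) {
    localProbes += 1 // assume each key costs exactly one probe, for brevity
    i += 1
  }
  probeMetric.add(localProbes) // single flush per batch
}
```

The trade-off is that the metric lags reality by up to one batch, which is usually acceptable for UI-only counters.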


c21 commented Apr 25, 2022

@cloud-fan and @somani - makes sense, let me revert this to unblock release for now.
@somani - it would be very helpful if you could share any profiling data or a flame graph for the regressed query.

dongjoon-hyun pushed a commit that referenced this pull request Apr 26, 2022
…oin"

This reverts commit 1584366, as the original PR caused performance regression reported in #35686 (comment) .

Closes #36338 from c21/revert-metrics.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request Apr 26, 2022

(cherry picked from commit 6b5a1f9)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>