[SPARK-54272][SQL] Add aggTime for SortAggregateExec #52968

AngersZhuuuu · 2025-11-10T07:02:54Z

What changes were proposed in this pull request?

Add the `aggTime` metric to `SortAggregateExec`.

Why are the changes needed?

Expose more metrics for `SortAggregateExec`, which currently does not report aggregation time.

Does this PR introduce any user-facing change?

Yes. The SQL metric "time in aggregation build" now appears for `SortAggregateExec` in the Spark UI.

How was this patch tested?

Unit test.

Was this patch authored or co-authored using generative AI tooling?

No

AngersZhuuuu · 2025-11-10T09:29:42Z

@HyukjinKwon Could you help review this metrics-related change?

HyukjinKwon · 2025-11-10T16:44:06Z

cc @cloud-fan

cloud-fan · 2025-11-12T15:47:36Z

sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/SortAggregateExec.scala

          outputIter
        }
      }
+      aggTime += NANOSECONDS.toMillis(System.nanoTime() - beforeAgg)


Is this the right level at which to measure the agg time? I think the iterator is lazy, no?

AngersZhuuuu · 2025-11-13

I noticed that too, and it's a little strange. If so, is HashAggregateExec also incorrect?

holdenk · 2025-11-21

So I think the difference is that TungstenAggregationIterator is not as lazy: during its init step it does the aggregation, whereas SortBasedAggregationIterator does the compute mostly inside of `next`.

AngersZhuuuu · 2025-11-21

Got it, I missed that TungstenAggregationIterator calls `processInputs` during construction. So how about my current change?
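To illustrate the laziness point raised in this thread, here is a minimal, self-contained sketch in plain Scala (it does not use Spark). It shows that a timer placed around the construction of a lazy iterator observes no row processing, and that wrapping `hasNext`/`next` is one way to capture the work instead. The names `LazyIteratorTiming`, `expensive`, `timeConstruction` and `TimedIterator` are hypothetical and chosen for illustration only; this is not necessarily how the PR ended up measuring the time.

```scala
import java.util.concurrent.TimeUnit.NANOSECONDS

object LazyIteratorTiming {
  // Counts how many rows have actually been processed so far.
  var calls: Int = 0

  // Hypothetical stand-in for per-row aggregation work.
  def expensive(x: Int): Int = {
    calls += 1
    x * 2
  }

  // Measures only the construction of a lazy iterator, mirroring a timer
  // placed around the code that builds and returns the output iterator.
  // Returns (elapsed nanoseconds, the still-unconsumed iterator).
  def timeConstruction(input: Iterator[Int]): (Long, Iterator[Int]) = {
    val start = System.nanoTime()
    val out = input.map(expensive) // lazy: `expensive` has not run yet
    (System.nanoTime() - start, out)
  }

  // Wraps an iterator and accumulates the time spent inside hasNext/next,
  // which is where a lazy iterator does its real work.
  final class TimedIterator[T](underlying: Iterator[T]) extends Iterator[T] {
    private var nanos = 0L

    def elapsedNanos: Long = nanos
    def elapsedMillis: Long = NANOSECONDS.toMillis(nanos)

    override def hasNext: Boolean = {
      val start = System.nanoTime()
      try underlying.hasNext
      finally nanos += System.nanoTime() - start
    }

    override def next(): T = {
      val start = System.nanoTime()
      try underlying.next()
      finally nanos += System.nanoTime() - start
    }
  }

  def main(args: Array[String]): Unit = {
    val (constructionNanos, lazyOut) = timeConstruction(Iterator.range(0, 1000))
    println(s"rows processed during construction: $calls")
    println(s"construction took $constructionNanos ns")

    val timed = new TimedIterator(lazyOut)
    val total = timed.sum
    println(s"rows processed after draining: $calls (sum = $total)")
    println(s"time inside hasNext/next: ${timed.elapsedNanos} ns")
  }
}
```

An iterator that does its work eagerly in its constructor (as the thread says TungstenAggregationIterator does via `processInputs`) would be covered by the construction-time measurement, which is why the same timer placement can be reasonable for one operator and not for the other.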

AngersZhuuuu · 2025-11-25

ping @cloud-fan @holdenk @dongjoon-hyun Could you take a look?

holdenk · 2025-11-27T01:19:31Z

This looks reasonable to me, I'd love @cloud-fan to sign-off though :)

cloud-fan · 2025-11-27T02:15:21Z

sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala

+  test("SortAggregate metrics") {
+    // Force use SortAggregateExec instead of HashAggregateExec
+    withSQLConf("spark.sql.test.forceApplySortAggregate" -> "true",
+      SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {


nit: other tests in this suite do not turn off whole-stage codegen. Why is it necessary here?

AngersZhuuuu · 2025-11-27

I wrote it that way because SortAggregateExec does not support codegen, but removing it is fine too. Removed this line.

cloud-fan · 2025-11-27T05:34:35Z

thanks, merging to master!

SPARK-54272 Add aggTime for SortAggregateExec

657cd4c

github-actions bot added the SQL label Nov 10, 2025

AngersZhuuuu changed the title ~~SPARK-54272 Add aggTime for SortAggregateExec~~ [SPARK-54272][SQL] Add aggTime for SortAggregateExec Nov 10, 2025

AngersZhuuuu added 2 commits November 10, 2025 17:23

Update SQLMetricsSuite.scala

b3ebdca

Update SQLMetricsSuite.scala

43115f4

cloud-fan reviewed Nov 12, 2025

AngersZhuuuu added 3 commits November 21, 2025 11:53

update

289480e

Update SortAggregateExec.scala

3eae305

Update SortAggregateExec.scala

a20337c

cloud-fan reviewed Nov 27, 2025

cloud-fan approved these changes Nov 27, 2025

Update SQLMetricsSuite.scala

88ef9f3

cloud-fan closed this in a8482ad Nov 27, 2025
