[SPARK-35529][SQL] Add fallback metrics for hash aggregate #32671
@cloud-fan could you help take a look when you have time? Thanks.
- SQLMetrics.createAverageMetric(sparkContext, "avg hash probe bucket list iters"))
+ SQLMetrics.createAverageMetric(sparkContext, "avg hash probe bucket list iters"),
+ "numTasksFallBacked" -> SQLMetrics.createMetric(sparkContext,
+   "number of tasks fall-backed to sort-based aggregation"))
Is this name the same as the one in object hash agg? It is super long.
Maybe "number of sort fallback tasks"?
Yes, same as in #31340. Let me change them together.
Updated. Thanks.
Thanks, merging to master!
Thank you @cloud-fan for the review!
What changes were proposed in this pull request?
Add a metric that records how many tasks fall back to sort-based aggregation during hash aggregation. This helps developers and users debug and optimize queries. Object hash aggregation already has a similar metric.
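To illustrate what the metric counts, here is a minimal toy model in plain Scala. It is not the code in this PR and uses no Spark classes: `CounterMetric`, `FallbackMetricSketch`, `runTask`, and `maxHashEntries` are hypothetical names, and the "fallback" condition (more distinct keys than the map may hold) is a simplifying assumption. The point it shows is that the counter increases once per task that falls back, not once per row. The display name is the shorter one suggested in review.

```scala
// Toy model only: none of these names are Spark APIs.
final class CounterMetric(val name: String) {
  private var count = 0L
  def add(n: Long): Unit = count += n
  def value: Long = count
}

object FallbackMetricSketch {
  /** Aggregates one task's keys; returns true if the task fell back. */
  def runTask(keys: Seq[Int], maxHashEntries: Int, fallbackTasks: CounterMetric): Boolean = {
    val seen = scala.collection.mutable.HashSet.empty[Int]
    // Assumption: the task "falls back" as soon as it needs more distinct
    // keys than the hash map is allowed to hold.
    val fellBack = keys.exists { k =>
      seen += k
      seen.size > maxHashEntries
    }
    // Count the task once, no matter how many rows arrive after the fallback.
    if (fellBack) fallbackTasks.add(1)
    fellBack
  }

  def main(args: Array[String]): Unit = {
    val metric = new CounterMetric("number of sort fallback tasks")
    val tasks = Seq(Seq(1, 1, 2), Seq(1, 2, 3, 4), Seq(5, 6, 7, 8, 9))
    tasks.foreach(t => runTask(t, 3, metric))
    println(s"${metric.name}: ${metric.value}")
  }
}
```

With a limit of 3 entries, the second and third tasks exceed it, so the sketch reports 2 fallback tasks out of 3.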
Why are the changes needed?
To help developers and users debug and optimize queries that use hash aggregation.
Does this PR introduce any user-facing change?
Yes, the added metric will show up in the Spark web UI.
Example:
How was this patch tested?
Changed the unit test in SQLMetricsSuite.scala.