[SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress#33091
vkorukanti wants to merge 3 commits into apache:master
Test build #140332 has finished for PR 33091 at commit
Test build #140334 has finished for PR 33091 at commit
Kubernetes integration test starting
Kubernetes integration test status success
...ore/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/streaming/StateStoreMetricsTest.scala
Kubernetes integration test unable to build dist. exiting with code: 1
Test build #140367 has finished for PR 33091 at commit
Test build #140417 has finished for PR 33091 at commit
retest this, please
Kubernetes integration test starting
GA passed. Thanks! Merging to master.
Kubernetes integration test status success
Kubernetes integration test unable to build dist. exiting with code: 1
Test build #140421 has finished for PR 33091 at commit
dongjoon-hyun left a comment:
Please confirm this change in the following PR.
    val memoryUsedBytes: Long,
    val numRowsDroppedByWatermark: Long,
    val numShufflePartitions: Long,
    val numStateStoreInstances: Long,
This is detected as a binary incompatibility. It will be okay because this is Evolving.
What changes were proposed in this pull request?
Currently the `StateOperatorProgress` in `StreamingQueryProgress` is missing a few metrics.
Why are the changes needed?
The main motivation is to find hotspots and gain better visibility into stateful operations. Detailed explanations are in SPARK-35896.
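As a rough illustration of the hotspot use case, the sketch below ranks stateful operators by state memory per state store instance. It is not code from this PR: `OperatorMetrics` is a hypothetical stand-in that models only the four fields visible in the review diff on this page, the `label` field and the `StateHotspots` helper are invented for the example, and the per-instance heuristic is an assumption about what "hotspot" might mean in practice.

```scala
// Hypothetical stand-in for Spark's StateOperatorProgress. Only the four
// metric fields shown in the review diff are modeled; `label` is an
// illustrative name, not a field taken from this PR.
final case class OperatorMetrics(
    label: String,
    memoryUsedBytes: Long,
    numRowsDroppedByWatermark: Long,
    numShufflePartitions: Long,
    numStateStoreInstances: Long)

object StateHotspots {
  // Average state memory per state store instance.
  // Returns 0 when no instances are reported, to avoid dividing by zero.
  def bytesPerInstance(m: OperatorMetrics): Long =
    if (m.numStateStoreInstances <= 0) 0L
    else m.memoryUsedBytes / m.numStateStoreInstances

  // Operators whose per-instance memory exceeds the threshold, heaviest first.
  // The threshold is an assumed tuning knob chosen by the caller.
  def hotspots(ops: Seq[OperatorMetrics], thresholdBytes: Long): Seq[OperatorMetrics] =
    ops.filter(m => bytesPerInstance(m) > thresholdBytes)
      .sortBy(m => -bytesPerInstance(m))
}
```

In a running query, the real values would presumably come from `query.lastProgress.stateOperators`, with each entry mapped into a structure like the one above.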
Does this PR introduce any user-facing change?
Yes. The `StateOperatorProgress` entries within `StreamingQueryProgress` now contain additional fields as listed in SPARK-35896. Example `StreamingQueryProgress` output in JSON form:
Before:
After:
How was this patch tested?
Existing tests for regressions. Added new UTs.