
[SPARK-30549][SQL] Fix the subquery display issue in the UI when AQE is enabled #27260

Closed
wants to merge 11 commits

Conversation

@JkSelf (Contributor) commented Jan 17, 2020

What changes were proposed in this pull request?

After PR#25316 fixed the deadlock issue in PR#25308, the subquery metrics can no longer be shown in the UI, as in the following screenshot:
(screenshot)

This PR fixes the subquery UI issue by adding a SparkListenerSQLAdaptiveSQLMetricUpdates event to update the subquery SQL metrics. With this PR, the subquery UI is shown correctly, as in the following screenshot:
(screenshot)
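For illustration, here is a minimal sketch of the event and of how it might be posted from AdaptiveSparkPlanExec. The shapes follow the diffs discussed below, but treat the exact field and method names as assumptions rather than the merged code:

```scala
// Sketch: event carrying the subquery plan's SQL metric descriptors to the
// UI listener, since subquery plan changes are not posted to the UI directly.
case class SparkListenerSQLAdaptiveSQLMetricUpdates(
    executionId: Long,
    sqlPlanMetrics: Seq[SQLPlanMetric])
  extends SparkListenerEvent

// Posted from AdaptiveSparkPlanExec when executing a subquery:
private def onUpdateSQLMetrics(sqlMetrics: Seq[SQLMetric], executionId: Long): Unit = {
  val sqlPlanMetrics = sqlMetrics.map { sqlMetric =>
    SQLPlanMetric(sqlMetric.name.get, sqlMetric.id, sqlMetric.metricType)
  }
  context.session.sparkContext.listenerBus.post(
    SparkListenerSQLAdaptiveSQLMetricUpdates(executionId, sqlPlanMetrics))
}
```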

Why are the changes needed?

Showing the subquery metrics in the UI when AQE is enabled.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing UT

@JkSelf (Author) commented Jan 17, 2020

@cloud-fan please help review. Thanks.

SparkQA commented Jan 17, 2020

Test build #116925 has finished for PR 27260 at commit 77747c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 17, 2020

Test build #116950 has finished for PR 27260 at commit 3c4c84f.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile (Member):
retest this please

SparkQA commented Jan 18, 2020

Test build #116971 has finished for PR 27260 at commit 3c4c84f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -132,6 +134,17 @@ case class AdaptiveSparkPlanExec(
    executedPlan.resetMetrics()
  }

  private def collectSQLMetrics(plan: SparkPlan): Seq[SQLMetric] = {
    val metrics = new mutable.ArrayBuffer[SQLMetric]()
    collect(plan) {
Contributor:
we should use the normal collect. We don't need to get the SQLMetrics of already materialized query stages.
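For contrast, a sketch of the plain TreeNode collect the reviewer means (a later revision of this PR adopts this shape); query stage nodes are leaves, so an already-materialized stage's inner plan is not traversed:

```scala
// Plain TreeNode.collect over the current plan: gathers each node's own
// metrics without descending into materialized query stages.
plan.collect {
  case p: SparkPlan => p.metrics.values.toSeq
}.flatten
```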

@@ -151,6 +164,9 @@ case class AdaptiveSparkPlanExec(
      currentPhysicalPlan = result.newPlan
      if (result.newStages.nonEmpty) {
        stagesToReplace = result.newStages ++ stagesToReplace
        if (isSubquery) {
Contributor:
can we put the code in onUpdatePlan?

accumIdsToMetricType.map { case (accumulatorId, metricType) =>
  stages.foreach { stageId =>
    val liveStageMetric = stageMetrics.get(stageId)
    liveStageMetric.accumIdsToMetricType += (accumulatorId -> metricType)
Contributor:
how about

stageMetrics(stageId) = liveStageMetric.copy(accumIdsToMetricType = ...)

Contributor:
It's too hacky to make a UI data class mutable.

val SparkListenerSQLAdaptiveAccumUpdates(executionId, accumIdsToMetricType) = event

val stages = liveExecutions.get(executionId).stages
accumIdsToMetricType.map { case (accumulatorId, metricType) =>
Contributor:
Why do we need to loop? We can just update liveStageMetric.accumIdsToMetricType as:

liveStageMetric.copy(accumIdsToMetricType =
  liveStageMetric.accumIdsToMetricType ++ accumIdsToMetricType)

@cloud-fan (Contributor):
@JkSelf can you try it locally and post some screenshots in the PR description?

SparkQA commented Jan 21, 2020

Test build #117162 has finished for PR 27260 at commit 0c1650c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -132,6 +133,17 @@ case class AdaptiveSparkPlanExec(
    executedPlan.resetMetrics()
  }

  private def collectSQLMetrics(plan: SparkPlan): Seq[SQLMetric] = {
    val metrics = new mutable.ArrayBuffer[SQLMetric]()
    plan.collect {
Contributor:
nit: foreach

val metrics = new mutable.ArrayBuffer[SQLMetric]()
plan.collect {
  case p: SparkPlan =>
    p.metrics.map { case metric =>
Contributor:
ditto
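For the record, the foreach form suggested by the two nits above might look roughly like this (a sketch in the context of collectSQLMetrics; the final diff below uses plan.foreach as well):

```scala
// foreach appends into the metrics buffer directly, avoiding the
// intermediate collections that collect/map would build.
plan.foreach { p =>
  p.metrics.foreach { case (_, metric) => metrics += metric }
}
```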

@@ -484,12 +496,24 @@ case class AdaptiveSparkPlanExec(
   * Notify the listeners of the physical plan change.
   */
  private def onUpdatePlan(executionId: Long): Unit = {
    if (isSubquery) {
      onUpdateAccumulator(collectSQLMetrics(currentPhysicalPlan))
Contributor:
we can pass the executionId parameter to the onUpdateAccumulator method.

Contributor Author:
We cannot put this code in onUpdatePlan, because when the SQL is a subquery the executionId is None and onUpdatePlan is never called.

Contributor:
Then can we make it non-None, now that we have this if branch?

Contributor:
BTW, without setting the execution ID, onUpdatePlan will not be called at all when isSubquery == true.

Contributor Author:
updated.
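For context, the execution ID travels as a thread-local SparkContext property; here is a sketch of how it can be resolved so that the subquery path also posts UI updates (the getLocalProperty call appears in the diffs below; the helper name is an assumption):

```scala
// The SQL execution ID is stored as a local property on the SparkContext.
// Resolving it explicitly lets the subquery path reach onUpdatePlan too.
private def getExecutionId: Option[Long] = {
  Option(context.session.sparkContext
      .getLocalProperty(SQLExecution.EXECUTION_ID_KEY))
    .map(_.toLong)
}
```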

SparkQA commented Jan 21, 2020

Test build #117164 has finished for PR 27260 at commit 1dacc22.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val stages = liveExecutions.get(executionId).stages
stages.foreach { stageId =>
  val liveStageMetric = stageMetrics.get(stageId)
  stageMetrics.put(stageId, liveStageMetric.copy(
Contributor:
This can be called throughout the entire subquery execution, and by doing this "copy", we could lose metrics for previous stages of the subquery.

Member:
+1, some maps in LiveStageMetrics will be lost.

Contributor Author:
After discussion with @cloud-fan, we now update the SQL metrics in LiveExecutionData.metrics, so there is no need to copy here. Thanks.
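A sketch of the agreed approach on the listener side, merging the new metric descriptors into the live execution data instead of copying per-stage maps (the handler name and the exec.metrics field are assumptions based on this thread):

```scala
// Handle the new AQE event by appending the subquery's metric descriptors
// to the execution's metric list; no LiveStageMetrics copy is involved.
private def onAdaptiveSQLMetricUpdate(
    event: SparkListenerSQLAdaptiveSQLMetricUpdates): Unit = {
  val SparkListenerSQLAdaptiveSQLMetricUpdates(executionId, sqlPlanMetrics) = event
  Option(liveExecutions.get(executionId)).foreach { exec =>
    exec.metrics = exec.metrics ++ sqlPlanMetrics
    update(exec)
  }
}
```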

SparkQA commented Jan 22, 2020

Test build #117213 has finished for PR 27260 at commit c736ec0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 22, 2020

Test build #117220 has finished for PR 27260 at commit 33a9d3f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val sqlPlanMetrics = sqlMetrics.map { case sqlMetric =>
  SQLPlanMetric(sqlMetric.name.get, sqlMetric.id, sqlMetric.metricType)
}
val executionId = context.session.sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
Contributor:
this is not needed.

@@ -34,6 +34,11 @@ case class SparkListenerSQLAdaptiveExecutionUpdate(
    sparkPlanInfo: SparkPlanInfo)
  extends SparkListenerEvent

case class SparkListenerSQLAdaptiveAccumUpdates(

      SQLExecution.getQueryExecution(executionId).toString,
      SparkPlanInfo.fromSparkPlan(this)))
    if (isSubquery) {
      onUpdateAccumulator(collectSQLMetrics(currentPhysicalPlan), executionId)
cloud-fan (Contributor) commented Jan 22, 2020:
let's add some comments

When executing subqueries, we can't update the query plan in the UI as the UI doesn't support partial
update yet. However, the subquery may have been optimized into a different plan and we must let the
UI know the SQL metrics of the new plan nodes, so that it can track the valid accumulator updates later
and display SQL metrics correctly.

    }
  }

  private def onUpdateAccumulator(sqlMetrics: Seq[SQLMetric], executionId: Long): Unit = {
Contributor:
nit: onUpdateSQLMetrics

      SQLPlanMetric(sqlMetric.name.get, sqlMetric.id, sqlMetric.metricType)
    }
    val executionId = context.session.sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
    context.session.sparkContext.listenerBus.post(SparkListenerSQLAdaptiveAccumUpdates(
Contributor:
SparkListenerSQLAdaptiveSQLMetricUpdates

@JkSelf (Author) commented Jan 22, 2020

@cloud-fan Addressed the comments and updated the screenshots.

SparkQA commented Jan 22, 2020

Test build #117225 has finished for PR 27260 at commit c66d206.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 22, 2020

Test build #117229 has finished for PR 27260 at commit f939787.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val metrics = new mutable.ArrayBuffer[SQLMetric]()
plan.foreach {
  case p: ShuffleQueryStageExec if p.resultOption.isEmpty =>
    collectSQLMetrics(p.plan).foreach(metrics += _)
Contributor:
Can we either use map/flatMap, or pass this metrics array into collectSQLMetrics?
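A sketch of the flatMap-style alternative suggested above. Query stage nodes are leaves, so the recursion into p.plan is explicit; the catch-all case for other plan nodes is an assumption:

```scala
// collect yields a Seq[Seq[SQLMetric]] here; flatten merges the per-node results.
private def collectSQLMetrics(plan: SparkPlan): Seq[SQLMetric] = plan.collect {
  case p: ShuffleQueryStageExec if p.resultOption.isEmpty =>
    // Unmaterialized stage: include the metrics of its inner plan as well.
    collectSQLMetrics(p.plan) ++ p.metrics.values
  case p: SparkPlan =>
    p.metrics.values.toSeq
}.flatten
```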

    }
  }

  private def onUpdateSQLMetrics(sqlMetrics: Seq[SQLMetric], executionId: Long): Unit = {
Contributor:
nit: Why do we need the sqlMetrics: Seq[SQLMetric] parameter? Can't we just pass executionId? And IMO, this extra method isn't even necessary at this point.

SparkQA commented Jan 22, 2020

Test build #117234 has finished for PR 27260 at commit b1d7fcc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 22, 2020

Test build #117238 has finished for PR 27260 at commit 18df09a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class SparkListenerSQLAdaptiveSQLMetricUpdates(

@gatorsmile (Member):
Thanks! Merged to master.
