[SPARK-31440][SQL] Improve SQL Rest API #28208
Test build #121221 has finished for PR 28208 at commit
Test build #121223 has finished for PR 28208 at commit
Test build #121546 has finished for PR 28208 at commit
Test build #121547 has finished for PR 28208 at commit
Last build failure seems irrelevant.
retest this please
sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/api.scala
sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
Hi @erenavsarogullari Thanks for the work.
Hi @gengliangwang,
I see.
Test build #121560 has finished for PR 28208 at commit
@erenavsarogullari I meant we can expose the edges, which seems better.
@gengliangwang Could you please share a sample json?
Test build #121597 has finished for PR 28208 at commit
Test build #121598 has finished for PR 28208 at commit
Hi @gengliangwang, Thanks again for the review. Also, I created documentation for the SQL Rest API as the follow-up for this work - PR: #28354. Just FYI.
Hi @erenavsarogullari, sorry for the late reply. I think we can output the whole `SparkPlanGraph` in json format:
```diff
@@ -146,4 +146,6 @@ class SparkPlanGraphNodeWrapper(
 case class SQLPlanMetric(
     name: String,
     accumulatorId: Long,
-    metricType: String)
+    metricType: String,
+    nodeId: Option[Long] = None,
```
shall we use the node name and id in SparkPlanGraphNode instead?
The current implementation uses `SQLExecutionUIData` and exposes `job` and `physicalPlanDescription` details through it. However, `SQLExecutionUIData` does not have `SparkPlanGraphNode` data; it has `SQLPlanMetric`, if that makes sense.
As long as we can access the `sqlStore` in `SqlResource.scala`, we can get the corresponding `SparkPlanGraphNode`.
Both `nodeId: Option[Long] = None` and `nodeName: Option[String] = None` were removed from `SQLPlanMetric`; both are now provided by `sqlStore.planGraph(executionId)` instead.
Test build #122184 has finished for PR 28208 at commit
@gengliangwang Yes, it makes sense.
Also, I think this PR can be useful for the community if it can be covered by
Test build #122420 has finished for PR 28208 at commit
Test build #122424 has finished for PR 28208 at commit
Test build #122455 has finished for PR 28208 at commit
Test build #122457 has finished for PR 28208 at commit
```diff
 }

 private def prepareExecutionData(exec: SQLExecutionUIData, details: Boolean): ExecutionData = {
+private def getNodeIdAndWSCGIdMap(graph: SparkPlanGraph): Map[Long, Option[Long]] = {
```
I think the WSCG node name contains the index. Why do we need to get the mapping here?
A WSCG node has both a `nodeId` and a `WSCG index`, as shown below. The `WSCG index` comes as part of the WSCG `nodeName` and needs to be parsed before populating the other nodes' `wholeStageCodegenId` attribute, so this map is useful for computation and readability by reducing complexity. Also, I cleaned up the existing implementation by covering this part. Please find the final patch: 2743296

```json
{
  "nodeId": 2,
  "nodeName": "WholeStageCodegen (1)",
  "metrics": [...]
}
```
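The index-parsing step discussed above can be sketched as a stand-alone example. This is illustrative only: the class name, regex, and return type are assumptions, not the PR's actual helper.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WscgIndexParser {
    // Assumed format of a WSCG node name, matching the JSON sample above:
    // "WholeStageCodegen (1)" -- the WSCG index sits in the parentheses.
    private static final Pattern WSCG_NAME =
        Pattern.compile("WholeStageCodegen \\((\\d+)\\)");

    // Returns the WSCG index if the node name matches, otherwise empty.
    static Optional<Long> wscgIndex(String nodeName) {
        Matcher m = WSCG_NAME.matcher(nodeName);
        return m.matches() ? Optional.of(Long.parseLong(m.group(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(wscgIndex("WholeStageCodegen (1)")); // Optional[1]
        System.out.println(wscgIndex("Filter"));                // Optional.empty
    }
}
```

Nodes whose names do not match (regular physical operators) simply get no `wholeStageCodegenId`, which mirrors the `Map[Long, Option[Long]]` shape of the helper under review.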
@erenavsarogullari +1 for putting this in 3.0
```scala
metricValues.get(accumulatorId).map( mv => {
  val metricValue = if (mv.startsWith("\n")) mv.substring(1, mv.length) else mv
  Metric(metricName, metricValue)
```
Actually, after #28037, part of the metric's name is in the value...
I can see the json output like:

```json
{
  "nodeId": 9,
  "nodeName": "WholeStageCodegen (1)",
  "metrics": [
    {
      "name": "duration",
      "value": "total (min, med, max (stageId: taskId))\n4.3 s (269 ms, 270 ms, 272 ms (stage 3.0: task 16))"
    }
  ]
}
```

But this is not strongly related to this PR. We can just fix it in another PR.
Yes, sounds good.
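The reviewed Scala snippet drops a leading `\n` from a metric value; combined with improvement 5 from the PR description (removing line breakers from `metricValue`), the cleanup can be sketched as below. The `clean` helper name and the `", "` separator are illustrative assumptions, not the PR's exact code.

```java
public class MetricValueCleaner {
    // Mirrors the reviewed snippet: drop a leading "\n", then replace any
    // remaining line breakers so the exposed value stays a single line.
    static String clean(String mv) {
        String v = mv.startsWith("\n") ? mv.substring(1) : mv;
        return v.replaceAll("\n", ", ");
    }

    public static void main(String[] args) {
        System.out.println(clean("\ntotal (min, med, max)\n4.3 s (269 ms, 270 ms, 272 ms)"));
        // total (min, med, max), 4.3 s (269 ms, 270 ms, 272 ms)
    }
}
```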
Test build #122769 has finished for PR 28208 at commit
Test build #122771 has finished for PR 28208 at commit
```diff
 @GET
 def sqlList(
-    @DefaultValue("false") @QueryParam("details") details: Boolean,
+    @DefaultValue("true") @QueryParam("details") details: Boolean,
+    @DefaultValue("true") @QueryParam("planDescription") planDescription: Boolean,
```
@erenavsarogullari one last comment: why do we have the extra option "planDescription" here? It seems reasonable for it to be covered by the option "details".
@gengliangwang Thanks for the review.
Please find my comments as follows:
1- `planDescription` exposes the Physical Plan, which covers dataset column names. Column names can be considered customer-sensitive data, so with this option end-users can disable it in light of their use-cases while still accessing metrics.
2- For complex queries, `planDescription` can be a big string and create network overhead. In this case, it can be disabled where it is not required but metrics are (e.g. time-series monitoring, where metrics need to be persisted & exposed but the Physical Plan does not), if that makes sense.
LGTM except one last question. Thanks for the work!
Thanks, merging to master/3.0
### What changes were proposed in this pull request?
The SQL Rest API exposes query execution metrics as a Public API. This PR aims to apply the following improvements on the SQL Rest API by aligning with the Spark UI.

**Proposed Improvements:**
1- Support Physical Operations and group metrics per physical operation, aligning with the Spark UI.
2- Support `wholeStageCodegenId` for Physical Operations.
3- `nodeId` can be useful for grouping metrics and sorting physical operations (according to execution order) to differentiate the same operators (if used multiple times during the same query execution) and their metrics.
4- Filter `empty` metrics, aligning with the Spark UI - SQL Tab. Currently, the Spark UI does not show empty metrics.
5- Remove line breakers (`\n`) from `metricValue`.
6- `planDescription` can be an `optional` HTTP parameter to avoid network cost, especially where complex jobs create big plans.
7- The `metrics` attribute needs to be exposed at the bottom, after `nodes`. This can be especially useful for the user where the `nodes` array size is high.
8- The `edges` attribute is exposed to show the relationship between `nodes`.
9- Reverse order on `metricDetails` aims to match the Spark UI by following the Physical Operators' execution order.

### Why are the changes needed?
The proposed improvements provide a more useful (e.g. physical operations and metrics correlation, grouping) and clearer (e.g. filtering blank metrics, removing line breakers) result for the end-user.

### Does this PR introduce any user-facing change?
Yes. Please find both current and improved versions of the results as attached for the following SQL Rest Endpoint:
```
curl -X GET http://localhost:4040/api/v1/applications/$appId/sql/$executionId?details=true
```
**Current version:** https://issues.apache.org/jira/secure/attachment/12999821/current_version.json
**Improved version:** https://issues.apache.org/jira/secure/attachment/13000621/improved_version.json

### Backward Compatibility
The SQL Rest API will start to be exposed with `Spark 3.0`, and `3.0.0-preview2` (released on 12/23/19) does not cover this API, so if the PR can catch the 3.0 release, this will not have any backward compatibility issue.

### How was this patch tested?
1. New unit tests are added.
2. Also, the patch has been tested manually through both **Spark Core** and **History Server** Rest APIs.

Closes #28208 from erenavsarogullari/SPARK-31440.
Authored-by: Eren Avsarogullari <eren.avsarogullari@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
(cherry picked from commit ab4cf49)
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
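As an illustration of the improved shape (not a reproduction of the attached JSON files), a response for one execution might look roughly like this; the field names follow the improvements and JSON samples discussed above, and the concrete values are made up:

```json
{
  "id": 0,
  "planDescription": "...",
  "nodes": [
    {
      "nodeId": 2,
      "nodeName": "WholeStageCodegen (1)",
      "wholeStageCodegenId": 1,
      "metrics": [
        { "name": "duration", "value": "total (min, med, max)" }
      ]
    }
  ],
  "edges": [
    { "fromId": 2, "toId": 1 }
  ]
}
```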
@erenavsarogullari I think we have to revert this one in branch-3.0. See my comments in #28588
### What changes were proposed in this pull request?
Revert #28208 and #24076 in branch 3.0.

### Why are the changes needed?
Unfortunately, the PR #28208 was merged after the Spark 3.0 RC2 cut. Although the improvement is great, we can't break the policy and add new improvement commits into branch 3.0 now. Also, if we are going to adopt the improvement in a future release, we should not release 3.0 with #24076, since the API result will be changed. After discussing with cloud-fan and gatorsmile offline, we think the best choice is to revert both commits and follow the community release policy.

### Does this PR introduce _any_ user-facing change?
Yes, let's hold the SQL rest API until the next release.

### How was this patch tested?
Jenkins unit tests.

Closes #28588 from gengliangwang/revertSQLRestAPI.
Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…rException

### What changes were proposed in this pull request?
Added a null check for `exec.metricValues`.

### Why are the changes needed?
When requesting the Restful API {baseURL}/api/v1/applications/$appId/sql/$executionId, which was introduced by PR #28208, it can cause a NullPointerException. The root cause is that, when calling method doUpdate() of `LiveExecutionData`, `metricsValues` can be null. Then, when the statement `printableMetrics(graph.allNodes, exec.metricValues)` is executed, it will throw a NullPointerException.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested manually.

Closes #35884 from yym1995/fix-npe.
Lead-authored-by: Yimin <yimin.y@outlook.com>
Co-authored-by: Yimin Yang <26797163+yym1995@users.noreply.github.com>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
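The null guard described in this fix can be sketched in isolation as follows. Java is used for illustration, and the method and class names are hypothetical stand-ins for the Scala originals, not the actual patch.

```java
import java.util.Collections;
import java.util.Map;

public class MetricValuesGuard {
    // Hypothetical stand-in for exec.metricValues, which may still be null
    // before LiveExecutionData.doUpdate() has populated it: fall back to an
    // empty map so downstream metric printing never dereferences null.
    static Map<Long, String> safeMetricValues(Map<Long, String> metricValues) {
        return metricValues == null ? Collections.emptyMap() : metricValues;
    }

    public static void main(String[] args) {
        System.out.println(safeMetricValues(null).size()); // 0
    }
}
```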