[SPARK-27142][SQL] Provide REST API for SQL information #24076
@ajithme Thanks for the work.
@gengliangwang So the full API looks like
is this the expectation?
Yes, it is consistent with
OK, I have updated the PR accordingly. Please review.
ok to test
Test build #103450 has started for PR 24076 at commit
gentle ping @gengliangwang @dongjoon-hyun @srowen
As I say, I'm not sure this is something we should expose this way. See prior comments on JIRA.
@srowen I have already updated the use case in the JIRA in response to the prior comments. Please have a look.
gentle ping @cloud-fan @srowen @dongjoon-hyun @gengliangwang Any further suggestions on this feature?
@ajithme I am +1 for this.
Thanks @gengliangwang I got confused between
Updated with the latest comments fixed. Please review.
Test build #103950 has finished for PR 24076 at commit
retest this please.
Test build #103965 has finished for PR 24076 at commit
@vanzin @srowen @dongjoon-hyun gentle ping. Can we take another look at this? It's been open for quite some time.
This seems to be missing the plan description and metrics. Any specific reason not to expose those?
@vanzin I wanted this API to be lightweight for monitoring. Returning the plan and metrics can significantly increase the payload. Please share your thoughts.
Yes, but it's also useful information and there would be no way to retrieve it from the API... In our APIs we generally have a query parameter that tells you what level of detail you have in the response, so you can have a small summary or a full view of an object, for example. That sounds like something we could add here. I even did something similar in the
Thanks @vanzin for the input. I agree with your opinion and will update the PR.
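The summary-versus-full-view idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the type names `MetricView` and `ExecutionView`, the `render` helper, and the `details` flag are all hypothetical and are not taken from the PR or from Spark's API.

```scala
// Hypothetical types for illustration; names are not taken from the PR.
final case class MetricView(metricName: String, metricValue: String)

final case class ExecutionView(
    id: Long,
    status: String,
    description: String,
    planDescription: String,
    metrics: Seq[MetricView])

object ExecutionView {
  // `details` stands in for an assumed boolean query parameter.
  // When false, the heavy fields are blanked so the payload stays small.
  def render(exec: ExecutionView, details: Boolean): ExecutionView =
    if (details) exec
    else exec.copy(planDescription = "", metrics = Seq.empty)
}
```

With a flag like this, a monitoring client could poll the small summary by default and request the plan and metrics only when it needs them.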
@vanzin Thank you for the review, and I apologise for the indent issues. Now I have run
Few minor things.
@vanzin please review, updated as per the comments.
Weird, tests aren't running.
ok to test
Test build #116509 has finished for PR 24076 at commit
@vanzin gentle ping
Merging to master.
### What changes were proposed in this pull request?
Revert #28208 and #24076 in branch 3.0.
### Why are the changes needed?
Unfortunately, PR #28208 was merged after the Spark 3.0 RC 2 cut. Although the improvement is great, we can't break the policy by adding new improvement commits to branch 3.0 now. Also, if we are going to adopt the improvement in a future release, we should not release 3.0 with #24076, since the API result will change. After discussing with cloud-fan and gatorsmile offline, we think the best choice is to revert both commits and follow the community release policy.
### Does this PR introduce _any_ user-facing change?
Yes, let's hold the SQL REST API until the next release.
### How was this patch tested?
Jenkins unit tests.
Closes #28588 from gengliangwang/revertSQLRestAPI.
Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
Currently, SQL information for monitoring a Spark application is available only via the UI, not via REST. REST provides only applications, jobs, stages, and environment. This JIRA aims to provide a REST API that exposes SQL-level information.
A single SQL query can result in multiple jobs. So for an end user who is using STS (Spark Thrift Server) or spark-sql, the intended highest level of probe is the SQL they have executed. This information can be seen in the SQL tab. Attaching a sample.
![image](https://user-images.githubusercontent.com/22072336/54298729-5524a800-45df-11e9-8e4d-b99a8b882031.png)
However, the user cannot access the same information through the REST API exposed by Spark and always has to rely on the jobs API, which may be difficult. So I intend to expose the information seen in the SQL tab of the UI via the REST API.
Mainly:
id : Long - execution id of the SQL
status : String - possible values COMPLETED/RUNNING/FAILED
description : String - executed SQL string
planDescription : String - plan representation
metrics : Seq[Metrics] - Metrics contain metricName: String, metricValue: String
submissionTime : String - formatted Date time of SQL submission
duration : Long - total run time in milliseconds
runningJobIds : Seq[Int] - sequence of running job ids
failedJobIds : Seq[Int] - sequence of failed job ids
successJobIds : Seq[Int] - sequence of success job ids
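As a rough illustration, the fields listed above could be modelled in Scala along these lines. This is a minimal sketch, not the PR's actual code: the `JobState` hierarchy, the `fromJobs` helper, and the rule used to derive `status` from the job states are assumptions made for the example.

```scala
// Hypothetical job states; Spark's own job status type is not used here.
sealed trait JobState
case object Running extends JobState
case object Succeeded extends JobState
case object Failed extends JobState

final case class Metrics(metricName: String, metricValue: String)

// Field names and types follow the list in the PR description.
final case class ExecutionData(
    id: Long,
    status: String,
    description: String,
    planDescription: String,
    metrics: Seq[Metrics],
    submissionTime: String,
    duration: Long,
    runningJobIds: Seq[Int],
    failedJobIds: Seq[Int],
    successJobIds: Seq[Int])

object ExecutionData {
  // Collect the ids of jobs in a given state, sorted for stable output.
  private def idsIn(jobs: Map[Int, JobState], state: JobState): Seq[Int] =
    jobs.collect { case (jobId, s) if s == state => jobId }.toSeq.sorted

  // Assumed helper: builds one execution entry from the jobs a query spawned.
  def fromJobs(
      id: Long,
      description: String,
      submissionTime: String,
      duration: Long,
      jobs: Map[Int, JobState]): ExecutionData = {
    val running = idsIn(jobs, Running)
    val failed = idsIn(jobs, Failed)
    val succeeded = idsIn(jobs, Succeeded)
    // Assumed rule: any failed job marks the execution FAILED,
    // otherwise any running job marks it RUNNING, otherwise COMPLETED.
    val status =
      if (failed.nonEmpty) "FAILED"
      else if (running.nonEmpty) "RUNNING"
      else "COMPLETED"
    ExecutionData(id, status, description, planDescription = "",
      metrics = Seq.empty, submissionTime, duration, running, failed, succeeded)
  }
}
```

Grouping the job ids by state in one place reflects the point made in the description that a single SQL query can result in multiple jobs, so the SQL execution is the natural unit to report on.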
Why are the changes needed?
To let users query SQL information via the REST API.
Does this PR introduce any user-facing change?
Yes. It provides a new monitoring URL for SQL
How was this patch tested?
Tested manually