[SPARK-28935][SQL][DOCS] Document SQL metrics for Details for Query Plan #25658

Closed
wants to merge 2 commits into from

viirya
Member

@viirya viirya commented Sep 3, 2019

What changes were proposed in this pull request?

This patch adds descriptions of common SQL metrics to the web UI documentation.

Why are the changes needed?

The current web UI documentation describes the query plan but does not explain the meaning of the SQL metrics, so end users might not understand them.

Does this PR introduce any user-facing change?

No. This is just a documentation change.

How was this patch tested?

Built the docs locally.

image

@SparkQA

SparkQA commented Sep 3, 2019

Test build #110033 has finished for PR 25658 at commit ed6017c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Sep 3, 2019

cc @gatorsmile @dilipbiswal

docs/web-ui.md Outdated
The metrics of SQL operators show in the block of operators. The SQL metrics can be useful when
we want to dive into the execution details of each operator, for example, how many rows are output
after a Filter operator. The related metrics are different for each type of operator, for example
Exchange has the metrics called "shuffle bytes writte total" which shows the number of bytes written
Contributor

Nit: written ?
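
For readers unfamiliar with what a per-operator metric is, the idea in the paragraph under review (for example, how many rows are output after a Filter operator) can be illustrated with a toy model. This is only a sketch in plain Python, not Spark code and not how Spark implements its metrics: the `FilterOp` class, the `run_example` helper, and the metric label used as the dictionary key are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator


@dataclass
class FilterOp:
    """Toy stand-in for a Filter operator (hypothetical, not Spark's class)."""

    predicate: Callable[[int], bool]
    # Metrics belong to the operator instance, keyed by a display label.
    # The label below is an illustrative assumption.
    metrics: Dict[str, int] = field(
        default_factory=lambda: {"number of output rows": 0}
    )

    def execute(self, rows: Iterable[int]) -> Iterator[int]:
        for row in rows:
            if self.predicate(row):
                # Only rows that pass the predicate count as output.
                self.metrics["number of output rows"] += 1
                yield row


def run_example() -> Dict[str, int]:
    op = FilterOp(predicate=lambda x: x % 2 == 0)
    # Drain the iterator; the metric is only updated as rows flow through.
    list(op.execute(range(10)))
    return op.metrics
```

Because the counter is updated lazily as rows are consumed, the value is only meaningful after the operator has actually run, which is also why such metrics are read from a completed or running query rather than from the plan alone.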

docs/web-ui.md Outdated
<tr><td> <code>metadata time</code> </td><td> the time spent on getting metadata like number of partitions, number of files </td><td> FileSourceScan </td></tr>
<tr><td> <code>shuffle bytes written</code> </td><td> number of bytes written </td><td> CollectLimit, TakeOrderedAndProject, ShuffleExchange </td></tr>
<tr><td> <code>shuffle records written</code> </td><td> number of records written </td><td> CollectLimit, TakeOrderedAndProject, ShuffleExchange </td></tr>
<tr><td> <code>shuffle write time</code> </td><td> the time on shuffle writing </td><td> CollectLimit, TakeOrderedAndProject, ShuffleExchange </td></tr>
Contributor

Nit: the time spent on writing shuffle data ?

docs/web-ui.md Outdated
@@ -363,6 +363,42 @@ number of written shuffle records, total data size, etc.
Clicking the 'Details' link on the bottom displays the logical plans and the physical plan, which
illustrate how Spark parses, analyzes, optimizes and performs the query.

### SQL metrics

The metrics of SQL operators show in the block of operators. The SQL metrics can be useful when
Contributor

@viirya I'm a little confused by "show in the block of operators". Is there a way to reword this?

@dilipbiswal
Contributor

@viirya Looks good to me. I have some minor comments :-)

@gatorsmile
Member

LGTM after a few minor updates!

@viirya
Member Author

viirya commented Sep 6, 2019

thanks for updating! @gatorsmile

@SparkQA

SparkQA commented Sep 6, 2019

Test build #110263 has finished for PR 25658 at commit 6c7037a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile gatorsmile closed this in 89aba69 Sep 6, 2019
@gatorsmile
Member

Thanks! Merged to master.

PavithraRamachandran pushed a commit to PavithraRamachandran/spark that referenced this pull request Sep 15, 2019
### What changes were proposed in this pull request?

This patch adds descriptions of common SQL metrics to the web UI documentation.

### Why are the changes needed?

The current web UI documentation describes the query plan but does not explain the meaning of the SQL metrics, so end users might not understand them.

### Does this PR introduce any user-facing change?

No. This is just a documentation change.

### How was this patch tested?

Built the docs locally.

![image](https://user-images.githubusercontent.com/11567269/64463485-1583d800-d0b9-11e9-9916-141f5c09f009.png)

Closes apache#25658 from viirya/SPARK-28935.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Xiao Li <gatorsmile@gmail.com>
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
@viirya viirya deleted the SPARK-28935 branch December 27, 2023 18:22