New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-31081][UI][SQL] Make the display of stageId/stageAttemptId/taskId of sql metrics configurable in UI #27849
Conversation
ping @cloud-fan @gengliangwang @nartal1 Please take a look, thanks! |
Test build #119534 has finished for PR 27849 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @Ngone51 .
The PR title Make SQLMetrics more readable from UI
sounds too broad. Could you revise more specifically?
acc | ||
} | ||
|
||
def createNanoTimingMetric(sc: SparkContext, name: String): SQLMetric = { | ||
// Same with createTimingMetric, just normalize the unit of time to millisecond. | ||
val acc = new SQLMetric(NS_TIMING_METRIC, -1) | ||
acc.register(sc, name = Some(s"$name total (min, med, max (stageId (attemptId): taskId))"), | ||
countFailedValues = false) | ||
acc.register(sc, name = Some(s"name"), countFailedValues = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: fix "name"
Thank you for narrowing down title, @Ngone51 . |
buildStaticConf("spark.sql.ui.displayTaskInfoForMaxMetric") | ||
.doc("If turn on, Spark will display stageId-stageAttemptId-taskId of the max metrics to " + | ||
"tell where the max value comes from. It's useful to help debug job quicker.") | ||
.version("3.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ngone51 .
Since SPARK-31081 is filed as an Improvement
, this should be 3.1.0
.
If you want to consider this as a bug or regression at 3.0.0, you had better change JIRA first.
cc @rxin since he is the release manager of 3.0.0 (also cc @gatorsmile ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a UI issue to me. The SQL web UI is really hard to read now with the stage id stuff. Ideally, we should revisit #26843 and think of a better way to improve readability. But no one has proposed an idea yet.
This PR disables the stage id stuff, which seems like a good compromise for 3.0: we keep the UI unchanged, but for people who really need the stage id info, they can still enable it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for making this 3.0. The UI changes in #26843 can be confusing to users after 3.0 release.
@@ -116,8 +117,8 @@ object SQLMetrics { | |||
// data size total (min, med, max): | |||
// 100GB (100MB, 1GB, 10GB) | |||
val acc = new SQLMetric(SIZE_METRIC, -1) | |||
acc.register(sc, name = Some(s"$name total (min, med, max (stageId (attemptId): taskId))"), | |||
countFailedValues = false) | |||
acc.register(sc, name = Some(s"$name total (min, med, ${attachTaskId("max", sc.conf)})"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new indentation looks incorrect at this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noted that, thanks.
@@ -233,4 +228,13 @@ object SQLMetrics { | |||
SparkListenerDriverAccumUpdates(executionId.toLong, metrics.map(m => m.id -> m.value))) | |||
} | |||
} | |||
|
|||
private def attachTaskId(name: String, conf: SparkConf): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are attaching stageId
and attemptId
, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know. I'm thinking about a better name, while adding stageId/attemptId makes me feel too long for the method name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attachTaskInfo
? stage id is part of the task info which tells which stage the task comes from.
Test build #119674 has finished for PR 27849 at commit
|
@@ -135,6 +135,14 @@ object StaticSQLConf { | |||
.intConf | |||
.createWithDefault(1000) | |||
|
|||
val DISPLAY_TASK_ID_FOR_MAX_METRIC = | |||
buildStaticConf("spark.sql.ui.displayTaskInfoForMaxMetric") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we make it an internal conf?
cc @tgravescs as well |
Sorry, I disagree. I understand that it adds more clutter and am definitely open to improvements, but I think it does bring valuable information that once people realize is there will use it. This entire ui is for debugging purposes so I don't get your argument with that. If a user isn't actively looking to see what went on with their job why would they go to this UI at all. Can we leave on by default or don't display and have a checkbox like we have on other tables? (disclaimer - I haven't looked how hard that would be in code to add the checkbox) |
BTW thanks @gengliangwang for adding me, it would have been nice if I was added right away since I added this feature that you are now trying to take back out. |
I agree that UI is mostly for debugging purposes. But please also note that this is living UI(rather than SHS UI). And living UI may shutdown at any time before user figure out the root cause (except those long running applications). So, personally, I aways need to restart and insert And if they really want to see those ids at any time, they can just turn it on always. Or if we can have better way to display ids, then we can turn it on by default. And here's my previous propose to display ids in case you would like it:
Sorry, this is my bad. I'll take care in the future. |
The UI is mostly for debugging but the stage id stuff is not always needed. At least this has been distracting me multiple times when I try to debug something with the SQL UI. The new proposal from @Ngone51 looks good, it's much less distracting to have a short |
@tgravescs as @gatorsmile and I mentioned in #26843 (comment), the new change is not readable. I prefer to hide the slowest task as well.
I like the idea of "don't display and have a checkbox like we have on other tables". @Ngone51 will you consider that? |
I checked with @Ngone51 offline. He is fine with the checkbox idea. But he is not very familiar with the front-end work. I am also busy with other priorities recently. @sarutak @HeartSaVioR would you like to take this task (create a checkbox for hild/show the max stage/stage attamp/task id in the graph) over? |
Thanks for pinging me and I'm sorry I'm not familiar with front-end work as well - that's why I only had to skim the front-end side while reviewing structured streaming UI PR. |
the problem with a config is that if you turn it off, especially by default most users won't see it to know its even an option. Perhaps I should ask everyone this - when you were looking at the UI, why were you looking at it? What values in there were you using for instance? A lot of people have asked me ok, I did this groupby or join and I see it in that box, but how do I know what stage that ran in..... this at least helps with that question. So how do you do that now? my previous answer was you had to click through a lot of jobs/stages to figure it out, which is very time consuming. |
@gengliangwang I'll try to add the checkbox. |
…of sql metrics toggleable ### What changes were proposed in this pull request? This is another solution for `SPARK-31081` and #27849 . I added a checkbox which can toggle display of stageId/taskid in the SQL's DAG page. Mainly, I implemented the toggleable texts in boxes with HTML label feature provided by `dagre-d3`. The additional metrics are enclosed by `<span>` and control the appearance of the text. But the exception is additional metrics in clusters. We can use HTML label for cluster but layout will be broken so I choosed normal text label for clusters. Due to that, this solution contains a little bit tricky code in`spark-sql-viz.js` to manipulate the metric texts and generate DOMs. ### Why are the changes needed? It makes metrics harder to read after #26843 and user may not interest in extra info(stageId/StageAttemptId/taskId ) when they do not need debug. #27849 control the appearance by a new configuration property but providing a checkbox is more flexible. ### Does this PR introduce any user-facing change? Yes. [Additional metrics shown] ![with-checked](https://user-images.githubusercontent.com/4736016/77244214-0f6cd780-6c56-11ea-9275-a30758dd5339.png) [Additional metrics hidden] ![without-chedked](https://user-images.githubusercontent.com/4736016/77244219-14ca2200-6c56-11ea-9874-33a466085fce.png) ### How was this patch tested? Tested manually with a simple DataFrame operation. * The appearance of additional metrics in the boxes are controlled by the newly added checkbox. * No error found with JS-debugger. * Checked/not-checked state is preserved after reloading. Closes #27927 from sarutak/SPARK-31081. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
…of sql metrics toggleable ### What changes were proposed in this pull request? This is another solution for `SPARK-31081` and #27849 . I added a checkbox which can toggle display of stageId/taskid in the SQL's DAG page. Mainly, I implemented the toggleable texts in boxes with HTML label feature provided by `dagre-d3`. The additional metrics are enclosed by `<span>` and control the appearance of the text. But the exception is additional metrics in clusters. We can use HTML label for cluster but layout will be broken so I choosed normal text label for clusters. Due to that, this solution contains a little bit tricky code in`spark-sql-viz.js` to manipulate the metric texts and generate DOMs. ### Why are the changes needed? It makes metrics harder to read after #26843 and user may not interest in extra info(stageId/StageAttemptId/taskId ) when they do not need debug. #27849 control the appearance by a new configuration property but providing a checkbox is more flexible. ### Does this PR introduce any user-facing change? Yes. [Additional metrics shown] ![with-checked](https://user-images.githubusercontent.com/4736016/77244214-0f6cd780-6c56-11ea-9275-a30758dd5339.png) [Additional metrics hidden] ![without-chedked](https://user-images.githubusercontent.com/4736016/77244219-14ca2200-6c56-11ea-9874-33a466085fce.png) ### How was this patch tested? Tested manually with a simple DataFrame operation. * The appearance of additional metrics in the boxes are controlled by the newly added checkbox. * No error found with JS-debugger. * Checked/not-checked state is preserved after reloading. Closes #27927 from sarutak/SPARK-31081. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com> (cherry picked from commit 999c9ed) Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
…of sql metrics toggleable ### What changes were proposed in this pull request? This is another solution for `SPARK-31081` and apache#27849 . I added a checkbox which can toggle display of stageId/taskid in the SQL's DAG page. Mainly, I implemented the toggleable texts in boxes with HTML label feature provided by `dagre-d3`. The additional metrics are enclosed by `<span>` and control the appearance of the text. But the exception is additional metrics in clusters. We can use HTML label for cluster but layout will be broken so I choosed normal text label for clusters. Due to that, this solution contains a little bit tricky code in`spark-sql-viz.js` to manipulate the metric texts and generate DOMs. ### Why are the changes needed? It makes metrics harder to read after apache#26843 and user may not interest in extra info(stageId/StageAttemptId/taskId ) when they do not need debug. apache#27849 control the appearance by a new configuration property but providing a checkbox is more flexible. ### Does this PR introduce any user-facing change? Yes. [Additional metrics shown] ![with-checked](https://user-images.githubusercontent.com/4736016/77244214-0f6cd780-6c56-11ea-9275-a30758dd5339.png) [Additional metrics hidden] ![without-chedked](https://user-images.githubusercontent.com/4736016/77244219-14ca2200-6c56-11ea-9874-33a466085fce.png) ### How was this patch tested? Tested manually with a simple DataFrame operation. * The appearance of additional metrics in the boxes are controlled by the newly added checkbox. * No error found with JS-debugger. * Checked/not-checked state is preserved after reloading. Closes apache#27927 from sarutak/SPARK-31081. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
Closing in favour of #27927 |
What changes were proposed in this pull request?
This PR adds a conf to make the display of
stageId/StageAttemptId/taskId
of SQL metrics configurable in UI. And also propose to not display them by default.Why are the changes needed?
It makes metrics harder to read after #26843 and user may not interest in extra info(
stageId/StageAttemptId/taskId
) when they do not need debug. So, I think we need to a way to make the display configurable.Does this PR introduce any user-facing change?
Yes.
Before this PR:
After this PR without
stageId/StageAttemptId/taskId
:How was this patch tested?
Updated
SQLMetricsSuite
and tested manually.