-
Notifications
You must be signed in to change notification settings - Fork 28.9k
SPARK-6541 - Sort executors by ID (numeric) #9165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be sortBy(_.id)
? and I assume it's already ints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, here, it's already an Int. Let me update for sortBy() instead of sortWith().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't use sortBy() due to toSeq (and so diverging implicit expansion for ordering). I keep sortWith() for now (or I have to change the toSeq statement).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you just need a toList
then? indeed a Seq
may not be random-access and so hard to sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen yes, it's what I thought. Let me update like this. Thanks (as usual ;))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen actually Scala's List
is a linked list. You may want toIndexedSeq
. In any case, I think all of them will dump the data into an array for sorting (even ArrayBuffer
), so it probably doesn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, though I wonder then why the Seq
didn't work. Not a big deal but I can't see a good semantic reason for sortBy
to not work but sortWith
to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that doesn't make sense to me either, I don't see any reason why toSeq
wouldn't cut it -- seems to compile for me ...
ok to test |
Test build #43951 has finished for PR 9165 at commit
|
I'm OK with this. Sorting executor IDs numerically seems like the intent anyway. I don't see a downside to sorting the display otherwise, other than the cost of sorting, but that's pretty trivial given the number of executors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are executor ids always guaranteed to be longs? If not, this could fail with a NumberFormatException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it's just occurred to me that the driver is an executor with ID "driver" for example. @jbonofre did that not actually cause a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use JS to trigger the sort in the frontend or to set a default sort order for the table? Are you worried about UI flickering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JoshRosen yes, it's an alternative. Let me update like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbonofre are you still working on this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked the usage of sorttable (from kryogenix) used in Spark UI, and actually, we have to sort server side.
Basically, using sorttable js at page loading is wrong. Here's the extract for the sorttable FAQ:
"
Sorting the table when the page is loaded
Lots of people ask, "how do I make sorttable sort the table the first time the page is loaded?" The answer is: you don't. Sorttable is about changing the HTML that is served from your server without a page refresh. When the page is first served from the server, you have to incur the wait for it to be served anyway. So, if you want the table sorted when a page is first displayed, serve the table in sorted order. Tables often come out of a database; get the data from the database in a sorted order with an ORDER BY clause in your SQL. Any solution which involves you running sorttable as soon as the page loads (i.e., without user input) is a wrong solution.
"
So, according to @JoshRosen (good catch Josh ;)), executor ID is not necessary a long (for instance, "driver" executor). Anyway, let me check if I can sort (alphanumeric) on server side, matching the sorttable sort ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that's not a great argument to me, as it means "you must implement sorting logic twice, always". It's not possible to trigger the JS after the table loads? in any event, it may not be so hard to sort both places, sure. But it's a weird sort: by number for things that can be numbers but then alphabetically otherwise. That has to be implemented two places. It may prove to be too much hassle, as in fact, executors aren't really numbers but only usually appear to be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree @srowen. I can call the sorttable sorting with something like onload. Let me try this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm testing the usage of:
<script>
{
Unparsed {
"""
| window.onload = function() {
| sorttable.innerSortFunction.apply(document.getElementById('executorid'), [])
| };
""".stripMargin
}
}
</script>
in the ExecutorTable, but also in the UIUtils generate table function (as used in stage view) to see if it works as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the innerSortFunction function requires latest sorttable version. Let me try to upgrade.
Update with client side executor ID sorting. This PR answers the original request, but I think it makes sense to sort all executor tables the same way. |
Test build #44766 has finished for PR 9165 at commit
|
Thoughts ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a big deal, but stripping whitespace on lines just adds to the PR noise and possibility of merge conflicts. If you can disable this it's much preferred.
Hm, by the end here it seems like 90% of the changes are whitespace changes. It's enough compared to the small functional change that I think it's worth re-doing. In the end how does this sort? numerically, but what about IDs like "driver"? does it sort on something like the leading prefix of digits? Does it still need server-side sorting if so? I see that's added but it just sorts lexicographically. |
Actually, I didn't change the file myself: I just upgraded to latest sorttable.js script (downloaded from the website). That explains the whitespace changes. I can try to clean it up. The ordering is only client-side (no change on the server side), and it use numeric/alphanumeric sort. As you can see in the Jira, the user would like ordering (as when he clicks on table header) instead of the "default" one. Thanks Sean ! |
Oh OK, then let's keep the whitespace changes, if they're part of the upstream code we're including. I wouldn't change it then. There is still a sorting change in
|
Let me prepare couple of screenshots to illustrate the sort order. |
PR rebased/cleaned up/updated. |
Yeah I'm still curious where "driver" would sort for example? I think it's clear that sorting numbers as numbers is OK but I don't see how this handles non-numeric IDs. Any reasonable behavior seems fine; just want to make sure it works. |
Yeah, I understand what you mean. Let me simulate the case and attach a screenshot. Thanks again Sean ! |
Test build #45313 has finished for PR 9165 at commit
|
PR rebased. Test failure seems related to flaky tests (Hive). Anyway, I'm trying on my box. |
Test build #45348 has finished for PR 9165 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are indented too much. Also please uses spaces where possible
@jbonofre actually I do mind the whitespace changes. Elsewhere we try to use 2 space indentations in JavaScript instead of tabs. This PR currently makes the style inconsistent everywhere. Would you mind fixing these manually? |
Test build #45408 has finished for PR 9165 at commit
|
PR rebased |
Test build #46064 has finished for PR 9165 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbonofre wanted to triple-check on this -- is this change intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad Sean, you are right: it's not intended (resulting upgrade to latest sorttable version). Let me fix that.
Rebase |
Test build #2074 has finished for PR 9165 at commit
|
Test build #46085 has finished for PR 9165 at commit
|
Test build #2078 has finished for PR 9165 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, github says this line changed but I don't see it. I'm worried it's a difference in some unprintable character that will throw this off. Any way to make sure this line really didn't change? copy and paste from the previous version or is that what you did?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me checkout from master and change just the innerStoreFunction.
Cleanup and rebase |
Test build #46094 has finished for PR 9165 at commit
|
All looks good now. And it still works ? :) |
Lol ;) Let me prepare a screenshot ;) |
Rebase |
It's looking good. If there are no further comments I'll merge shortly. |
Awesome, thanks Sean ! |
Test build #46104 has finished for PR 9165 at commit
|
"Force" the executor ID sort with Int. Author: Jean-Baptiste Onofré <jbonofre@apache.org> Closes #9165 from jbonofre/SPARK-6541. (cherry picked from commit e62820c) Signed-off-by: Sean Owen <sowen@cloudera.com>
"Force" the executor ID sort with Int.