[SPARK-2193] Improve tasks preferred locality by sorting tasks partial or... #1131
li-zhihui wants to merge 1 commit into apache:master
Can one of the admins verify this patch?
A preferred task for one worker can be picked up by another worker (at process, node, or rack level) only if both workers are at the same locality level for that task. In that case it is irrelevant which worker picks it up, since both are at the same locality level. I am probably missing why this is required?
@mridulm
Hi @li-zhihui, Sorry for allowing this to sit unreviewed for so long. To check my understanding, the original issue was that an individual executor's pending task queue might have non-preferred tasks that appear ahead of preferred ones in the queue? It looks like this might have been partially addressed by #1313, which modified TaskSetManager to maintain a separate list of pending tasks without locality preferences: 63bdb1f#diff-bad3987c83bd22d46416d3dd9d208e76R193. Since we now maintain separate lists to track pending tasks for executors, hosts, and racks, I don't think that we need this sorting. If you agree, do you mind closing this pull request? Thanks!
Currently, the last executor(s) may not get their preferred task(s), even though those tasks have been added to the pendingTasksForHosts map. Because executors pick up tasks sequentially, their preferred task(s) may already have been picked up by other executors.
This behavior can be eliminated by sorting tasks in a partial order. Each executor picks up tasks according to the position of its host in each task's preferredLocations; that is, an executor first picks up all tasks for which task.preferredLocations.1 = executor.hostName, then secondly…
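The ordering described above can be illustrated with a minimal, language-agnostic sketch. It is not the code from this patch and not Spark's TaskSetManager: the function name `sort_pending_for_host`, the dictionary layout, and the host names are hypothetical and chosen only for illustration. It assumes each task has an ordered list of preferred hosts and sorts one host's pending queue by the position of that host in each task's list.

```python
from typing import Dict, List


def sort_pending_for_host(
    host: str,
    pending: List[int],
    preferred_locations: Dict[int, List[str]],
) -> List[int]:
    """Order a host's pending task ids by how strongly each task prefers the host.

    Hypothetical helper: a task whose first preferred location is `host`
    sorts ahead of a task that lists `host` second, and so on. Tasks that
    do not list `host` at all sort last.
    """

    def rank(task_id: int) -> int:
        locs = preferred_locations[task_id]
        # Position of this host in the task's preference list,
        # or len(locs) when the host is not listed.
        return locs.index(host) if host in locs else len(locs)

    # sorted() is stable, so tasks with equal rank keep their queue order.
    return sorted(pending, key=rank)


# Illustrative data only (assumed, not taken from the pull request).
preferred = {
    0: ["hostB", "hostA"],
    1: ["hostA", "hostB"],
    2: ["hostA"],
    3: ["hostC", "hostB", "hostA"],
}
queue_for_host_a = sort_pending_for_host("hostA", [0, 1, 2, 3], preferred)
```

With this ordering, an executor on hostA would take tasks 1 and 2 (first preference) before task 0 (second preference), leaving task 0 available for an executor on hostB for as long as possible.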