[SPARK-23888][CORE] correct the comment of hasAttemptOnHost() #20998
Conversation
ping @pwendell @kayousterhout. Please help review, thanks :)
Jenkins, ok to test
Sounds fair, but shouldn't this be up to the scheduler backend? Can multiple tasks/attempts run simultaneously on the same physical host?
Hi @felixcheung, thanks for triggering a test and for your comments.
Actually, it is
I think multiple task attempts (actually, speculative tasks) can run on the same physical host, but not simultaneously, as long as there's no running attempt on it. In the PR description, I illustrate a case in which a speculative task chose to run on a host where a previous attempt had run and eventually failed. I think if the task's failure is not relevant to the host, running on the same host again can be acceptable.
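For reference, a minimal sketch of the behavior being proposed (a paraphrase, not the exact PR diff; it assumes the per-attempt TaskInfo exposes host and running, as in the test excerpts below):

// Sketch: treat a host as occupied only while an attempt of this task is
// still running there, so a host holding only failed/finished attempts
// becomes eligible for a new speculative copy.
private def hasAttemptOnHost(taskIndex: Int, host: String): Boolean = {
  taskAttempts(taskIndex).exists(info => info.host == host && info.running)
}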
Test build #89026 has finished for PR 20998 at commit
Adding isRunning means a single 'bad' node (from the task's point of view - not necessarily bad hardware: just a node the task keeps failing on) can cause the task to fail repeatedly, eventually causing the app to exit. Particularly with blacklisting, I am not very sure how the interactions will play out... @squito might have more comments. In the specific use case of only two machines, it is an unfortunate side effect.
Hi @mridulm, thanks for your comment. Actually, I share the same worry. Maybe we can make this change as a second choice for
This change certainly makes it agree with the comment, so I think we should either make this change, or change the comment.
Blacklisting should still work as expected. dequeueSpeculativeTask also checks the blacklist, so if host1 is blacklisted, you'll still skip it. But with blacklisting off, it's a more significant change. Even on a large cluster, I can imagine this happening most of the time when the non-speculative task fails, due to locality preferences.
Basically there is a behavior choice: Should a speculative task ever be allowed to run on a host where the task has failed previously? I think it should, as that is better handled by blacklisting.
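To make that interaction concrete, here is a rough sketch of how the speculative dequeue path combines the two checks (reconstructed from memory of TaskSetManager around this version; the exact names and structure are an assumption, not the code under review):

// Inside dequeueSpeculativeTask(execId, host, locality): a speculative copy
// of task `index` is only offered to (execId, host) if no attempt of that
// task is on the host AND the blacklist does not exclude it there.
def canRunOnHost(index: Int): Boolean = {
  !hasAttemptOnHost(index, host) &&
    !isTaskBlacklistedOnExecOrNode(index, execId, host)
}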
// there's already a running copy.
clock.advance(1000)
info1.finishTime = clock.getTimeMillis()
assert(info1.running === false)
assert(!info1.running)
// no more running copy of task0
assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL).get.index === 0)
val info3 = manager.taskAttempts(0)(0)
assert(info3.running === true)
assert(info3.running)
// after a long long time, task0.0 failed, and task0.0 can not re-run since
// there's already a running copy.
clock.advance(1000)
info1.finishTime = clock.getTimeMillis()
It would be better here for you to call manager.handleFailedTask, to more accurately simulate the real behavior; it also makes the purpose of the test a little clearer.
nice suggestion.
You shouldn't need to set info.finishTime anymore; that should be taken care of by manager.handleFailedTask.
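A hedged sketch of what that suggestion could look like in the test (the failure reason and surrounding assertions in the final test may differ):

// Drive the failure through the manager instead of mutating the TaskInfo
// directly; handleFailedTask is expected to set finishTime itself.
clock.advance(1000)
manager.handleFailedTask(info1.taskId, TaskState.FAILED, TaskResultLost)
assert(!info1.running)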
Hi @squito. Thanks for the review and comments.
Test build #89164 has finished for PR 20998 at commit
@mridulm more thoughts? I think this is the right change but I will leave it open for a bit to get more input.
test("speculative task should not run on a given host where another attempt " + | ||
"is already running on") { | ||
test("SPARK-23888: speculative task should not run on a given host " + | ||
"where another attempt is already running on") { |
I'd reword this to be a bit more specific to what you're trying to test:
speculative task cannot run on host with another running attempt, but can run on a host with a failed attempt.
Sure. Also, do we need to reword the PR and JIRA title? @squito
@squito My concern is that, in large workloads, some nodes simply become bad for some tasks (transient environment or hardware issues, colocated containers, etc.) while being fine for others; speculative tasks should alleviate performance concerns, not increase the chance of application failure due to locality preference affinity. For very small clusters, speculative execution is less relevant than for large ones - and here we are tuning for the former.
Test build #89228 has finished for PR 20998 at commit
I'm not even really concerned about the case of two hosts - I agree it's fine if we do something sub-optimal there. I'm more concerned about code clarity and the behavior in general. It seems cleaner to me if speculation doesn't worry about where it has failed before, and those exclusions are left to the blacklist. But it sounds like you're saying the prior behavior was really desirable - you think it's better if speculation always excludes hosts the task has ever failed on? I'm happy to defer to your opinion on this, as I haven't really stressed speculative execution yet. Then let's just change that comment in the code to be consistent.
@squito I completely agree that the comment is inaccurate.
@Ngone51 can you instead leave the behavior as is, and just update the comment? Sorry that it's going to be a small change in the end, after all the extra work the bad comments led you to do, but I still appreciate you noticing this and fixing it. A good PR with a quality test too.
Will do, and it's okay.
Test build #89460 has finished for PR 20998 at commit
LGTM
@@ -287,7 +287,7 @@ private[spark] class TaskSetManager(
       None
     }

-  /** Check whether a task is currently running an attempt on a given host */
+  /** Check whether a task once run an attempt on a given host */
Should this be "once ran"?
Yes. Thank you.
Test build #89728 has finished for PR 20998 at commit
Merged to master, thanks @Ngone51. I also updated the commit message a bit before committing; I thought it best to focus on the eventual change and figured it wasn't worth bugging you for another update cycle.
Agreed, and thank you @squito. And thanks to all of you: @felixcheung @mridulm @jiangxb1987 @srowen
What changes were proposed in this pull request?
There's a bug in hasAttemptOnHost():
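A sketch of the method in question, reconstructed from the comment diff discussed above (the body is assumed to match the existing implementation):

/** Check whether a task is currently running an attempt on a given host */
private def hasAttemptOnHost(taskIndex: Int, host: String): Boolean = {
  // The body matches any past attempt on the host, running or not, which
  // is what this PR points out.
  taskAttempts(taskIndex).exists(_.host == host)
}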
This causes hosts which only have finished attempts to be skipped for speculative tasks, so, to match the comment, we should check whether an attempt is currently running on the given host.
With the proposed check, it would also become possible for a speculative task to run on a host where another attempt had failed before.
Assume we have only two machines: host1 and host2. We first run task0.0 on host1. Then, after a long wait for task0.0, we launch a speculative task0.1 on host2. Eventually, task0.0 fails on host1, but it cannot be re-run since there's already a copy running on host2. After another long wait, we launch a new speculative task0.2, and now we can run task0.2 on host1 again, since there's no longer a running attempt on host1.
After discussion, we simply make the comment consistent with the method's behavior.
How was this patch tested?
A unit test was added.