-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-1484] Adds explicit disconnect message for TaskManagers #368
Conversation
// disconnect the registered task managers | ||
instanceManager.getAllRegisteredInstances.asScala.foreach{ | ||
_.getTaskManager ! Disconnected("JobManager is stopping")} | ||
|
||
for((e,_) <- currentJobs.values){ | ||
e.fail(new Exception("The JobManager is shutting down.")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are cleaning up messages, maybe remove "The" so it is consistent with other messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Henry for spotting it. I corrected it.
+1 LGTM |
Looks good. Since this is a behavior change, can you file a ticket for this, Till? |
My bad, there is a ticket already. Can you squash the commits then and add the ticket tag? Otherwise, good to merge! |
Yes, I'll do it. On Mon, Feb 9, 2015 at 10:23 AM, Stephan Ewen notifications@github.com
|
1febbf4
to
3ff3a8b
Compare
…anagers that the JobManager has failed Fixes TestingTaskManager to detect disconnection of JobManager Fixes race condition when restarting actors in JobManagerFailsITCase Improves comment message Removes test listeners once the test message has been sent Fixes InvalidActorNameException in JobManagerFailsITCase when restarting the JobManager
3ff3a8b
to
b8e38cb
Compare
@tillrohrmann and @StephanEwen worked on some other reliablity issues. Will the changes in this PR be subsumed by the upcoming changes? If not, we should merge this. :-) |
This PR has been merged as part of PR #423 |
In case of a JobManager restart, which can be caused by an uncaught exception, all connected TaskManager are notified by a
Disconnected
message. This message triggers the cleanup of the TaskManagers and makes them try to reconnect to the JobManager.This PR also includes a test case to verify the afore-mentioned behaviour.