[FLINK-6174][HA]introduce a SMARTER leader latch to make JobManager less sensitive to disconnect to zookeeper #3599
Hi @WangTaoTheTonic, I think we could improve how ZookeeperLeaderElectionService reacts to an expired ZooKeeper connection or other errors, for example by adding a retry before revoking leadership, instead of introducing the AlwaysLeaderService. When the problem is caused by errors on the machine the JM is running on, we still need to trigger a failover so that the JM moves to another machine.
Thanks for your comments, @wenlong88. I also thought about adding retry logic for ZK failover, but this part should modify Even with the AlwaysLeaderService added, JM failover still works, since the RM will start a new instance. As for FLIP-6, I'll check the solution and see if anything there can help with this :)
Hi, I may have described my concern poorly in my last comment. My concern is that in YARN it is possible for two application masters to be running at the same time, e.g.: When two AMs can be running at the same time, we may end up in a deadlock with the AlwaysLeaderService, as follows:
-1 sorry. This needs to go to the drawing board (FLIP or detailed JIRA discussion) before we consider a change that is impacting the guarantees and failure mode so heavily. Some initial comments:
I would suggest to fix this the following way:
None of that is a "proper" HA setup; it only works as long as there is strictly only one master. Is that what you are looking for?
I don't think it's a good idea, as it cannot solve the "split brain" issue either. The key problem is that I did the same work in our own private Spark release; let me see if it can be reused.
@StephanEwen
@wenlong88 Feel free to review, thanks :)
Thanks for adding this!
@@ -70,6 +70,15 @@ under the License.
<include>org.apache.curator:*</include>
</includes>
</artifactSet>
<relocations>
Do we need this here? I think relocation happens in flink-runtime, when it puts curator into its shaded jar.
Closing this PR because of inactivity.
Currently, in YARN mode, if ZooKeeper is chosen for high availability, Flink creates a leader election service that elects a leader through ZooKeeper.
When the ZooKeeper leader crashes or the connection between the JobManager and the ZooKeeper instance is broken, the JobManager's leadership is revoked and it sends a Disconnect message to the TaskManagers, which cancel all running tasks and wait for the connection between JM and ZK to be re-established.
In YARN mode there is one and only one JobManager (AM) at any time, so it should always be the leader instead of being elected through ZooKeeper. We can introduce a new leader election service in YARN mode to achieve that.
Update:
Because of the "split brain" issue, we cannot simply make one JM the leader all the time. Instead, I introduce a smarter leader latch that caches the suspended state and waits up to a connection timeout duration for the connection to ZooKeeper to come back.
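To illustrate the idea, here is a minimal sketch of a latch that tolerates a short suspension. It is not the code in this PR: the class name `GracefulLeaderLatch`, its method names, and the explicit `nowMillis` parameter are all hypothetical, and it deliberately avoids the Curator API so that it stays self-contained. It assumes that a suspended connection should keep leadership for a grace period, while a lost session revokes it immediately.

```java
import java.util.OptionalLong;

/** Hypothetical sketch of a suspension-tolerant leader latch (not the PR's class). */
final class GracefulLeaderLatch {
    private final long graceMillis;          // assumed to equal the connection timeout
    private boolean leader;
    private OptionalLong suspendedAt = OptionalLong.empty();

    GracefulLeaderLatch(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** Called when the election grants leadership to this instance. */
    void grantLeadership() {
        leader = true;
        suspendedAt = OptionalLong.empty();
    }

    /** Connection suspended: cache the time, but keep leadership for now. */
    void onSuspended(long nowMillis) {
        if (!suspendedAt.isPresent()) {
            suspendedAt = OptionalLong.of(nowMillis);
        }
    }

    /** Connection is back: forget the cached suspension. */
    void onReconnected() {
        suspendedAt = OptionalLong.empty();
    }

    /** Session lost: revoke immediately, no grace period. */
    void onLost() {
        leader = false;
        suspendedAt = OptionalLong.empty();
    }

    /** Revokes leadership once a suspension has outlasted the grace period. */
    boolean hasLeadership(long nowMillis) {
        if (leader && suspendedAt.isPresent()
                && nowMillis - suspendedAt.getAsLong() >= graceMillis) {
            leader = false;
            suspendedAt = OptionalLong.empty();
        }
        return leader;
    }
}
```

With a grace period of 1000 ms, a suspension at time 0 followed by a reconnect at 500 leaves leadership untouched, whereas a suspension that is still pending at time 1000 revokes it. A real implementation would hook these callbacks into the ZooKeeper client's connection state events and use a timer instead of an explicit clock argument.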