[Bug][Master] Master cannot fault tolerant when multiple Masters start at the same time #4840

Closed
zhanguohao opened this issue Feb 23, 2021 · 1 comment · Fixed by #4845

Comments

@zhanguohao
Contributor

**For better global communication, please describe it in English. If you feel the description in English is not clear, you can append a description in Chinese (just for Mandarin (CN)), thx!**
Describe the bug
If multiple Masters are started at the same time, the global fault tolerance logic goes wrong: several Masters are recognized as online at the same time, so fault tolerance is skipped.
(three images attached)

Expected behavior
The Master should register with ZooKeeper and start fault tolerance at the same time.
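
For illustration, here is a minimal sketch of the race as I understand it from the description above. It is not the actual DolphinScheduler code: the class and method names are hypothetical, the `Set` stands in for the ZooKeeper registry, and the rule "run startup fault tolerance only when exactly one Master is online" is an assumption inferred from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation; none of these names come from DolphinScheduler.
class StartupFailoverSketch {

    // Reported behavior: every Master registers first, and only afterwards
    // does each one check how many Masters are online.
    static List<String> registerThenCheck(List<String> masters) {
        Set<String> registry = new HashSet<>(); // stand-in for ZooKeeper
        List<String> ranFailover = new ArrayList<>();
        for (String master : masters) {
            registry.add(master);
        }
        for (String master : masters) {
            // Assumed rule: fault tolerance runs only if one Master is online.
            if (registry.size() == 1) {
                ranFailover.add(master);
            }
        }
        return ranFailover;
    }

    // Expected behavior: registration and the check happen as one step,
    // so the first Master to arrive still sees itself as the only one.
    static List<String> registerAndCheckTogether(List<String> masters) {
        Set<String> registry = new HashSet<>(); // stand-in for ZooKeeper
        List<String> ranFailover = new ArrayList<>();
        for (String master : masters) {
            registry.add(master);
            if (registry.size() == 1) {
                ranFailover.add(master);
            }
        }
        return ranFailover;
    }

    public static void main(String[] args) {
        List<String> masters = Arrays.asList("master-1", "master-2");
        System.out.println(registerThenCheck(masters));        // prints []
        System.out.println(registerAndCheckTogether(masters)); // prints [master-1]
    }
}
```

In a real cluster the combined step would presumably have to run under a distributed lock so that two Masters cannot interleave; the loop above only models that ordering.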

Which version of Dolphin Scheduler:
-[1.3.*]

@zhanguohao
Contributor Author

I'll fix it

zhanguohao added a commit to zhanguohao/incubator-dolphinscheduler that referenced this issue Feb 23, 2021
zhanguohao added a commit to zhanguohao/incubator-dolphinscheduler that referenced this issue Feb 23, 2021
@lenboo lenboo added this to the 1.3.6-release milestone Feb 24, 2021
@lenboo lenboo added this to Requirement(需求) in DolphinScheduler Work Plan via automation Feb 24, 2021
@lenboo lenboo moved this from Requirement(需求) to Doing(正在开发中) in DolphinScheduler Work Plan Feb 24, 2021
CalvinKirs pushed a commit that referenced this issue Feb 24, 2021
* [Fix-#4840][worker] fix master fault tolerance when startup

* [Fix-#4840][worker] move masterRegistry.unRegistry to zkMasterClient.close
@davidzollo davidzollo moved this from Doing to PREPARE-RELEASE-1.3.6 in DolphinScheduler Work Plan Mar 3, 2021
lenboo pushed a commit that referenced this issue Mar 9, 2021
…artup #4845 (#4916)

* [1.3.6-prepare][Fix-#4840][worker] fix master fault tolerance when startup #4845

* code style