[FLINK-5501] JM use running job registry to determine whether is the first running #3385
PR looks like a good start, but I think we need to add a few things on top:
I would like to merge this and make a few edits on top...
One issue I think can happen in practice is that the checks "isRunning" and "isFinished" are not atomic. Imagine this scenario:
With the problem observed above, I think we should change the approach a bit:
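The details of the proposed change are not reproduced here. As a rough illustration only, the sketch below shows why a single status read sidesteps the race between two separate boolean checks: the caller gets one consistent answer instead of combining `isRunning` and `isFinished` results that may straddle a state change. The class `RegistrySketch`, the `Status` enum, and its value names are hypothetical and in-memory; they are not the registry code from this PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a running-job registry.
final class RegistrySketch {

    // Assumed status names, chosen for illustration only.
    enum Status { PENDING, RUNNING, DONE }

    private final ConcurrentMap<String, Status> jobs = new ConcurrentHashMap<>();

    // One read returns the whole status, so no state change can slip in
    // between an "is it running?" check and an "is it finished?" check.
    Status getStatus(String jobId) {
        return jobs.getOrDefault(jobId, Status.PENDING);
    }

    void setRunning(String jobId) {
        jobs.put(jobId, Status.RUNNING);
    }

    void setDone(String jobId) {
        jobs.put(jobId, Status.DONE);
    }
}
```

A caller would branch once on the value returned by `getStatus` rather than issuing two independent queries.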
String zkPath = runningJobPath + jobID.toString();
this.client.newNamespaceAwareEnsurePath(zkPath).ensure(client.getZookeeperClient());
this.client.setData().forPath(zkPath);
this.client.setData().forPath(zkPath, RUNNING.getBytes());
String to bytes conversion (and bytes to string) must always explicitly specify the encoding (Charset). Otherwise, there can be mismatches when different machines configure different default Charsets.
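A minimal sketch of what this comment asks for, assuming UTF-8 is an acceptable encoding for the status marker. The helper class `StatusCodec` and its method names are hypothetical and only illustrate the explicit-charset overloads of `String.getBytes` and the `String` constructor.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing explicit-charset conversion in both directions.
final class StatusCodec {

    // Encode with a fixed charset so every machine produces the same bytes,
    // regardless of its platform default.
    static byte[] encode(String status) {
        return status.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for encoding.
    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

With a helper like this, the flagged line would pass `StatusCodec.encode(RUNNING)` instead of `RUNNING.getBytes()`, and any reader of the node would use `StatusCodec.decode` on the returned bytes.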
Hi @StephanEwen, thanks for your review. I modified the PR according to your comments: I added getJobSchedulingStatus and added tests.
Thanks!
…d/running/done This closes apache#3385
One test case seemed to be failing in this PR:
@StephanEwen, thank you very much. Sorry for breaking the test; I will be more careful next time.
This PR is for JIRA issue FLINK-5501.
The main changes are: