[GOBBLIN-1999] Ignore concurrent check if the flow execution ID is the same as the c…#3874
Merged
Will-Lo merged 2 commits into apache:master on Feb 2, 2024
…urrently running flow execution ID to handle race condition of concurrent hosts misreporting status
Codecov Report
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3874      +/-   ##
============================================
+ Coverage     38.68%   46.73%    +8.04%
- Complexity     1595    11125     +9530
============================================
  Files           388     2210     +1822
  Lines         15953    87402    +71449
  Branches       1577     9611     +8034
============================================
+ Hits           6172    40845    +34673
- Misses         9289    42873    +33584
- Partials        492     3684     +3192
umustafi (Contributor) reviewed on Feb 2, 2024 and left a comment:
LGTM overall! small nit and test to debug
    }

    @Test
    public void skipFlowConccurentCheckSameFlowExecutionId() {
Contributor
extra c, one less r in concurrent
umustafi approved these changes on Feb 2, 2024
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
In GaaS multileader there is a race condition: one host compiles and kicks off a job, while the other hosts are expected to compile the job but avoid submitting it until the DagManager supports multiple leaders.
The non-leader hosts check whether the flow is already running and, seeing that the leader has already compiled it, mark the flow as failed. As a result, a flow whose jobs all succeed can be marked as failed before the underlying jobs even report their status, because the other hosts failed the concurrency check.
This PR resolves the issue by adding a check: if the flow waiting to be submitted has the same flow execution ID as the currently running flow (in the process of submission or already running), we move forward and rely on the DagManager to deduplicate the multiple submissions.
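The decision described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the class name `ConcurrencyCheckSketch`, the method `mayProceed`, and its parameters are hypothetical and chosen only for illustration, and the real check in Gobblin may be structured differently.

```java
import java.util.OptionalLong;

// Hypothetical sketch of the concurrency decision; names are NOT Gobblin's API.
final class ConcurrencyCheckSketch {

  /**
   * @param allowConcurrentExecution whether the flow permits overlapping executions
   * @param candidateFlowExecutionId ID of the execution waiting to be submitted
   * @param runningFlowExecutionId   ID of the execution currently being submitted
   *                                 or already running, empty if there is none
   * @return true if this host should move forward with the candidate execution
   */
  static boolean mayProceed(boolean allowConcurrentExecution,
                            long candidateFlowExecutionId,
                            OptionalLong runningFlowExecutionId) {
    if (allowConcurrentExecution) {
      return true; // concurrency is permitted, nothing to check
    }
    if (!runningFlowExecutionId.isPresent()) {
      return true; // no other execution is in progress
    }
    // Same ID: another host already compiled this very execution. Move forward
    // and leave deduplication of the repeated submission to the downstream
    // component (the DagManager in the PR description) instead of failing.
    return runningFlowExecutionId.getAsLong() == candidateFlowExecutionId;
  }
}
```

Under these assumptions, a non-leader host that sees its own flow execution ID reported as running gets `true` and no longer marks the flow as failed, while a genuinely different running execution still blocks the new one when concurrent executions are disallowed.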
Tests
Unit tests
Commits