Skip to content

[GOBBLIN-1999] Ignore concurrent check if the flow execution ID is the same as the c…#3874

Merged
Will-Lo merged 2 commits intoapache:masterfrom
Will-Lo:fix-flow-status-reporting-concurrent-gaas
Feb 2, 2024
Merged

[GOBBLIN-1999] Ignore concurrent check if the flow execution ID is the same as the c…#3874
Will-Lo merged 2 commits intoapache:masterfrom
Will-Lo:fix-flow-status-reporting-concurrent-gaas

Conversation

@Will-Lo
Copy link
Copy Markdown
Contributor

@Will-Lo Will-Lo commented Feb 2, 2024

…urrently running flow execution ID to handle race condition of concurrent hosts misreporting status

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

In GaaS multileader there is a race condition where one host will compile/kick off a job, and the other hosts have an expected behavior of compiling the job but avoiding submission until the DagManager supports multiple leaders.

The other non-leader hosts will try to check if the current job is running, and then mark the flow as failed due to seeing that the leader has already compiled the job. This leads to scenarios where flows with all successful jobs can be marked as a failed flow before the underlying jobs even report their status, due to the other hosts failing due to the concurrency check.

This PR resolves this issue by making an additional check that if the flow waiting to be submitted is the same flow execution ID as the currently running flow (in progress of submission/already running), we move forward and rely on the DagManager to perform deduplication of multiple submissions.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Unit tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

…urrently running flow execution ID to handle race condition of concurrent hosts misreporting status
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Feb 2, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (f6f1bc5) 38.68% compared to head (bdbc7ae) 46.73%.

Files Patch % Lines
...modules/utils/FlowCompilationValidationHelper.java 25.00% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3874      +/-   ##
============================================
+ Coverage     38.68%   46.73%   +8.04%     
- Complexity     1595    11125    +9530     
============================================
  Files           388     2210    +1822     
  Lines         15953    87402   +71449     
  Branches       1577     9611    +8034     
============================================
+ Hits           6172    40845   +34673     
- Misses         9289    42873   +33584     
- Partials        492     3684    +3192     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

@umustafi umustafi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall! small nit and test to debug

}

@Test
public void skipFlowConccurentCheckSameFlowExecutionId() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra c, one less r in concurrent

@Will-Lo Will-Lo merged commit 3e57562 into apache:master Feb 2, 2024
@Will-Lo Will-Lo deleted the fix-flow-status-reporting-concurrent-gaas branch February 2, 2024 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants