Skip to content

GOBBLIN-1416: Fix a race condition caused by Helix task cancellation being invoked before Gobblin task creation#3250

Closed
sv2000 wants to merge 2 commits intoapache:masterfrom
sv2000:taskCancelRaceCondition
Closed

GOBBLIN-1416: Fix a race condition caused by Helix task cancellation being invoked before Gobblin task creation#3250
sv2000 wants to merge 2 commits intoapache:masterfrom
sv2000:taskCancelRaceCondition

Conversation

@sv2000
Copy link
Copy Markdown
Contributor

@sv2000 sv2000 commented Mar 19, 2021

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):
    Invocation of Gobblin Helix task cancellation before underlying Gobblin tasks have been created or submitted to the task executor results in GobblinHelixTask#cancel() returning successfully, but without preventing future submission of Gobblin tasks to the executor.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Added unit test.

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

…being invoked before Gobblin task creation
@sv2000
Copy link
Copy Markdown
Contributor Author

sv2000 commented Mar 19, 2021

@autumnust @ZihanLi58 Please review.

//Has the task-attempt already been cancelled? This can happen for instance when a cancellation has been invoked on
// the GobblinMultiTaskAttempt instance (e.g. in the case of Helix task cancellation) before the Gobblin tasks
// have been submitted to the underlying task executor.
if (this.stopped.get()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem to be enough for synchronization purpose. cancel could happen after this line and between line 407 and 408.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that shutdownTasks() is synchronized too. So if cancel has not been called yet (i.e. at line 402), the task creation is guaranteed to be completed and tasks will be submitted to the executor, before shutdownTasks() gets called.

Pair<List<Task>, Boolean> executionResult = runWorkUnits(countDownLatch);
this.tasks = executionResult.getFirst();

if (this.tasks.isEmpty() && this.stopped.get()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if this understanding is correct: whenever these two conditions hold at the same time, it indicates that the cancel happens before the task is materialized and executed in the executor. ( since the regular cancel will result in tasks being non empty). Can you add a comment if this is the case, just to distinguish between the regular cancel.

Copy link
Copy Markdown
Contributor

@autumnust autumnust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@sv2000 sv2000 closed this Mar 22, 2021
@sv2000 sv2000 reopened this Mar 22, 2021
@asfgit asfgit closed this in 24ebc02 Mar 22, 2021
jhsenjaliya pushed a commit to jhsenjaliya/incubator-gobblin that referenced this pull request Jun 7, 2021
… being invoked before Gobblin task creation

Closes apache#3250 from sv2000/taskCancelRaceCondition
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants