[AMORO-2623] Avoid deadlock among TaskRuntime objects #2790
majin1102 merged 3 commits into apache:master
link3280 left a comment
Hi @majin1102, IIUC the main change is to check the process status and release tasks early, right after every completed task (in TaskRuntime), but I don't get how it avoids the deadlock. Please take a look at the inline comments.
    token = null;
    threadId = -1;
    });
    owner.releaseResourcesIfNecessary();
Please elaborate on how this avoids the deadlock. I suppose the deadlock would happen when two threads block on owner.acceptResult(this) before this line.
I think the two threads are blocked inside cancelTasks within owner.acceptResult(this), so wrapping cancelTask within releaseResourcesIfNecessary and placing it outside owner.acceptResult(this) should indeed solve the deadlock issue.
Sorry for the late reply.
As @rfyu said, moving the cancellation of all tasks outside the task and process locks prevents every lock conflict that involves canceling all tasks.
When two threads step into owner.acceptResult(), one will wait for the lock anyway, so I don't quite get your question.
I get the idea now. The cancellation is now lock-free. No task lock is needed, thus no deadlock.
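For readers following the thread, here is a minimal sketch of the pattern being discussed: a task reports its result while holding its own lock, and cancellation of the sibling tasks runs only after that lock is released, using a compare-and-set instead of any task lock. All class names (SketchProcess, SketchTask, SketchStatus, LockFreeCancelSketch) and the tryCancel method are hypothetical stand-ins, not Amoro's actual implementation; only acceptResult and releaseResourcesIfNecessary borrow their names from the diff above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified task states (assumption, not Amoro's real enum).
enum SketchStatus { PENDING, FAILED, CANCELED }

class SketchProcess {
    private final ReentrantLock processLock = new ReentrantLock();
    private final List<SketchTask> tasks = new CopyOnWriteArrayList<>();
    private volatile boolean failed = false;

    SketchTask newTask() {
        SketchTask task = new SketchTask(this);
        tasks.add(task);
        return task;
    }

    List<SketchTask> tasks() {
        return tasks;
    }

    // Runs while the reporting task holds its own lock: only records state.
    void acceptResult(SketchTask task) {
        processLock.lock();
        try {
            if (task.status() == SketchStatus.FAILED) {
                failed = true;
            }
        } finally {
            processLock.unlock();
        }
    }

    // Runs after the task lock is released and takes no task lock itself.
    void releaseResourcesIfNecessary() {
        if (failed) {
            for (SketchTask task : tasks) {
                task.tryCancel();
            }
        }
    }
}

class SketchTask {
    private final ReentrantLock taskLock = new ReentrantLock();
    private final AtomicReference<SketchStatus> status =
            new AtomicReference<>(SketchStatus.PENDING);
    private final SketchProcess owner;
    private String token = "token"; // stands in for the token/threadId reset in the diff

    SketchTask(SketchProcess owner) {
        this.owner = owner;
    }

    SketchStatus status() {
        return status.get();
    }

    void fail() {
        taskLock.lock();
        try {
            if (status.compareAndSet(SketchStatus.PENDING, SketchStatus.FAILED)) {
                owner.acceptResult(this);
                token = null;
            }
        } finally {
            taskLock.unlock();
        }
        // Outside the task lock, so no thread waits on another task while holding one.
        owner.releaseResourcesIfNecessary();
    }

    // Lock-free: a task that already finished is left untouched.
    void tryCancel() {
        status.compareAndSet(SketchStatus.PENDING, SketchStatus.CANCELED);
    }
}

class LockFreeCancelSketch {
    // Fails two of four tasks concurrently and returns all tasks afterwards.
    static List<SketchTask> run() {
        SketchProcess process = new SketchProcess();
        for (int i = 0; i < 4; i++) {
            process.newTask();
        }
        Thread a = new Thread(process.tasks().get(0)::fail);
        Thread b = new Thread(process.tasks().get(1)::fail);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return process.tasks();
    }

    public static void main(String[] args) {
        for (SketchTask task : run()) {
            System.out.println(task.status());
        }
    }
}
```

In this sketch the two failing threads never hold one task lock while touching another task, so they cannot block each other. The untouched tasks always end up CANCELED, while each of the two failing tasks ends up either FAILED or CANCELED depending on which thread gets there first.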
ping @majin1102
xxubai left a comment
What's the difference between cancelTasks, when two threads fail at the same time, and releaseResourcesIfNecessary?
If you mean that the operation of canceling all tasks could itself lead to a deadlock, I think the answer is no: concurrent threads would step into the task locks in the same order, which cannot lead to a deadlock.
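The lock-ordering argument in this reply can be illustrated with a small sketch, assuming a fixed list of per-task locks that every thread acquires in the same list order. OrderedLockSketch and withAllLocks are hypothetical names for illustration only and do not come from the Amoro code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class OrderedLockSketch {
    // Acquires every lock in list order, runs the action, then releases in reverse order.
    static void withAllLocks(List<ReentrantLock> locks, Runnable action) {
        int held = 0;
        try {
            for (ReentrantLock lock : locks) {
                lock.lock();
                held++;
            }
            action.run();
        } finally {
            for (int i = held - 1; i >= 0; i--) {
                locks.get(i).unlock();
            }
        }
    }

    // Two threads repeatedly take all locks in the same order; returns the shared counter.
    static int run(int rounds) {
        List<ReentrantLock> taskLocks = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            taskLocks.add(new ReentrantLock());
        }
        int[] counter = {0}; // only modified while all locks are held
        Runnable worker = () -> {
            for (int i = 0; i < rounds; i++) {
                withAllLocks(taskLocks, () -> counter[0]++);
            }
        };
        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(run(1000)); // prints 2000
    }
}
```

Because both threads request the locks in the same order, whichever thread wins the first lock can always obtain the rest, so no circular wait can form. A deadlock would require one thread to acquire the locks in a different order from the other.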
* fix AMORO-2623
* spotless apply

---------

Co-authored-by: Xavier Bai <xuba@apache.org>
(cherry picked from commit 1c5024c)
(cherry picked from commit a0ee3f24d73fdaecd5b648785a54bb0db4c8942d)
Why are the changes needed?
Close #2623.
Brief change log
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation