[fix][transaction] Remove single thread pool and fix the concurrent problem to avoid performance issues#14940
Closed
mattisonchao wants to merge 12 commits into apache:master from mattisonchao:fix_handle_tc_connect
Conversation
eolivelli
requested changes
Mar 30, 2022
Contributor
eolivelli
left a comment
We are moving the execution out of the pinned executor.
I am not sure this is a good move.
Can you please clarify?
pulsar-broker/src/main/java/org/apache/pulsar/broker/TransactionMetadataStoreService.java
Member
Author
@eolivelli I updated this PR, could you please review it again?
Member
Author
|
I chose to use |
Motivation
According to #13969, the single-threaded pool was added because using the deque in a concurrent environment causes problems when requests are added and cleared at the same time.
We can make some changes and remove the single-threaded pool to improve performance.
For example, suppose we currently have threads A, B, C and D.
Background
Threads A and B request handleTcClientConnect concurrently. A acquires the semaphore from openTransactionMetadataStore; B fails to acquire the semaphore and its future is added to the deque.
Success openTransactionMetadataStore condition
When A's call to handleTcClientConnect succeeds, it first releases the semaphore and then completes all futures in the deque. If we released the semaphore only after completing the deque futures, thread B could reconnect and be added to the deque before thread A releases the semaphore, and thread B's future would never be completed.
Exceptionally openTransactionMetadataStore condition
When A's call to handleTcClientConnect fails, it releases the semaphore first and only then takes the futures that need to be completed exceptionally (the number taken acts as a threshold). This avoids an infinite loop caused by client reconnection: if we released the semaphore only after all deque futures had been completed exceptionally, the client would immediately reconnect and be added to the deque again (because thread A has not yet released the semaphore), which may cause an infinite loop. (It only resembles one, because we have an end time that will break the loop.)
Other problems with the current implementation
When the futures that failed to acquire the semaphore exceed the maximum timeout, we currently clear the deque directly, so those CompletableFutures may never complete.
When TransactionMetadataStoreService#handleTcClientConnect invokes the method as below, we omit the exception.
Avoid the frequent context switching that comes with using thread pools.
When checking the TC status for the second time, we only complete the future and do not return afterwards (the code is below).
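The ordering described in this section (release the semaphore first, then complete a bounded number of queued futures, and fail queued futures on timeout instead of clearing the deque) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the class TcConnectSketch and the methods tryBecomeOpener, finishOpen and failAllPending are hypothetical names, and using a snapshot of the deque size as the threshold is an assumption. It is meant to be driven from one thread to show the ordering only; it does not try to close the window between a failed tryAcquire and the enqueue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the connect path; not the actual Pulsar code.
class TcConnectSketch {
    private final Semaphore openSemaphore = new Semaphore(1);
    private final ConcurrentLinkedDeque<CompletableFuture<Void>> pending =
            new ConcurrentLinkedDeque<>();

    // Returns true when the caller acquired the semaphore and should open the
    // store itself; otherwise the caller's future is parked in the deque.
    boolean tryBecomeOpener(CompletableFuture<Void> callerFuture) {
        if (openSemaphore.tryAcquire()) {
            return true;
        }
        pending.add(callerFuture);
        return false;
    }

    // Called by the opener when its attempt ends; error == null means success.
    void finishOpen(CompletableFuture<Void> openerFuture, Throwable error) {
        // Release first, so a waiter that reconnects from its completion
        // callback can acquire the semaphore instead of being parked again.
        openSemaphore.release();
        // Snapshot the waiter count (assumed threshold): only futures queued
        // so far are completed, which bounds the loop under reconnects.
        int threshold = pending.size();
        for (int i = 0; i < threshold; i++) {
            CompletableFuture<Void> waiter = pending.poll();
            if (waiter == null) {
                break;
            }
            complete(waiter, error);
        }
        complete(openerFuture, error);
    }

    // Fails every parked future instead of calling pending.clear(), so no
    // CompletableFuture is left incomplete. Returns how many were failed.
    int failAllPending(Throwable cause) {
        int failed = 0;
        CompletableFuture<Void> waiter;
        while ((waiter = pending.poll()) != null) {
            waiter.completeExceptionally(cause);
            failed++;
        }
        return failed;
    }

    private static void complete(CompletableFuture<Void> f, Throwable error) {
        if (error == null) {
            f.complete(null);
        } else {
            f.completeExceptionally(error);
        }
    }

    public static void main(String[] args) {
        TcConnectSketch sketch = new TcConnectSketch();
        CompletableFuture<Void> a = new CompletableFuture<>();
        CompletableFuture<Void> b = new CompletableFuture<>();
        System.out.println("A is opener: " + sketch.tryBecomeOpener(a));
        System.out.println("B is opener: " + sketch.tryBecomeOpener(b));
        sketch.finishOpen(a, null);
        System.out.println("B completed: " + b.isDone());
        System.out.println("failed on timeout: "
                + sketch.failAllPending(new TimeoutException("sketch")));
    }
}
```

In the sketch, a waiter whose completion callback reconnects immediately finds the semaphore free, because finishOpen has already released it by the time the waiter's future completes. With the opposite order, the same reconnect would be parked in the deque again while the opener still holds the semaphore.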
Modifications
Verifying this change
Documentation
no-need-doc