LedgerHandle: do not complete metadata operation on the ZooKeeper/Metadata callback thread #3516
LGTM
overall LGTM. Client tests failed with
The job is re-running, but fwiw I don't remember BookieWriteLedgerTest being flaky.
Looks like the change broke BookieWriteLedgerTest.testLedgerCreateAdvWithLedgerIdInLoop
@dlg99 @shoothzj @StevenLuMT basically the test was writing to one ledger at a time because it was "blocking" the main ZooKeeper thread. Now createLedger completes in a BK thread (pinned to the ledgerId) and not on the ZK thread, so a blocking call that waits for writes on the same ledger leads to a deadlock. This is an interesting behaviour change, but I believe it is better that BK works this way; otherwise the application will execute code in the main ZK thread without knowing it. cc @merlimat @michaeljmarshall @lhotari @rdhabalia @RaulGracia @fpj
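The deadlock described above can be reproduced in miniature without BookKeeper. The sketch below is only a model: `PinnedThreadDeadlockDemo`, `ledgerThread` and `blockingWaitInCallbackCompletes` are hypothetical names, and a plain single-thread executor stands in for the BK thread pinned to a ledgerId. A callback running on that thread queues a "write" on the same thread and then blocks on it; a timeout is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PinnedThreadDeadlockDemo {

    // Returns true if the blocking wait inside the callback finished in time.
    public static boolean blockingWaitInCallbackCompletes() {
        // Hypothetical stand-in for the single thread that owns one ledger.
        ExecutorService ledgerThread = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the application's create/open callback.
            Future<Boolean> callback = ledgerThread.submit(() -> {
                // The "write" is queued behind this callback on the same thread.
                Future<?> write = ledgerThread.submit(() -> { });
                try {
                    // Blocking here keeps the only thread busy, so the write cannot start.
                    write.get(500, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    // Without the timeout this wait would never return.
                    return false;
                }
            });
            return callback.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            ledgerThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking wait completed: " + blockingWaitInCallbackCompletes());
    }
}
```

In this model the wait always times out, because the queued write can only run after the callback returns. Application code that needs the write result would either chain on it asynchronously or wait from a thread it owns.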
LGTM
Great work
…tadata callback thread (apache#3516) (cherry picked from commit 2135708)
…tadata callback thread (apache#3516) (cherry picked from commit 2135708) (cherry picked from commit ca03223)
Motivation
We are currently completing the creation and opening (also with recovery) of the LedgerHandle in the main ZooKeeper (or metadata driver) callback thread.
This leads to unpredictable code being executed on that thread.
A good example is code that tries to open a connection to a bookie because the application executes a "read" after an "openLedger" operation.
The full story is in the issue linked below, but this patch does not address the BookieAddressResolver blocking calls to the metadata storage. The scope of this patch is to prevent the application from running code on the metadata store main thread.
apache/pulsar#17913
It is unfortunate that CompletableFuture doesn't give you full control over the thread that will execute the completion tasks.
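A small, self-contained sketch of that CompletableFuture behaviour follows. It is not the code in this patch: `CompletionThreadDemo`, `callbackThread`, and the thread names `metadata-callback` and `client-worker` are hypothetical. It shows that a non-async dependent stage registered before completion runs on whichever thread calls `complete`, so the only way to keep application callbacks off the metadata thread is to call `complete` from another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionThreadDemo {

    // Returns the name of the thread that ran the application's callback.
    public static String callbackThread(boolean handOff) {
        // Hypothetical stand-in for the metadata (ZooKeeper) callback thread.
        ExecutorService metadataThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "metadata-callback"));
        // Hypothetical stand-in for a client-side worker thread.
        ExecutorService workerThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "client-worker"));
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            // Application callback, registered while the future is still incomplete.
            CompletableFuture<String> observed =
                    result.thenApply(v -> Thread.currentThread().getName());
            metadataThread.execute(() -> {
                if (handOff) {
                    // Hand the completion to another thread before touching the future.
                    workerThread.execute(() -> result.complete("ledger"));
                } else {
                    // Completing here runs the application callback on this thread.
                    result.complete("ledger");
                }
            });
            return observed.join();
        } finally {
            metadataThread.shutdown();
            workerThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callbackThread(false)); // prints "metadata-callback"
        System.out.println(callbackThread(true));  // prints "client-worker"
    }
}
```

Callers could use `thenApplyAsync` with their own executor, but the library cannot force them to, which is why the hand-off has to happen on the completing side.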
Changes