Entries must be acknowledged by bookies in multiple fault domains before being acknowledged to client #2096
…ore being acknowledged to client
The BookKeeper write logic waits for at least ackQuorumSize successful writes before sending an ack back to the client. In many cases these writes may all fall into the same fault domain. This change adds a mechanism that makes BookKeeper wait for acks from at least minNumRacksPerWriteQuorum fault domains, along with a configuration option to enforce it.
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
@reddycharan @jvrao Could you review this and add anyone else who should review it?
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
run integration tests
run bookkeeper-server bookie tests
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
eolivelli left a comment
I left just one last minor comment, then we are good to go
        writeDelayedStartTime = MathUtils.nowInNano();
    }
} else {
    completed = true;
So in summary you are discarding the acks that are not coming from the desired racks.
Do I understand correctly?
Not exactly.
What I'm doing here, if enforceMinNumFaultDomainsForWrite is set, is preventing the addEntry from completing until bookies from at least minNumRacksPerWriteQuorum racks (or writeQuorumSize racks, whichever is lower) have acknowledged it. If enforceMinNumFaultDomainsForWrite is not set, the existing logic is unchanged.
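For readers following along, here is a minimal sketch of the completion rule described in the reply above. It is not the code from this PR: the class FaultDomainAckGate, the method onAck, the bookie-to-rack map, and the "unknown-rack" placeholder are all hypothetical names chosen for illustration. Only ackQuorumSize, writeQuorumSize, minNumRacksPerWriteQuorum, and enforceMinNumFaultDomainsForWrite come from the discussion.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the ack rule; not BookKeeper's actual implementation. */
public class FaultDomainAckGate {
    private final int ackQuorumSize;
    private final int requiredRacks;
    private final boolean enforceMinNumFaultDomainsForWrite;
    private final Map<String, String> bookieToRack;
    private final Set<String> ackedBookies = new HashSet<>();

    public FaultDomainAckGate(int writeQuorumSize, int ackQuorumSize,
                              int minNumRacksPerWriteQuorum,
                              boolean enforceMinNumFaultDomainsForWrite,
                              Map<String, String> bookieToRack) {
        this.ackQuorumSize = ackQuorumSize;
        // "minNumRacksPerWriteQuorum (or writeQuorumSize, whichever is lower)"
        this.requiredRacks = Math.min(minNumRacksPerWriteQuorum, writeQuorumSize);
        this.enforceMinNumFaultDomainsForWrite = enforceMinNumFaultDomainsForWrite;
        this.bookieToRack = new HashMap<>(bookieToRack);
    }

    /** Records an ack and returns true once the add may be completed. */
    public boolean onAck(String bookie) {
        ackedBookies.add(bookie);
        if (ackedBookies.size() < ackQuorumSize) {
            return false; // existing rule: ack quorum not reached yet
        }
        if (!enforceMinNumFaultDomainsForWrite) {
            return true; // flag not set: existing logic only
        }
        // Count the distinct racks among the bookies that have acked so far.
        Set<String> racks = new HashSet<>();
        for (String b : ackedBookies) {
            // "unknown-rack" is a placeholder assumption for unmapped bookies.
            racks.add(bookieToRack.getOrDefault(b, "unknown-rack"));
        }
        return racks.size() >= requiredRacks;
    }

    public static void main(String[] args) {
        Map<String, String> racks = new HashMap<>();
        racks.put("b1", "/rack1");
        racks.put("b2", "/rack1");
        racks.put("b3", "/rack2");
        FaultDomainAckGate gate = new FaultDomainAckGate(3, 2, 2, true, racks);
        System.out.println(gate.onAck("b1")); // false: below ack quorum
        System.out.println(gate.onAck("b2")); // false: ack quorum met, but only one rack
        System.out.println(gate.onAck("b3")); // true: two racks have acked
    }
}
```

Note that acks from the same rack are not discarded in this sketch; they still count toward ackQuorumSize, and completion is simply delayed until enough distinct racks have responded.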
reddycharan left a comment
Left a couple of nit comments; please fix them. Otherwise LGTM!
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Addressed @reddycharan's review comments. @sijie @eolivelli @jvrao Could you look this over and merge it if it's okay?
LGTM
@reddycharan you can merge this patch as soon as we have green lights on CI
@ankit-j can you follow up on the failed Jenkins runs?
run pr validation
run integration tests
@eolivelli @reddycharan The integration test failure is unrelated to the changes made in this PR, and I see that the same test has failed for other PRs as well, none of which touch the relevant code sections. As mentioned in #2090 (comment), this test is failing consistently. @ivankelly would it be okay to merge this without waiting for that test to pass?
@ankit-j merging on a broken branch is how you end up with a broken branch all the time.
run integration tests
All checks have passed, so merging this PR.
Descriptions of the changes in this PR:
The BookKeeper write logic waits for at least ackQuorumSize
successful writes before sending an ack back to the client. In
many cases these writes may all fall into the same fault domain.
This change adds a mechanism that makes BookKeeper wait for acks
from at least minNumRacksPerWriteQuorum fault domains, along with
a configuration option to enforce it.
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Master Issue: #2095
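As a rough illustration of how the two settings named in this PR might appear in a client configuration, the sketch below parses them from properties-style text. The key names minNumRacksPerWriteQuorum and enforceMinNumFaultDomainsForWrite are taken from the discussion; the class FaultDomainConfigSketch, the value 2, and the assumption that the settings are read from a properties file under exactly these keys are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical config sketch; key names come from the PR, everything else is assumed. */
public class FaultDomainConfigSketch {
    // Example settings text; the values are illustrative, not recommendations.
    static final String EXAMPLE =
            "minNumRacksPerWriteQuorum=2\n"
          + "enforceMinNumFaultDomainsForWrite=true\n";

    public static Properties load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p;
    }

    /** Treats a missing flag as "not set", which leaves the existing write logic unchanged. */
    public static boolean enforcementEnabled(Properties p) {
        return Boolean.parseBoolean(p.getProperty("enforceMinNumFaultDomainsForWrite", "false"));
    }

    public static void main(String[] args) {
        Properties p = load(EXAMPLE);
        System.out.println("enforce = " + enforcementEnabled(p));
        System.out.println("minNumRacksPerWriteQuorum = " + p.getProperty("minNumRacksPerWriteQuorum"));
    }
}
```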