Skip to content

Entries must be acknowledged by bookies in multiple fault domains before being acknowledged to client#2096

Merged
reddycharan merged 7 commits intoapache:masterfrom
ankit-j:ankit-j/enforceFragmentMultipleFaultDomainWrite
Jun 1, 2019
Merged

Entries must be acknowledged by bookies in multiple fault domains before being acknowledged to client#2096
reddycharan merged 7 commits intoapache:masterfrom
ankit-j:ankit-j/enforceFragmentMultipleFaultDomainWrite

Conversation

@ankit-j
Copy link
Contributor

@ankit-j ankit-j commented May 14, 2019

Descriptions of the changes in this PR:

Bookkeeper write logic makes sure that there are at least ackQuorumSize
number of successful writes before sending ack back to the client. In
many cases these may fall into the same fault domain. A mechanism to
force bookkeeper to make sure that there are acks from at least
minNumRacksPerWriteQuorum number of fault domains and a configuration
to enforce this.

Signed-off-by: Ankit Jain jain.a@salesforce.com

Master Issue: #2095

…ore being acknowledged to client

Bookkeeper write logic makes sure that there are at least ackQuorumSize
number of successful writes before sending ack back to the client. In
many cases these may fall into the same fault domain. A mechanism to
force bookkeeper to make sure that there are acks from at least
minNumRacksPerWriteQuorum number of fault domains and a configuration
to enforce this.

Signed-off-by: Ankit Jain <jain.a@salesforce.com>
@ankit-j ankit-j changed the title Entries must be acknowledged by bookies in multiple fault domains bef… Entries must be acknowledged by bookies in multiple fault domains before being acknowledged to client May 14, 2019
@ankit-j
Copy link
Contributor Author

ankit-j commented May 14, 2019

@reddycharan @jvrao Could you review this and add anyone else who should review this?

ankit-j added 2 commits May 15, 2019 17:39
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
ankit-j added 2 commits May 21, 2019 11:14
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Signed-off-by: Ankit Jain <jain.a@salesforce.com>
@ankit-j
Copy link
Contributor Author

ankit-j commented May 21, 2019

run integration tests

@ankit-j
Copy link
Contributor Author

ankit-j commented May 21, 2019

run bookkeeper-server bookie tests

Signed-off-by: Ankit Jain <jain.a@salesforce.com>
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left just one last minor comment, then we are good to go

writeDelayedStartTime = MathUtils.nowInNano();
}
} else {
completed = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in summary you are discarding the acks that are not coming from the desired racks.
Do I understand correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly.

What I'm doing here, if enforceMinNumFaultDomainsForWrite is set, is preventing the addEntry from being completing till bookies from minNumRacksPerWriteQuorum(or writeQuorumSize, whichever is lower) number of racks have acknowledged the addEntry. If enforceMinNumFaultDomainsForWrite is not set, no change to the existing logic.

Copy link
Contributor

@reddycharan reddycharan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left couple of nit comments, fix them. Otherwise LGTM!

Signed-off-by: Ankit Jain <jain.a@salesforce.com>
@ankit-j
Copy link
Contributor Author

ankit-j commented May 23, 2019

Addressed @reddycharan's review comments.

@sijie @eolivelli @jvrao Could you review this over and merge it if it's okay?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
@reddycharan you can merge this patch as soon as we have green lights on CI

@reddycharan
Copy link
Contributor

@ankit-j can you follow up on the failure jenkin runs.

@ankit-j
Copy link
Contributor Author

ankit-j commented May 24, 2019

run pr validation

@ankit-j
Copy link
Contributor Author

ankit-j commented May 24, 2019

run integration tests

@ankit-j
Copy link
Contributor Author

ankit-j commented May 24, 2019

@eolivelli @reddycharan The integration tests failure is unrelated to the changes made in this PR, and I see that that same test has failed for other PRs as well, none of which touch the relevant code sections.

As mentioned in #2090 (comment), this test is failing consistently. @ivankelly would it be okay to skip waiting on that test to pass to merge this?

@ivankelly
Copy link
Contributor

@ankit-j merging on a broken branch is how you end up with a broken branch all the time.
@reddycharan @karanmehta93 since the break originated on a change you merged/pushed, could you work together to fix the broken tests?

@ankit-j
Copy link
Contributor Author

ankit-j commented May 31, 2019

run integration tests

@reddycharan
Copy link
Contributor

All checks have passed. So merging this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants