-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exclude defaultrack bookies when enforceMinNumRacksPerWriteQuorum is enabled #1941
Exclude defaultrack bookies when enforceMinNumRacksPerWriteQuorum is enabled #1941
Conversation
…enabled - enforceMinNumRacksPerWriteQuorum is meant to be used for strict placement policy. So when it is enabled, bookies which belong to default faultzone/rack (because of failure in resolving network location) should be excluded from bookie selection. - add gauge for number of bookies in default faultzone/rack. It will be helpful to create alerts based on this gauge. - add gauge for number of ledgers found not adhering to strict placement policy in Auditor's placement policy check. This gauge will be more helpful in creating alert instead of using monotonously increasing alert.
@pasha-kuznetsov fyi. |
@@ -93,6 +93,9 @@ | |||
String FAILED_CONNECTION_COUNTER = "FAILED_CONNECTION_COUNTER"; | |||
String FAILED_TLS_HANDSHAKE_COUNTER = "FAILED_TLS_HANDSHAKE_COUNTER"; | |||
|
|||
// placementpolicy stats | |||
String NUM_BOOKIES_IN_DEFAULT_FAULTZONE = "NUM_BOOKIES_IN_DEFAULT_FAULTZONE"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reddycharan @jvrao Shouldn't we have similar stats for each fault zone, including number of writable bookies there? Then we could alert on a fault zone not having enough (writable) bookies in a fault zone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
responded already - #1941 (comment)
} | ||
} | ||
}; | ||
this.statsLogger.registerGauge(NUM_BOOKIES_IN_DEFAULT_FAULTZONE, numBookiesInDefaultFaultZone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reddycharan @jvrao registering the gauge before producing the actual numbers will result in issues similar to #1884 BUG: dir_*_usage stats are reported as 0
for read-only bookies after a restart. While the cause is different here, the result will be the same — 0
will always be reported initially after a restart, resetting the alerts (and then reissuing them because it will quickly encounter the same nodes in the default rack)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pasha-kuznetsov reporting '0' in this case is valid. It means that so far we haven't come across bookie which would fall in default rack. This is appropriate understanding.
placementPolicyCheckEnsembleNotAdheringToPlacementPolicy.get().longValue()); | ||
Gauge<? extends Number> ledgersNotAdheringToPlacementPolicyGuage = statsLogger | ||
.getGauge(ReplicationStats.NUM_LEDGERS_NOT_ADHERING_TO_PLACEMENT_POLICY); | ||
assertEquals("NUM_LEDGERS_NOT_ADHERING_TO_PLACEMENT_POLICY guage value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: gauge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reddycharan overall looks good to me. I have an additional comment - can we have a stats about num bookies available per rack?
e.g.
/placementpolicy/rack1/num_bookies: 3
/placementpolicy/rack2/num_bookies: 3
...
name = NUM_BOOKIES_IN_DEFAULT_FAULTZONE, | ||
help = "Gauge for the number of Bookies in default Fault Zone" | ||
) | ||
protected Gauge<Integer> numBookiesInDefaultFaultZone; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we rarely use "fault zone" in bookkeeper's concept. I would prefer using a consistent name in the same file, e.g. calling it default rack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will rename it to defaultrack
@pasha-kuznetsov @sijie i'm little surprised both of you suggested for metrics/stats for num of bookies (writable) available per rack. I have multiple concerns with this
|
I am thinking of
different metrics/altering system might have different behaviors. some of the alerting systems are allowed '' queries. so you can write alert rules on using
I agree with this.
agree
in current stats library, you can use |
…enabled - testcases
rerun bookkeeper-server client tests |
@sijie @pasha-kuznetsov @jvrao added testcases. |
@sijie @pasha-kuznetsov @jvrao @eolivelli waiting for final review. |
if (node instanceof BookieNode) { | ||
BookieNode bookieNode = (BookieNode) node; | ||
comprehensiveExclusionBookiesSet.add(bookieNode.getAddr()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we should handle else
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I added this if block just to be on safe side. Leaves of "getDefaultRack()" should be BookieNode. In no normal circumstances it should be non-BookieNode (InnerNode). Anyhow do you have any strong suggestion for what needs to be in else block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I was only pointing here, maybe something that I missed was missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reddycharan if we don't expect anything otherthan bookie node, we should just remove the if condition. Or add an else with exception. It really doesn't make sense to leave it like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add error logging
if (node instanceof BookieNode) { | ||
BookieNode bookieNode = (BookieNode) node; | ||
comprehensiveExclusionBookiesSet.add(bookieNode.getAddr()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we should handle else
branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@sijie it should be good to go. |
run bookkeeper-server client tests |
@@ -251,6 +251,9 @@ EnsemblePlacementPolicy initialize(ClientConfiguration conf, | |||
* <p>{@code customMetadata} is the same user defined data that user provides | |||
* when {@link BookKeeper#createLedger(int, int, int, BookKeeper.DigestType, byte[], Map)}. | |||
* | |||
* <p>If 'enforceMinNumRacksPerWriteQuorum' config is enabled then the bookies belonging to default | |||
* faultzone (rack) will be excluded while selecting bookies. This is to enable strict placementpolicy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the "This is to enable strict placementpolicy." is needed. Rest of the wording gives enough context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove
Set<BookieSocketAddress> excludeBookies) { | ||
Set<BookieSocketAddress> comprehensiveExclusionBookiesSet; | ||
if (enforceMinNumRacksPerWriteQuorum) { | ||
comprehensiveExclusionBookiesSet = new HashSet<BookieSocketAddress>(excludeBookies); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you have put this inside the if condition to avoid unnecessary object allocation. To serve that purpose, may be you should check if the defaultRack is empty or not before allocating new object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
topology.getLeaves(getDefaultRack()).forEach((Node node) -> { | ||
if (node instanceof BookieNode) { | ||
BookieNode bookieNode = (BookieNode) node; | ||
comprehensiveExclusionBookiesSet.add(bookieNode.getAddr()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to have INFO message here with the list of all bookies that are in the default rack and are being excluded. I understand it could amplify logging but I am OK to take it when the configuration is asking for strict placement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
parentPredicate); | ||
rwLock.readLock().lock(); | ||
try { | ||
Set<BookieSocketAddress> comprehensiveExclusionBookiesSet = addDefaultZoneBookiesIfMinNumRacksIsEnforced( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow I am not a big fan of the name. It is too specific to what it is doing. Tomorrow we might add more criteria to exclusion of bookies (Say based on load etc); i am OK with it as-is but why not the name to me "compileComprehensiveExclusionSet" So it is generic; Anyway.. just a suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont see an issue. will keep it as it is
*/ | ||
LOG.warn("Received exception while trying to get network location of bookie: {}", bookie, e); | ||
} | ||
} | ||
if (racksOrRegionsInQuorum.size() < minNumRacksPerWriteQuorumForThisEnsemble) { | ||
if ((racksInQuorum.size() < minNumRacksPerWriteQuorumForThisEnsemble) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need to check for enforceMinNumRacksPerWriteQuorum for this if condition too? If this is implicit why not the defaultRack is not implicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't get you, here i'm making sure that number of racks in quorum is not less than minNumRacksPerWriteQuorumForThisEnsemble and if enforceMinNumRacksPerWriteQuorum is enabled then no bookie is from defaultRack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If enforceMinNumRacksPerWriteQuorum is not enabled, (racksInQuorum.size() < minNumRacksPerWriteQuorumForThisEnsemble) is not an issue right? So I am thinking the check should be enforceMinNumRacksPerWriteQuorum && ((racksInQuorum.size() < minNumRacksPerWriteQuorumForThisEnsemble) || (enforceMinNumRacksPerWriteQuorum && racksInQuorum.contains(getDefaultRack()))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jvrao If enforceMinNumRacksPerWriteQuorum is not enabled, (racksInQuorum.size() < minNumRacksPerWriteQuorumForThisEnsemble is an issue . for writequorum to adhere to placement policy it should have bookies from atleast minNumRacksPerWriteQuorumForThisEnsemble. if enforceMinNumRacksPerWriteQuorum is enabled then on top of the existing condition, no bookie should be from default rack.
LOG.error( | ||
"BKAuditException running periodic placementPolicy check." | ||
+ "numOfLedgersNotAdheringToPlacementPolicy {}", | ||
numOfLedgersFoundInPlacementPolicyCheckValue, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@reddycharan Why are you checking for non zero value of numOfLedgersFoundInPlacementPolicyCheckValue to set the stat/guage but not for error message? Do you expect to have an exception but a zero numOfLedgersFoundInPlacementPolicyCheckValue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you expect to have an exception but a zero numOfLedgersFoundInPlacementPolicyCheckValue?
yes in the case of exception if we have not found faulty ledgers so far, we don't want to update the gauge value with zero. it would be misleading to report 0. so to avoid false negative i added this if condition in exception case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see; what kind of exception do you expect even when numOfLedgersFoundInPlacementPolicyCheckValue is zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for placementPolicyCheck, it has to iterate over all the ledgers in the metadataserver and check the placement policy. So half way though the process there could be an exception and no faulty ledger is found so far. So in this case we want to log error but dont want to report gauge with zero value.
…enabled - fixing review comments
@jvrao addressed your comments. |
rebuild java11 |
run bookkeeper-server replication tests |
run bookkeeper-server remaining tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@sijie @jvrao @eolivelli since it is been approved and have successful checks, I'm going to merge this. |
Descriptions of the changes in this PR:
it is enabled, bookies which belong to default faultzone/rack (because of failure in resolving
network location) should be excluded from bookie selection.
based on this gauge.
placement policy check. This gauge will be more helpful in creating alert instead of using
monotonously increasing alert.