Skipping placementPolicyCheck when ledger replication disabled #3561
ping @zymap @dlg99 @eolivelli @hangc0276 @shoothzj PTAL. Thanks.
This seems to be in line with the rest of the Auditor's logic: it looks like none of the tasks scheduled by
scheduleBookieCheckTask();
scheduleCheckAllLedgersTask();
schedulePlacementPolicyCheckTask();
scheduleReplicasCheckTask();
will do anything while autorecovery is disabled.
This raises another question: we also have no visibility into what is going on in this case (independent of this change).
E.g., pausing AR for some time (a maintenance window, etc.) is fine, but is it OK to stop updating the counts of underreplicated ledgers?
@dlg99 Good catch. Thanks for your review and comments. :) Auditor: 1. Different ledger status detection. Maybe disabling AutoRecovery needs to be divided into 2 parts:
This should be a new feature; it needs separate discussion and PRs.
### Motivation

When `placementPolicyCheck` is enabled and `bookkeeper shell autorecovery -disable` is then executed, `placementPolicyCheck` still detects ledgers that do not satisfy the `placementPolicy` and writes them to ZooKeeper. However, because autorecovery is disabled, `ReplicationWorker` no longer fetches ledgers from ZooKeeper for replication work, so a large number of temporary nodes accumulate on ZooKeeper. The more ledgers fail to satisfy the `placementPolicy`, the worse the problem gets.

This is how `ReplicationWorker` obtains a ledger to replicate:

```java
private boolean rereplicate() throws InterruptedException, BKException,
        UnavailableException {
    long ledgerIdToReplicate = underreplicationManager
            .getLedgerToRereplicate();
    ...
```

So `placementPolicyCheck` should also be skipped while autorecovery is disabled.

(cherry picked from commit cfc6b97)
…e#3561) (cherry picked from commit cfc6b97) (cherry picked from commit e61a913)
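The guard proposed in the motivation can be illustrated with a small, dependency-free sketch. It is not the actual Auditor code: `PlacementPolicyCheckSketch`, `MarkerStore`, and `runPlacementPolicyCheck` are hypothetical names introduced here, and the "replication enabled" flag is passed in as a plain `BooleanSupplier` rather than read from ZooKeeper. The only point it tries to show is that the check returns early, and writes no underreplication markers, while replication is disabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch of the "skip when replication is disabled" guard.
 * None of these types are BookKeeper classes.
 */
final class PlacementPolicyCheckSketch {

    /** Stand-in for the store of underreplication markers (ZooKeeper in BookKeeper). */
    static final class MarkerStore {
        final List<Long> markedLedgers = new ArrayList<>();

        void markUnderreplicated(long ledgerId) {
            markedLedgers.add(ledgerId);
        }
    }

    /**
     * Runs one placement-policy check pass.
     * Returns false, and writes nothing, when replication is disabled.
     */
    static boolean runPlacementPolicyCheck(BooleanSupplier replicationEnabled,
                                           List<Long> ledgersViolatingPolicy,
                                           MarkerStore store) {
        if (!replicationEnabled.getAsBoolean()) {
            // No replication worker would consume the markers, so do not create them.
            System.out.println("Ledger replication disabled, skipping placementPolicyCheck");
            return false;
        }
        for (long ledgerId : ledgersViolatingPolicy) {
            store.markUnderreplicated(ledgerId);
        }
        return true;
    }

    public static void main(String[] args) {
        MarkerStore store = new MarkerStore();
        List<Long> violating = Arrays.asList(1L, 2L, 3L);

        // Replication disabled: the pass is skipped and no markers are written.
        runPlacementPolicyCheck(() -> false, violating, store);
        System.out.println("markers while disabled: " + store.markedLedgers.size());

        // Replication enabled: every violating ledger gets a marker.
        runPlacementPolicyCheck(() -> true, violating, store);
        System.out.println("markers while enabled: " + store.markedLedgers.size());
    }
}
```

Passing the flag as a supplier keeps the sketch testable without a ZooKeeper instance; in the real Auditor the state would presumably be read from the underreplication manager at the start of each scheduled run.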