
Feature: auto recover support repaired not adhering placement ledger #3359

Merged

Conversation

horizonzy
Member

@horizonzy horizonzy commented Jun 25, 2022

Descriptions of the changes in this PR:
Here is a user case:

  1. They have two zones and a rack-aware policy that ensures writes are spread across both zones
  2. They had some data on a topic with long retention
  3. They ran a disaster recovery (DR) test, during which they shut down one zone
  4. During the DR test, auto-recovery ran. Because only one zone was active, and because auto-recovery defaults to best-effort rack awareness, it recovered up to the expected number of replicas
  5. They stopped the DR test and all was well, but now that ledger was only in one zone
  6. They ran another DR test, this time effectively moving over to the other zone, but now data is missing because it all resides in only one zone

We should support a feature to cover this case.

@horizonzy horizonzy changed the title Feature: auto recover support repaired not adhering placement ledger [WIP] Feature: auto recover support repaired not adhering placement ledger Jun 25, 2022
@horizonzy horizonzy changed the title [WIP] Feature: auto recover support repaired not adhering placement ledger Feature: auto recover support repaired not adhering placement ledger Jun 27, 2022
@horizonzy horizonzy force-pushed the feature-auto-recover-match-placement branch from 64bc687 to 12582a7 on June 27, 2022 07:57
@horizonzy
Member Author

For this case, we already support detecting ledgers whose ensembles do not adhere to the placement policy.
In the Auditor, if the user configures auditorPeriodicPlacementPolicyCheckInterval, it starts a scheduled task that triggers placementPolicyCheck. In placementPolicyCheck, it records the count of ledger fragments that do not adhere to the placement policy.

void placementPolicyCheck() throws BKAuditException {

But it only records this in a stat; it does not recover data to make the ensemble adhere to the placement policy.

So we can add a config option, repairedPlacementPolicyNotAdheringBookieEnabled, to control whether to repair the data so that the ensemble adheres to the placement policy.

In the Auditor
It will mark the ledgerId as under-replicated (handing it to the under-replication manager) if the ensemble does not adhere to the placement policy.

In the ReplicationWorker
It will move data from the old bookie to a new bookie in a different network location so that the ensemble adheres to the placement policy. If there is no bookie with a different network location, it does nothing.

Attention
In the ReplicationWorker, we just poll an under-replicated ledger and then process it. So when we get an under-replicated ledger, we should check two cases: 1) whether the ledger fragments have lost data; 2) whether the ledger fragments do not adhere to the placement policy. A fragment may meet both cases at the same time. If so, we ignore case 2 and just repair the data loss. If the repaired result still does not adhere to the placement policy, the Auditor will mark it again.
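To make that decision order concrete, here is a minimal Java sketch. The helper names (hasDataLossFragments, repairDataLoss, adheresToPlacementPolicy, repairPlacementPolicy) are hypothetical stand-ins for illustration, not the methods introduced by this PR.

// Sketch only: the decision order a ReplicationWorker could follow for an under-replicated ledger.
class UnderReplicationDecisionSketch {
    private final boolean repairedPlacementPolicyNotAdheringBookieEnabled;

    UnderReplicationDecisionSketch(boolean repairEnabled) {
        this.repairedPlacementPolicyNotAdheringBookieEnabled = repairEnabled;
    }

    void processUnderReplicatedLedger(long ledgerId) {
        if (hasDataLossFragments(ledgerId)) {
            // Case 1 takes priority: re-replicate lost fragments first. If the repaired
            // ensemble still violates the placement policy, the Auditor will mark the
            // ledger under-replicated again on its next placementPolicyCheck run.
            repairDataLoss(ledgerId);
            return;
        }
        if (repairedPlacementPolicyNotAdheringBookieEnabled && !adheresToPlacementPolicy(ledgerId)) {
            // Case 2: move fragments to bookies in a different network location.
            repairPlacementPolicy(ledgerId);
        }
    }

    // Stubs standing in for the real fragment checks and repair paths.
    private boolean hasDataLossFragments(long ledgerId) { return false; }
    private boolean adheresToPlacementPolicy(long ledgerId) { return true; }
    private void repairDataLoss(long ledgerId) { }
    private void repairPlacementPolicy(long ledgerId) { }
}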

@horizonzy
Member Author

How to use it?

If we want to repair ledgers whose ensembles do not adhere to the placement policy, we should configure two parameters.

auditorPeriodicPlacementPolicyCheckInterval=3600
repairedPlacementPolicyNotAdheringBookieEnabled=true

In the Auditor
auditorPeriodicPlacementPolicyCheckInterval controls the interval of the placement policy check; repairedPlacementPolicyNotAdheringBookieEnabled controls whether to mark the ledgerId as under-replicated when a ledger ensemble is found that does not adhere to the placement policy.

In ReplicationWorker
repairedPlacementPolicyNotAdheringBookieEnabled control is to repaired the ledger which ensemble not adhere placement policy.

Attention

  1. We need to ensure repairedPlacementPolicyNotAdheringBookieEnabled=true is configured in both the Auditor and the ReplicationWorker at the same time.

  2. We also need the placement policy to be the same between the Auditor and the ReplicationWorker, because both use the placement policy during processing (see the configuration sketch below).
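For illustration, a minimal Java sketch of wiring both settings into one ServerConfiguration. The two property names come from this comment; the ensemblePlacementPolicy key, the policy class shown, and the wrapper class name are assumptions for the example.

import org.apache.bookkeeper.conf.ServerConfiguration;

public class AutoRecoveryPlacementRepairConfig {
    public static ServerConfiguration build() {
        ServerConfiguration conf = new ServerConfiguration();
        // Interval of the Auditor's placement policy check (3600, as in the example above).
        conf.setProperty("auditorPeriodicPlacementPolicyCheckInterval", 3600);
        // Mark and repair ledgers whose ensembles do not adhere to the placement policy.
        conf.setProperty("repairedPlacementPolicyNotAdheringBookieEnabled", true);
        // Keep the Auditor and the ReplicationWorker on the same placement policy,
        // e.g. the rack-aware policy (key and class shown as an assumption here;
        // check your deployment's configuration reference).
        conf.setProperty("ensemblePlacementPolicy",
                "org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy");
        return conf;
    }
}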

@hangc0276
Contributor

ping @merlimat @eolivelli @dlg99 @zymap @reddycharan Please help take a look at this PR, thanks.

@@ -402,6 +402,7 @@ public void registerLedgerMetadataListener(long ledgerId, LedgerMetadataListener
}
}
synchronized (listenerSet) {
listenerSet = listeners.computeIfAbsent(ledgerId, k -> new HashSet<>());
Contributor

Did you bring the previous change into this PR?

Member Author

Yes, I will remove it in this PR.

Map<String, List<BookieNode>> toPlaceGroup = new HashMap<>();
for (BookieId bookieId : ensemble) {
// If the bookie has shut down, put it in the inactive group.
BookieNode bookieNode = clone.get(bookieId);
Contributor

If the bookie shuts down, it will be removed from knownBookies immediately. That belongs to the DATA_LOSS type.

Member Author

@horizonzy horizonzy Jul 6, 2022

In theory, if the fragment is DATA_LOSS, it won't invoke this method; it will repair the data loss first.
But the bookie may shut down after the ledger check, so here we replace the shutdown bookie first.
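A rough, self-contained illustration of that ordering, using simplified stand-in types (a plain rack string instead of BookieNode); it is a sketch of the idea, not the exact code in this PR.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.bookkeeper.net.BookieId;

class ShutdownBookieFirstSketch {
    // knownRacks: bookie -> rack for bookies that are still alive.
    static void groupEnsemble(List<BookieId> ensemble, Map<BookieId, String> knownRacks) {
        List<BookieId> inactive = new ArrayList<>();
        Map<String, List<BookieId>> toPlaceGroup = new HashMap<>();
        for (BookieId bookieId : ensemble) {
            String rack = knownRacks.get(bookieId);
            if (rack == null) {
                // The bookie shut down after the ledger check: replace it first,
                // ahead of any purely policy-driven moves.
                inactive.add(bookieId);
                continue;
            }
            toPlaceGroup.computeIfAbsent(rack, k -> new ArrayList<>()).add(bookieId);
        }
        // ... replace the inactive bookies, then fix placement for the grouped ones ...
    }
}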

return bn;
}
}
throw new BKNotEnoughBookiesException();
Contributor

Can we log something if we reach this point?

Member Author

It already logs: in doReplaceToAdherePlacementPolicy, it catches BKNotEnoughBookiesException and logs it.
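For reference, a minimal sketch of that catch-and-log pattern. Only BKNotEnoughBookiesException and the catch-and-log behaviour come from the discussion; the wrapper class and the helper replaceNotAdheringPlacementPolicyBookie are illustrative.

import org.apache.bookkeeper.client.BKException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class PlacementRepairSketch {
    private static final Logger LOG = LoggerFactory.getLogger(PlacementRepairSketch.class);

    void doReplaceToAdherePlacementPolicy(long ledgerId) {
        try {
            replaceNotAdheringPlacementPolicyBookie(ledgerId);
        } catch (BKException.BKNotEnoughBookiesException e) {
            // Not enough bookies in a different network location: log and skip; the Auditor
            // will flag the ledger again on its next placement policy check.
            LOG.warn("Not enough bookies to repair placement policy for ledger {}", ledgerId, e);
        }
    }

    private void replaceNotAdheringPlacementPolicyBookie(long ledgerId)
            throws BKException.BKNotEnoughBookiesException {
        // Stub standing in for the real replacement logic.
    }
}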

}
}

private int differBetweenBookies(List<BookieId> bookiesA, List<BookieId> bookiesB) {
Contributor

Nit: static?

Member Author

agree
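For illustration, a static version of such a helper might look like the sketch below; the counting logic (bookies in bookiesA missing from bookiesB) is an assumption about what the method does, not the PR's actual implementation.

import java.util.List;
import org.apache.bookkeeper.net.BookieId;

final class EnsembleDiffSketch {
    // Static helper: count how many bookies in bookiesA are not present in bookiesB.
    static int differBetweenBookies(List<BookieId> bookiesA, List<BookieId> bookiesB) {
        int differ = 0;
        for (BookieId bookieId : bookiesA) {
            if (!bookiesB.contains(bookieId)) {
                differ++;
            }
        }
        return differ;
    }
}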

int ackQuorumSize,
Set<BookieId> excludeBookies,
List<BookieId> currentEnsemble) {
throw new UnsupportedOperationException();
Contributor

What happens if you don't override this method?

Are we handling this exception in the code that calls this method?

My understanding is that we cannot provide a good default implementation.

In the calling code we could catch this exception, log something, and abort the operation gracefully.

Member Author

Yes, that's fine.
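A minimal sketch of the graceful abort discussed here: catch the UnsupportedOperationException, log it, and skip the repair. The surrounding class, interface, and names are illustrative stand-ins, not the PR's actual code.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class GracefulAbortSketch {
    private static final Logger LOG = LoggerFactory.getLogger(GracefulAbortSketch.class);

    // Stand-in for invoking replaceToAdherePlacementPolicy on the configured placement policy.
    interface PolicyCall {
        void replaceToAdherePlacementPolicy();
    }

    void repairIfSupported(long ledgerId, PolicyCall call) {
        try {
            call.replaceToAdherePlacementPolicy();
        } catch (UnsupportedOperationException e) {
            // The configured placement policy does not override the new method:
            // log and skip this ledger instead of failing the ReplicationWorker.
            LOG.warn("Placement policy does not support replaceToAdherePlacementPolicy,"
                    + " skipping placement repair for ledger {}", ledgerId, e);
        }
    }
}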

@eolivelli eolivelli self-requested a review August 7, 2022 14:07
@horizonzy
Member Author

@eolivelli ping, I have addressed the comments, could you review it again?

@equanz
Contributor

equanz commented Aug 8, 2022

And what about the default-rack shutdown bookies?
#2931 (review)

If it is being addressed, please let me know where it is.

@horizonzy
Member Author

And what about the default rack? #2931 (review)

If it is being addressed, please let me know where it is.

I mean that if we didn't handle the shutdown bookies, they would be handled as default-rack bookies; since the default rack differs from the other bookies' racks, they would never be replaced. Now we add the shutdown bookies to the exclude nodes so that they get replaced.
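A small sketch of that exclusion idea, with illustrative names (see the linked lines in RackawareEnsemblePlacementPolicyImpl for the actual code): shutdown bookies would otherwise resolve to the default rack and never look misplaced, so they are added to the exclude set, which forces the policy to pick replacements for them.

import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.bookkeeper.net.BookieId;

final class ExcludeShutdownBookiesSketch {
    static Set<BookieId> withShutdownBookiesExcluded(Set<BookieId> requestedExcludes,
                                                     List<BookieId> currentEnsemble,
                                                     Set<BookieId> knownBookies) {
        Set<BookieId> excludeBookies = new HashSet<>(requestedExcludes);
        for (BookieId bookieId : currentEnsemble) {
            if (!knownBookies.contains(bookieId)) {
                // The bookie has shut down; exclude it so a replacement is chosen for it.
                excludeBookies.add(bookieId);
            }
        }
        return excludeBookies;
    }
}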

@equanz
Contributor

equanz commented Aug 8, 2022

@horizonzy
Member Author

Now we add the shutdown bookies to excludes nodes to replace it.

Just to confirm, do you mean here? https://github.com/horizonzy/bookkeeper/blob/6ef1e2aa8ac0780bd8199b360e840506cbc85e5d/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/RackawareEnsemblePlacementPolicyImpl.java#L1101-L1105

yes

Contributor

@equanz equanz left a comment

Thank you for your explanation.
LGTM.

@StevenLuMT
Contributor

Fix old workflow, please see #3455 for details.

@horizonzy horizonzy closed this Aug 28, 2022
@horizonzy horizonzy reopened this Aug 28, 2022
@horizonzy
Member Author

rerun failure checks

1 similar comment
@hangc0276
Contributor

rerun failure checks

@eolivelli eolivelli merged commit fc981ba into apache:master Sep 5, 2022
@hangc0276
Contributor

This PR is an enhancement for auto recovery, and the new interface has a default implementation, which is compatible with the old version. I suggest cherry-picking it to branch-4.14 and branch-4.15. Do you have any suggestions? @merlimat @eolivelli @dlg99 @rdhabalia @zymap

@eolivelli
Contributor

@hangc0276 please ask on dev@
I agree with you.
Also 4.15 and 4.16 have some problems that are blocking the adoption.

hangc0276 pushed a commit to streamnative/bookkeeper-achieved that referenced this pull request Oct 18, 2022
@rdhabalia
Contributor

yes, we should cherry-pick it.

hangc0276 pushed a commit to streamnative/bookkeeper-achieved that referenced this pull request Mar 28, 2023
hangc0276 pushed a commit that referenced this pull request May 30, 2023