Skip to content

Commit

Permalink
Enable reorder read sequence for bk client by default (#4139)
Browse files Browse the repository at this point in the history
### Motivation

<!-- Explain here the context, and why you're making that change. What is the problem you're trying to solve. -->

If one ledger's ensemble is [bk0, bk1] and bk0 is down, the bookie client may send a read request to bk0 first then fail with the following errors, and resend the read request to bk1 in the end. 

```
2023-10-19T18:33:52,042 - ERROR - [BookKeeperClientWorker-OrderedExecutor-3-0:PerChannelBookieClient@563] - Cannot connect to 192.168.31.216:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId 192.168.31.216:3181, bookie does not exist or it is not running
2023-10-19T18:33:52,042 - INFO  - [BookKeeperClientWorker-OrderedExecutor-3-0:DefaultBookieAddressResolver@77] - Cannot resolve 192.168.31.216:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2023-10-19T18:33:52,042 - INFO  - [BookKeeperClientWorker-OrderedExecutor-3-0:PendingReadOp$LedgerEntryRequest@223] - Error: Bookie handle is not available while reading L6 E40 from bookie: 192.168.31.216:3181
```
One of the related issues is in the auto-recovery decommission and there is one PR in the BookKeeper repo: #4113

However, the bookie client already knows the bk0 is down and we should send the read request to bk1 first. So we can reorder the read request based on the known bookie list. If one bookie is lost, it will reorder the lost bookie to the end of the read list.

### Modifications

<!-- Describe the modifications you've done. -->

Enable the `reorderReadSequence` by default for auto-recovery.
  • Loading branch information
hangc0276 committed Feb 1, 2024
1 parent 1eceb5d commit dc2bb1d
Show file tree
Hide file tree
Showing 4 changed files with 11 additions and 5 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -1148,7 +1148,7 @@ public ClientConfiguration setRecoveryReadBatchSize(int batchSize) {
* @return true if reorder read sequence is enabled, otherwise false.
*/
public boolean isReorderReadSequenceEnabled() {
return getBoolean(REORDER_READ_SEQUENCE_ENABLED, false);
return getBoolean(REORDER_READ_SEQUENCE_ENABLED, true);
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,7 @@ public abstract class MockBookKeeperTestCase {
protected BookieClient bookieClient;
protected LedgerManager ledgerManager;
protected LedgerIdGenerator ledgerIdGenerator;
protected EnsemblePlacementPolicy placementPolicy;

private BookieWatcher bookieWatcher;

Expand Down Expand Up @@ -152,6 +153,7 @@ public void setup() throws Exception {
scheduler = OrderedScheduler.newSchedulerBuilder().numThreads(4).name("bk-test").build();
executor = OrderedExecutor.newBuilder().build();
bookieWatcher = mock(BookieWatcher.class);
placementPolicy = new DefaultEnsemblePlacementPolicy();

bookieClient = mock(BookieClient.class);
ledgerManager = mock(LedgerManager.class);
Expand Down Expand Up @@ -194,7 +196,7 @@ public BookieWatcher getBookieWatcher() {

@Override
public EnsemblePlacementPolicy getPlacementPolicy() {
return null;
return placementPolicy;
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ public class ReadLastConfirmedAndEntryOpTest {
private ScheduledExecutorService scheduler;
private OrderedScheduler orderedScheduler;
private ClientInternalConf internalConf;
private EnsemblePlacementPolicy mockPlacementPolicy;
private EnsemblePlacementPolicy placementPolicy;
private LedgerMetadata ledgerMetadata;
private DistributionSchedule distributionSchedule;
private DigestManager digestManager;
Expand Down Expand Up @@ -121,10 +121,11 @@ public void setup() throws Exception {
.build();

this.mockBookieClient = mock(BookieClient.class);
this.mockPlacementPolicy = mock(EnsemblePlacementPolicy.class);
//this.mockPlacementPolicy = mock(EnsemblePlacementPolicy.class);
this.placementPolicy = new DefaultEnsemblePlacementPolicy();
this.mockClientCtx = mock(ClientContext.class);
when(mockClientCtx.getBookieClient()).thenReturn(mockBookieClient);
when(mockClientCtx.getPlacementPolicy()).thenReturn(mockPlacementPolicy);
when(mockClientCtx.getPlacementPolicy()).thenReturn(placementPolicy);
when(mockClientCtx.getConf()).thenReturn(internalConf);
when(mockClientCtx.getScheduler()).thenReturn(orderedScheduler);
when(mockClientCtx.getMainWorkerPool()).thenReturn(orderedScheduler);
Expand Down
3 changes: 3 additions & 0 deletions conf/bk_server.conf
Original file line number Diff line number Diff line change
Expand Up @@ -1057,6 +1057,9 @@ statsProviderClass=org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvi
# Enable/disable having read operations for a ledger to be sticky to a single bookie.
stickyReadSEnabled=true

# Enable/disable reordering read sequence on reading entries.
reorderReadSequenceEnabled=true

# The grace period, in milliseconds, that the replication worker waits before fencing and
# replicating a ledger fragment that's still being written to upon bookie failure.
# openLedgerRereplicationGracePeriod=30000
Expand Down

0 comments on commit dc2bb1d

Please sign in to comment.