-
Notifications
You must be signed in to change notification settings - Fork 888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 709: Add Slow Bookkeeper Servers to Placement Policy for read ordering #776
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of adding a bean instead of raw collections.
Averall lokks good to me, but I left some minor comments
@@ -79,7 +85,8 @@ | |||
static final int REMOTE_MASK = 0x04 << 24; | |||
static final int REMOTE_FAIL_MASK = 0x08 << 24; | |||
static final int READ_ONLY_MASK = 0x10 << 24; | |||
static final int UNAVAIL_MASK = 0x20 << 24; | |||
static final int SLOW_MASK = 0x20 << 24; | |||
static final int UNAVAIL_MASK = 0x40 << 24; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we preserve old value for constant ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the order is important, since candidate bookies are sorted by these masks. You want to select a slow bookie before an unavailable one. You probably want to select a slow bookie before a readonly bookie too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I understand
@Override | ||
public void registerSlowBookie(BookieSocketAddress bookieSocketAddress, long entryId) { | ||
slowBookies.put(bookieSocketAddress, entryId); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove empty line
DistributionSchedule.WriteSet expectedSet | ||
= writeSetFromValues(3, 1, 2, 0); | ||
LOG.info("reorder set : {}", reorderSet); | ||
assertFalse(reorderSet.equals(origWriteSet)); | ||
assertEquals(expectedSet, reorderSet); | ||
} | ||
|
||
@Test | ||
public void testnodeSlow() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: testNodeSlow
* @param bookieSocketAddress | ||
* @return failed entries on a bookie, -1 if there has been no failures. | ||
*/ | ||
long getBookieFailureHistory(BookieSocketAddress bookieSocketAddress) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make this and other getters public, placement policies are pluggable, so if you pass an object to a policy it should be able to "use" it
* This class encompasses heuristics used to determine the health of a Bookkeeper server for read | ||
* ordering. Currently we use the history of failed entries and length of the pending requests queue. | ||
*/ | ||
public class BookKeeperServerHealthInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class provides info about multiple bookies.
what about "BookiesHealthInfo" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eolivelli If naming is not a big concern, let's skip it or rename it later, since it is a port back from twitter branch.
I think it is good to make this class an interface, since it will be used and exposed in EnsemblePlacementPolicy. we should do similar thing as DistributedSchedule.WriteSet. However, preserve the naming here and do a separate change to make it an interface, it will make change porting easier for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I didn't know it is a port. Maybe it was not clear from the description.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I didn't know it is a port. Maybe it was not clear from the description.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could make this change - it wasn't a clean port so I had to reimplement a lot of stuff anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great idea.
A question in my mind is that, How it comes when a bookie is first slow, but then after a fix or something it becomes fast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 lgtm, we can improve code in later changes
What is the rationale for trying read-only before slow? Surely a write to the read-only is likely to fail. |
lh.bookieFailureHistory.asMap(), | ||
lh.distributionSchedule.getWriteSet(eId)); | ||
bookKeeperServerHealthInfo, | ||
lh.distributionSchedule.getWriteSet(eid)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eid? I don't see this defined in the local scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why it's showing "eId" when the change looks like it says "entryId" now. In any case, this value defined in the super class LedgerEntry:
abstract class LedgerEntryRequest extends LedgerEntry implements SpeculativeRequestExecutor {
and
public class LedgerEntry
implements org.apache.bookkeeper.client.api.LedgerEntry {
final long ledgerId;
long entryId;
long length;
ByteBuf data;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually looks like the LedgerEntry interface was changed in a recent commit - I'll need a bit more time to incorporate that change. Will submit changes soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
boolean completed = super.complete(bookieIndex, host, buffer); | ||
if (completed && lh.bk.getConf().getEnsemblePlacementPolicySlowBookies()) { | ||
int numReplicasTried = getNextReplicaIndexToReadFrom(); | ||
// Check if any speculative reads were issued and mark any slow bookies before |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broken indent
@Override | ||
boolean complete(int bookieIndex, BookieSocketAddress host, ByteBuf buffer) { | ||
boolean completed = super.complete(bookieIndex, host, buffer); | ||
if (completed && lh.bk.getConf().getEnsemblePlacementPolicySlowBookies()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we need to check if this is enabled here. if disabled, the placement policy should just ignore the extra info.
int ensembleSize = ensemble.size(); | ||
|
||
for (int i = 0; i < writeSet.size(); i++) { | ||
int idx = writeSet.get(i); | ||
Integer idx = writeSet.get(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably do this without autoboxing, but not something to worry about for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
writeSet.set(i, idx | UNAVAIL_MASK); | ||
} else { | ||
writeSet.set(i, idx | READ_ONLY_MASK); | ||
if (slowBookies.asMap().containsKey(address)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yikes, this is going to run for each entry and allocates a map. Why not use getIfPresent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, maybe not. Still though, getIfPresent is more direct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is getIfPresent
in LoadingCache. If I am correct, the only way to get a key from loading cache without loading, is asMap().containsKey(address))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sijie is correct - there isn't a getIfPresent() function for LoadingCaches. Another option is to call get() and if it returns -1 (defined by the LoadingCache constructor), we can assume it doesn't exist. Thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both get
and getUnchecked
do loading the key. we don't need loading a key here, so I think we should stick to asMap().containsKey(address)
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getIfPresent is part of Cache, which LoadingCache inherits from.
https://google.github.io/guava/releases/20.0/api/docs/com/google/common/cache/LoadingCache.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in latest commit
for (int i = 0; i < writeSet.size(); i++) { | ||
writeSetWithRegion.put(writeSet.get(i), ""); | ||
} | ||
return reorderReadSequenceWithRegion(ensemble, writeSet, writeSetWithRegion, bookKeeperServerHealthInfo, false, "", writeSet.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is all this region stuff going into rack aware placement policy? Shouldn't it be region aware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. The reasoning is that I wanted to try to avoid duplication in RackAware and RegionAware placement policies, so I put the shared code in RackAwarePlacementPolicyImpl so that both of the policies can access it since RegionAware is a sublcass of RackAware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. We should probably refactor this at some point. methods with loads of parameters and boolean switches become very hard to follow and maintain.
I am not sure about. I think slow should be before read-only. @philipsu522 @yzang can you guys answer this? |
@jiazhai, the loading cache will evict the slow bookie after a certain amount of time based on last write , so if it becomes fast then there won't be any speculative reads. No more speculative reads means that the cache won't be updated for the bookie, and it will be evicted after a certain time. |
I think that we put slow bookies behind read-only bookies thinking that slow bookies are more likely to hurt performance, while bookies could be in RO for any number of reasons. However, based on experience of everyone else, if RO bookies seem to be less reliable than slow bookies, I don't mind changing the order so that it is available bookies, slow bookies, RO bookies...
I believe that the new feature in placement policy is used for reads, not writes |
@philipsu522 re: slow vs read only, makes sense. I had it in my head that this was used for writes, but reordering only happens for reads. |
@philipsu522 could you merge master into this, and address any comments when you have a chance. If it's left too long it'll be a nightmare to get in. |
Yep, sorry about that - I'll get this done today |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
@philipsu522 can you help me rebase this to latest master?
Author: eolivelli <eolivelli@apache.org> Reviewers: Sijie Guo <sijie@apache.org> This closes apache#708 from eolivelli/notice-jline, closes apache#707
Upcoming changes are using GRpc in conjunction with protobuf. GRpc generates code that created deprecation warnings when compiled with java 8, so this change moves all generated code out to another module, so that we don't have to turn off -Werror for all code. In any case, at some point we should split the bookkeeper client out from the server module, at which point we would need the definitions on a common place. Author: Ivan Kelly <ivank@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None>, Sijie Guo <sijie@apache.org> This closes apache#711 from ivankelly/proto-module, closes apache#691
Descriptions of the changes in this PR: - release `toSend` bytebuf only when a PendingAddOp is recycled. - fix MockBookKeeperTestCase to run callbacks in the same executor thread. Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#717 from sijie/issue_176, closes apache#716
Descriptions of the changes in this PR: This request is for DL bookkeeper version upgrade. Because twitter's bk version doesn't include this change, so any ledgers written by new bk version will fail to be read by old version. To simplify the upgrade story, it is good to allow not serialize ctime field. Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None>, Sijie Guo <sijie@apache.org>, Matteo Merli <mmerli@apache.org> This closes apache#718 from sijie/flag_to_not_serialize_ctime, closes apache#490
…bookies Descriptions of the changes in this PR: This is a change ported from twitter branch. - Enable recovering multiple bookies in bookie shell - Add a unit test for BookieShell. Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#714 from sijie/recover_command_fix, closes apache#612
Descriptions of the changes in this PR: - add `getLength()` to ReadHandle interface - add `getLastAddPushed` to WriteHandle interface - add unit tests and fix issue in MockBookKeeperTestCase Author: Jia Zhai <jiazhai@users.noreply.github.com> Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#715 from sijie/api/support_get_length, closes apache#694
Integrating some changes made to Yahoo bookkeeper by Matteo Merli. The apache bookkeeper thread pool in use by OrderedScheduler is different, and does not have access to setting a queue size limit, so I created QueueAssessibleExecutorService, and check the queue size instead. Other than that, this code change is pretty similar to Yahoo's. Original bug description for this change: Since we are using a thread pool to handle read requests in bookies, we have seen that when the Bookie read IO is maxed out, the requests are being accumulated in the bookie. Essentially, the bookie is busy serving read requests that have already been timed out on the client side (end to end read latency can reach hours..). The queue keeps growing indefinitely, leading to OOM errors and bookie restart. All unit tests pass for me locally with this change. Author: Aaron Gresch <agresch@yahoo-inc.com> Reviewers: Ivan Kelly <ivan@ivankelly.net>, Jia Zhai <None>, Sijie Guo <sijie@apache.org> This closes apache#706 from agresch/agresch_cms1254
Descriptions of the changes in this PR: Change the link of slack signup from http://apachebookkeeper.slack.com/ to https://apachebookkeeper.herokuapp.com/ in https://bookkeeper.apache.org/community/slack/ and click the self-register link. Author: Jia Zhai <zhaijia@apache.org> Reviewers: Sijie Guo <sijie@apache.org> This closes apache#725 from jiazhai/issue-724, closes apache#724
Descriptions of the changes in this PR: Problem: The problem is introduced after making storing ctime optional. the verification become inconsistent because the way how ctime was assigned and compared. It causes the test cases failing in apache#726 Solution: This change is to change `isConflict` to ignore comparing ctime if client disables storing system time as ctime. Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#730 from sijie/fix_ctime_issue, closes apache#726
Descriptions of the changes in this PR: mark new api as experimental - `InterfaceStability.Unstable` Author: Sijie Guo <sijie@apache.org> Reviewers: Ivan Kelly <ivan@ivankelly.net>, Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#736 from sijie/mark_api_experimental, closes apache#713
… are moved to bookkeeper-proto Descriptions of the changes in this PR: Problem: The issue was introduced after protobuf files are moved to bookkeeper-proto. so the protobuf is not shaded correctly. So in bookkeeper-benchmark, it is conflicted with the protobuf version introduced by hadoop. Solution: - update the bookkeeper-server pom.xml to shade protobuf correctly - remove hadoop dependency from bookkeeper-benchmark. Author: Sijie Guo <sijie@apache.org> Reviewers: Ivan Kelly <ivan@ivankelly.net>, Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#735 from sijie/protobuf_issue, closes apache#734
…long poll read Descriptions of the changes in this PR: 1, add a class LastAddConfirmedAndEntry and metnod readLastAddConfirmedAndEntry() in ReadHandle; 2, add implementation for readLastAddConfirmedAndEntry in LedgerHandler; 3, add testcase in BookKeeperApiTest; 4, remove un-used imports, break down long lines, fix wrong comments. Author: Jia Zhai <zhaijia@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>, Venkateswararao Jujjuri (JV) <None> This closes apache#729 from zhaijack/issue-550, closes apache#550
…4.5.0 Descriptions of the changes in this PR: - 4.5.1 should appear before 4.5.0 - bump dlog version to 0.5.0 Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com> This closes apache#748 from sijie/issue_747, closes apache#747
In LedgerEntry interface, getEntry() and getEntryBuffer() have completely different behaviours. getEntry() releases bytebuf automatically, while getEntryBuffer() returns the bytebuf (if you don't call getEntry, you are responsible for releasing the entry buffer. In this change: make getEntry() reenter-able; make LedgerEntry implement AutoCloseable; LedgerEntry is responsible for releasing the bytebuf it is holding. Descriptions of the changes in this PR: 1. add `close` and `duplicate` in `api.LedgerEntry`, 1. LedgerEntryImpl implements `api.LedgerEntry`, 1. client.LedgerEntry doesn't implement the new api. it wraps an `api.LedgerEntry`, 1. add `close` in api.LastConfirmedAndEntry." 1. fix build and test errors. Author: Jia Zhai <zhaijia@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org> This closes apache#738 from zhaijack/issue-731, closes apache#731
Descriptions of the changes in this PR: - move src/ to bookkeeper-dist/src/ - built two packages in bookkeeper-dist: one is `bookkeeper-server` - server only package with prometheus stats provider and vertx http server, the other one is `bookkeeper-all` - includes all packages. - remove assembly plugin from bookkeeper-server module - update release guide Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#737 from sijie/add_slim_module, closes apache#688
Descriptions of the changes in this PR: The documentation for release `4.6.0` was copied by apache#696. But this release is not updated in the releases menu. This change is generated by running `site/scripts/release.sh` for preparing 4.6.0 release. Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#752 from sijie/update_4.5.1_release_date
Author: eolivelli <eolivelli@apache.org> Reviewers: Jia Zhai <None>, Sijie Guo <sijie@apache.org> This closes apache#758 from eolivelli/updaterelmd451-460
Descriptions of the changes in this PR: The master version was not updated when creating `branch-4.6`. Author: Sijie Guo <sijie@apache.org> Reviewers: Jia Zhai <None> This closes apache#767 from sijie/bump_master_version, closes apache#766
…ndle, WriteAdvHandle, ReadHandle Descriptions of the changes in this PR: - WriteHandle supports append byte[], ByteBuf and ByteBuffer - WriteAdvHandle supports write byte[], ByteBuf and ByteBuffer - LedgerEntry returns byte[], ByteBuffer and ByteBuf Author: Sijie Guo <sijie@apache.org> Reviewers: Ivan Kelly <ivan@ivankelly.net>, Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#755 from sijie/issue_750, closes apache#750
Context in apache#753 For deployments where a RAID battery-backed cache is not available and where durability is not a requirements, we should allow bookies to rely on page cache and relax durability. Author: Matteo Merli <mmerli@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org> This closes apache#754 from merlimat/option-to-disable-sync-master, closes apache#753
Descriptions of the changes in this PR: - add `org.apache.bookkeeper.client.api` in `BookKeeper Client (New Fluent API - Experimental)` - move prometheus package to `BookKeeper Stats Providers` Author: Sijie Guo <sijie@apache.org> Reviewers: Ivan Kelly <ivank@apache.org>, Jia Zhai <zhaijia@apache.org> This closes apache#768 from sijie/fix_javadoc_layout, closes apache#763
In the new API proposed in apache#506 , we return Iterable<LedgerEntry> for read entries. It is a bit problematic when returning multiple iterators and have no ability to release the entry buffers hold by ledger entries. In this change, the implementation of retainIterator() increased the reference of the buffers when creating an iterator, it will be easy for byteBuf's management; it also make LedgerEntries a recycle object, and it is returned to the pool when #close() is called. on #close(), it also releases all the references to release resources. Descriptions of the changes in this PR: 1. add interface LedgerEntries; 2. add implemtation of LedgerEntries; 3. fix related build and test errors. Author: Jia Zhai <zhaijia@apache.org> Reviewers: ivankelly, sijie, eolivelli This closes apache#727 from zhaijack/issue-693, closes apache#693
Descriptions of the changes in this PR: - Add documentation for the new fluent API: create/open/delete, append/write/read, createadv Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#764 from sijie/api_documentation, closes apache#553
@sijie I just rebased master |
@@ -318,6 +344,30 @@ void writeLedgerConfig(GenericCallback<Void> writeCb) { | |||
bk.getLedgerManager().writeLedgerMetadata(ledgerId, metadata, writeCb); | |||
} | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate code from method above.
* | ||
* @return BookiesHealthInfo for every bookie in the write set. | ||
*/ | ||
public BookiesHealthInfo generateHealthInfoForWriteSet( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very allocatey for something that runs for every read request. It allocates a new map for bookiePendingMap, and converts bookieFailureHistory into a map, and allocates a new BookiesHealthInfo object, but all of these are just references to info elsewhere in the client.
Instead of this, we can have BookiesHealthInfo as a interface, and have a member field in LedgerHandle or BookKeeper.
final BookiesHealthInfo bookieHealthInfo = new BookiesHealthInfo() {
@Override
public long getBookieFailureHistory(BookieSocketAddress bookieSocketAddress) {
Long lastFailure = bookieFailureHistory.getIfPresent(bookieSocketAddress);
return lastFailure == null ? -1L : lastFailure;
}
@Override
public int getBookiePendingRequests(BookieSocketAddress bookieSocketAddress) {
PerChannelBookieClientPool pcbcPool = client.lookupClient(address);
return pcbcPool == null ? 0 : pcbcPool.getNumPendingCompletionRequests();
}
}
@ivankelly makes sense - pushed a change to convert BookiesHealthInfo into an interface |
@ivankelly can you review this PR again after @philipsu522 addressed your comments? |
@sijie @philipsu522 I'll take a look first thing tomorrow. Got caught up on other things today. @philipsu522 Seems this needs another merge from master. I guess it's checkstyle changes, so the easiest thing to do would be take your changes in the conflict resolution, compile and fix any issues again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@philipsu522 lgtm. Just needs to be updated to master before merging in.
Descriptions of the changes in this PR:
Maintain a list of slow bookkeeper hosts, which are tried only after readonly bookkeeper hosts are tried. Bookkeeper hosts can be categorized as "slow" when a speculative timeout occurs on that bookkeeper host.