Cache replicated entries & don't block main task w/ replication #88

Merged
merged 3 commits into from
Nov 24, 2020


@thedodd thedodd commented Nov 20, 2020

closes #12
closes #76
closes #87

With this change, we are also caching entries which come from the leader
replication protocol. As entries come in, we append them to the log and
then cache the entry. When it is safe to apply entries to the state
machine, we will take them directly from the in-memory cache instead of
going to disk.
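
A minimal sketch of the append-then-cache flow described above. It is not the code from this PR: `Entry`, `Log`, `Follower`, and the method names are hypothetical stand-ins, and the "log" is a plain `Vec` rather than real disk storage.

```rust
use std::collections::BTreeMap;

/// A replicated log entry (hypothetical shape, not the crate's own type).
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    index: u64,
    term: u64,
    payload: String,
}

/// Stand-in for durable log storage; a real implementation would write to disk.
#[derive(Default)]
struct Log {
    entries: Vec<Entry>,
}

/// Follower-side state: the log plus an in-memory cache of entries that
/// have been appended but not yet applied to the state machine.
#[derive(Default)]
struct Follower {
    log: Log,
    cache: BTreeMap<u64, Entry>,
    last_applied: u64,
    applied: Vec<String>, // stand-in state machine
}

impl Follower {
    /// Append each incoming entry to the log first, then cache it.
    fn append_entries(&mut self, entries: Vec<Entry>) {
        for entry in entries {
            self.log.entries.push(entry.clone());
            self.cache.insert(entry.index, entry);
        }
    }

    /// Apply cached entries up to and including `commit_index`, taking them
    /// from the cache instead of reading the log. Returns how many were applied.
    fn apply_committed(&mut self, commit_index: u64) -> usize {
        let mut count = 0;
        loop {
            // Lowest cached index, copied out so the borrow of the cache ends here.
            let first = self.cache.keys().next().copied();
            let index = match first {
                Some(i) if i <= commit_index => i,
                _ => break,
            };
            let entry = self.cache.remove(&index).expect("key was just observed");
            self.applied.push(entry.payload);
            self.last_applied = index;
            count += 1;
        }
        count
    }
}
```

In this sketch, entries above the commit index simply stay in the cache until a later call raises `commit_index`.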

Moreover, and most importantly, we are no longer blocking the
AppendEntries RPC handler with the logic of the state machine
replication workflow. There is a small amount of async task juggling to
ensure that we don't run into situations where we would have two writers
attempting to write to the state machine at the same time. This is
easily avoided in our algorithm.
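
One way to picture the single-writer guarantee is a queue drained by exactly one worker. The sketch below is an illustration under assumptions, not this PR's implementation: the PR uses async tasks, whereas this swaps in a standard-library thread and channel to stay dependency-free, and `Applier`, `submit`, and `shutdown` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical state machine: it just records applied payloads in order.
/// The mutex exists so the caller can inspect the result afterwards; the
/// single-writer property comes from having only one consumer of the queue.
type StateMachine = Arc<Mutex<Vec<String>>>;

/// Handle held by the RPC handler. Submitting a batch does not wait for
/// the batch to be applied.
struct Applier {
    tx: Sender<Vec<String>>,
    worker: JoinHandle<()>,
}

impl Applier {
    /// Spawn the one and only worker that writes to the state machine.
    fn spawn(sm: StateMachine) -> Self {
        let (tx, rx) = channel::<Vec<String>>();
        let worker = thread::spawn(move || {
            // Batches are drained one at a time, in the order they were sent,
            // so two writers never touch the state machine concurrently.
            for batch in rx {
                let mut guard = sm.lock().unwrap();
                guard.extend(batch);
            }
        });
        Applier { tx, worker }
    }

    /// Called from the AppendEntries handler: queue the batch and return.
    fn submit(&self, batch: Vec<String>) {
        self.tx.send(batch).expect("applier worker has stopped");
    }

    /// Close the queue and wait until every queued batch has been applied.
    fn shutdown(self) {
        drop(self.tx);
        self.worker.join().expect("applier worker panicked");
    }
}
```

The handler side only ever calls `submit`, so its latency no longer depends on how long the state machine takes to apply a batch.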

@thedodd thedodd added the bug, enhancement, and replication labels Nov 20, 2020
@thedodd thedodd self-assigned this Nov 20, 2020

thedodd commented Nov 20, 2020

@MarinPostma & @sunli829 I was wondering if you two would be interested in reviewing this PR. Pretty happy with how simple this turned out to be. There were some complexities to think through to ensure that we don't have any Raft "safety" violations as we transition from follower to leader, as cluster leadership changes, and to ensure the cache can be trusted. Fortunately, all quite simple at the end of the day.

@thedodd thedodd force-pushed the 12-cache-replicated-entries-and-dont-block branch from 6cb62ac to 1441e7b Compare November 20, 2020 02:26
@MarinPostma (Contributor) commented

Hello @thedodd ! That was fast! I didn't know about OrderedFutures but that nicely does the trick! I will try it later today. Thank you!!

The log index provided to the log compaction interface was a bit
misleading. When performing log compaction, the compaction can only
cover the breadth of the log up to the last applied log (obvs) and under
write load, this value may change quickly. As such, the expectations of
the log compaction interface have been refined and clarified.

Now, the only expectation is that the storage implementation will
export/checkpoint/snapshot its state machine, and then use the value of
that export's last applied log as the metadata indicating the breadth of
the log covered by the snapshot.
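
A rough sketch of the refined expectation: the storage layer is not handed an index to compact through; it snapshots its own state machine and reports that snapshot's last applied index. `Snapshot`, `Storage`, and all method names here are hypothetical, and trimming the covered log entries inside `compact` is an assumption of this sketch rather than something the commit message states.

```rust
/// Hypothetical snapshot: the exported state plus the last log index it covers.
#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    last_applied: u64,
    data: Vec<String>,
}

/// Hypothetical storage holding both the log and the state machine.
#[derive(Default)]
struct Storage {
    log: Vec<(u64, String)>, // (index, payload) entries still in the log
    state: Vec<String>,      // stand-in state machine
    last_applied: u64,
}

impl Storage {
    /// Append an entry to the log without applying it.
    fn append(&mut self, index: u64, payload: &str) {
        self.log.push((index, payload.to_string()));
    }

    /// Apply every not-yet-applied log entry up to and including `index`.
    fn apply_through(&mut self, index: u64) {
        for (i, payload) in &self.log {
            if *i > self.last_applied && *i <= index {
                self.state.push(payload.clone());
                self.last_applied = *i;
            }
        }
    }

    /// Compact without being told an index: export the state machine, then
    /// use the export's own last applied index as the snapshot metadata.
    fn compact(&mut self) -> Snapshot {
        let snapshot = Snapshot {
            last_applied: self.last_applied,
            data: self.state.clone(),
        };
        // Assumption: entries covered by the snapshot can be dropped from the log.
        self.log.retain(|(index, _)| *index > snapshot.last_applied);
        snapshot
    }
}
```

Because the index is read from the export itself, it stays consistent with the snapshot's contents even if more entries are applied while compaction is being scheduled.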
@thedodd thedodd merged commit 5835bf4 into master Nov 24, 2020
@thedodd thedodd deleted the 12-cache-replicated-entries-and-dont-block branch November 24, 2020 15:46