Allow servers to compact their logs once all events have been acked #31
Conversation
So, there's actually still a problem with this algorithm. Consider the following scenario: …

There are a couple of ways this could be handled. First, servers could compact logs only up to the previous command for which an event was published to the client. However, the downside is that a session that receives events infrequently could keep the log from being compacted: if a session receives one event over its entire lifetime, logs can't be compacted until the session is unregistered. In that case, the …

But a better way to do this might be to use the client's …

There's also a liveness issue here with respect to log compaction. If log compaction depends on clients acking events received via their sessions, that implies that a client must have events to ack. However, this may frequently not be the case. Some mechanism is necessary to ensure that log compaction advances for sessions that receive events infrequently or not at all. If a client never receives published events, how can it acknowledge that it knows about an event index? To resolve this final issue, I propose using information about the event history following the last event received by a client. Once a client acks an event at index …

…index if no events are queued.

…d command outputs in the internal state machine.
This PR modifies the way logs are committed/compacted to fix #25.
Currently, Copycat does not allow servers to compact any segment that contains entries that haven't been stored on 100% of the servers in the cluster. The reasoning behind this decision is that session events depend on commands being applied to the state machine. When events are published to a session, they're held in memory on each server until received by the client. Events are sequenced in a manner similar to how the leader sequences entries to followers: when an event batch is sent to a client, the server sends with it the index of the previous event. The client verifies that it received the previous event and acks the new event in the response. While it's normally safe for a server to take a snapshot before other servers have applied a command, the dependency of events on commands means that if one server doesn't apply all commands for each session, servers can end up with inconsistent event histories. Thus, to avoid this case, Copycat currently requires that a command be persisted on all servers before it can be removed from the log.

But this approach is not ideal for obvious reasons. As long as at least one server in a configuration is down, no server can compact its logs. This PR proposes a fix to allow servers to compact their logs once all events have been received by their respective clients.
Servers already track which events have been received by which clients through keep-alive requests. When a client sends a `KeepAliveRequest` for an open session, it sends with it the index of the highest event for which it has received a batch. Event indexes are based on the commands by which events were triggered. For instance, if the application of a command at index `100` triggers the publishing of an event to 5 sessions, the index of that event will be `100`. Once each session has received event `100` and sends a `KeepAliveRequest`, a `KeepAliveEntry` with the highest index received by the client is logged, replicated, committed, and applied to the internal state machine.

So, since the system already logs and replicates information about the highest event index received for each session, and because event indexes are based on log indexes, we can use the lowest acked event index across all sessions as the basis for safe log compaction. When a client submits a
`KeepAliveRequest` and an associated `KeepAliveEntry` is committed and applied, each server recalculates the lowest index received by any session. Session IDs are initialized to the index of the session's `RegisterEntry`, and so each session's event index is initialized to that index as well (by definition, sessions can't receive events prior to their registration). This ensures that the lowest received index is not reset to `0` when a new session is registered. When the lowest index received by all sessions is recalculated, the log is updated to make segments containing earlier entries available for compaction.
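The recalculation described above can be sketched as follows. The names here are hypothetical and the sketch glosses over Copycat's actual internal state machine; it only shows the invariant: each session's acked index starts at its registration index, keep-alives ratchet it forward, and the minimum across open sessions bounds what the log may compact.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the compaction-index calculation. Each session's
// acked event index is initialized to its registration index (its session
// ID), so a newly registered session never pulls the minimum back to 0.
class CompactionTracker {
    private final Map<Long, Long> ackedIndexes = new HashMap<>();

    // Applied for a RegisterEntry: the session ID is the entry's log index,
    // and the session can't have received events before that index.
    void registerSession(long sessionId) {
        ackedIndexes.put(sessionId, sessionId);
    }

    // Applied for a KeepAliveEntry carrying the highest event index the
    // client has received; only ever moves a session's index forward.
    void ack(long sessionId, long eventIndex) {
        ackedIndexes.merge(sessionId, eventIndex, Math::max);
    }

    // Applied for an expired or unregistered session, which no longer
    // constrains compaction.
    void unregisterSession(long sessionId) {
        ackedIndexes.remove(sessionId);
    }

    // Segments whose entries all fall at or below this index are safe to
    // compact: every open session has acked events up to at least here.
    long safeCompactionIndex() {
        return ackedIndexes.values().stream()
            .mapToLong(Long::longValue)
            .min()
            .orElse(Long.MAX_VALUE);
    }
}
```

For example, with sessions registered at indexes `10` and `20`, the safe compaction index is `10` until session `10` acks an event, after which the other session becomes the limiting factor.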