-
Notifications
You must be signed in to change notification settings - Fork 156
Handle compacted entries during AppendRequest consistency checks #60
Comments
bTW the conversation referenced is in #29 |
Actually, the description above is somewhat wrong... Entries where // For non-null entries, we determine whether the entry should be exposed to the Raft algorithm
// based on the type of entry and whether it has been cleaned.
if (entry != null) {
// If the entry has not been cleaned by the state machine, return it. Note that the call to isClean()
// on the segment will be done in O(1) time since the search was already done in the get() call.
if (!segment.isClean(index)) {
return entry;
}
// For tombstone entries, if the entry index is greater than the majorIndex, return the entry. This ensures that entries
// are replicated until persisted on all available servers.
else if (entry.isTombstone()) {
if (index > compactor.majorIndex()) {
return entry;
}
}
// If the entry is not a tombstone, return the entry if the index is greater than minorIndex. If the entry is less than minorIndex,
// that indicates that it has been committed and events have been received for the entry, so we can safely stop replicating it.
else {
if (index > compactor.minorIndex()) {
return entry;
}
}
} This hides entries for which events have been received and tombstones that have been stored on all servers as necessary. But an additional condition needs to be added to handle the case of the last committed entry: if (index >= segments.commitIndex()) {
return entry;
} The last case will ensure that |
i think the fix for handling
null
(cleaned) entries duringAppendRequest
consistency checks may be wrongheaded:5001868
So, to redefine the problem, when checking the term of the previous entry while handling an
AppendEntries
RPC, if the entry has been cleaned it will benull
. That causes the consistency check to fail. What we did in the fix for this way ensure thatglobalIndex
is always the lowestmatchIndex
minus1
, but this only fixes the problem for tombstones. Normal entries can still beclean()
ed and thus hidden from reading after theglobalIndex
. So it's likely this problem would still occur if theNoOpEntry
in our scenario were not changed to a tombstone.I think the case is actually that we could safely hide all committed and cleaned entries. If an entry returns
null
from the log, we know it must necessarily have been committed and therefore consistency checks are not necessarily even needed. Any leader that's elected will have the same entry at the same term simply as a product of the Raft election algorithm.But I think just to feel good about it we should still keep the last committed entry in the log visible at all times. As I mentioned, the log compaction process already ensures that the last committed entry always remains on disk and will never be removed, so I think I was wrong and you were right @madjam that we should simply ensure the last committed entry in the log always returns a non-null entry. So, a property of the log would be that entries from
commitIndex
tolastIndex
are always non-null, and that would make this feel a lot better.The text was updated successfully, but these errors were encountered: