-
Notifications
You must be signed in to change notification settings - Fork 136
Conversation
The (open) segments now contain the first index of the encoded log entries and a checksum over the format and first index fields. The 24 byte header looks like: _________________________________________ |[0 - 7] | [8-15] | [16-19] | [20-23]| |FORMAT | FIRST_INDEX | CRC | UNUSED| |________|_____________|_________|________| This allows us to no longer guess the start index and do more direct checks. This also allows removing the barrier when taking a snapshot.
Downgrades all segments in a directory from version 2 to version 1. Use in case one has to downgrade dqlite versions.
I would be in favor of removing the non-blocking variant of the barrier. I didn't think about it through, so not sure if there are other alternatives. If the only options are 1) the new format 2) fire the barrier for the very first sector, then 1) sounds a bit better, although I'm a tad surprised that it requires a large diff (I didn't go through it yet). Option 2) might be more palatable if we want to avoid the need of an additional recovery program, or if it boils down to a simpler change. I'll try to think of alternatives, but I don't have much time these days. I'm not totally sure a recovery program is needed, but I can see why you'd want it. If we could go without a recovery program and if the change can be simplified, perhaps option 1) is better. Just first impression / thinking loud :) |
yeah, the diff is large because I tried to simplify a bunch of logic and drop some branches, but you don't really have to review this, it's more for the discussion, if I would open a final PR I would split this in smaller commits. |
Perhaps there is a third option. Unless I'm missing something, the issue we have is that we can't disambiguate between these two situations:
Situation 1. was the only one the code was handling up to this commit, which made the code correctly handle situation 2. but regressed the handling of situation 1., which was then fixed again by PR #264, where the ambiguity was identified (but not the regression that was introduced by the older commit). Indeed when merging #264 we realized that strictly speaking that fix was only needed for the very first snapshot, which is basically situation 1. (which as said we didn't realize that we had regressed in the first place). What about if we simply name the very first open segment (the one starting at entry 1) with a special filename (e.g. After loading entries in |
I think that could work. |
Yeah, it should indeed, that's logically equivalent to adding the entry index to the header of the segment, but we'd use the filename instead of the header, and we'd do that only for the very first open segment, which is the one we need to disambiguate. Note also that in case an open segment is found with a name different than |
That will break existing servers that upgrade I think. If they get powered down before |
Another problem is with new joining servers I think, they have pristine state so they will start with |
Good point. One way to handle that could be to bump the version number contained in the segment header. That would allow to disambiguate between the two situations. However, in practice I think right now we don't need to put in place neither the check nor the version bump. Introducing the check could be simply deferred further in the future when we'll be reasonably confident that older versions have been upgraded. |
We could also introduce a new standalone file (say But all things considered, I believe it's not practically needed right now, and we can introduce it at any time. |
Good point too. If this case exhausts all possible cases, it doesn't sound too bad. But perhaps there are better alternatives.
Doing a barrier on the first snapshot is fine, we can refine an alternate solution later on. |
Yeah, it's probably not too bad, I'm going to give it a go ... :-) |
This is work in progress I started a while ago to encode the start index in the segment header, leading to a new on-disk segment format.
The 24 byte header would look like: the crc is over the format and the first index header fields.
old 8 byte header
It's quite a big change and would lead to the following consequences:
non-blocking barrier
, a barrier that still allows writes to new segments and simplify the code.The goal is not to review this PR, but start discussion if this is a worthwhile change to do. The barrier logic is already hard enough as it is, and the blocking / non-blocking stuff makes it harder, especially when multiple barriers can be interacting with each other. I'm thinking a "blocking" truncate barrier following an already active non-blocking barrier, then I think we should upgrade the non-blocking barrier to blocking because we don't want to append any new writes etc.... but this just becomes a huge mess that could be avoided if we got rid of that distinction altogether.
edit: I could also just remove the non-blocking barrier and only fire a barrier for the very first snapshot like free suggests here #264 (comment) and then live with the fact that writes will be blocked when that snapshot is being taken, but that doesn't sound like a huge problem. I mainly opened this PR and discussion to simplify the barrier logic because I need to touch it to solve #372 and the blocking / non-blocking distinction is complicating matters.