-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VSR: Generalize trailer #1351
VSR: Generalize trailer #1351
Conversation
82d0b72
to
6bac783
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a git mv
from free_set_encoded.zig
; not sure why it is showing it as a new file. 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I've recently learned, git mv
is literally mv && git add
, it doesn't actually record the fact that this was a move anywhere, move vs insert+delete is still based on diff heuristic. I think the only reliable way to record moves in git is to commit just moves separately, and to fixup in a different commit on-top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 Ah, that explains it, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll split the commit up, but will hold off on pushing that until I implement other review comments, to avoid messing with any PR comments that you've already written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not much written yet, please push as it is indeed a bit annoying to review without finer-grained diff :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! 281360c
In preparation for moving client sessions data from the superblock trailer into the grid, generalize `FreeSetEncoded` into `CheckpointTrailer`. Note that this commit doesn't touch the `schema.FreeSetNode` block type; that will be part of the next PR.
6bac783
to
281360c
Compare
src/vsr/checkpoint_trailer.zig
Outdated
/// Checkpoint process is delicate: | ||
/// 1. Encode free set. | ||
/// 2. Derive the number of blocks required to store the encoding. | ||
/// 3. Allocate blocks for the encoding (in the old checkpoint). | ||
/// 4. Write the blocks to disk. | ||
/// -------------------------------------------------------------- | ||
/// 5. Mark currently released blocks as free. | ||
/// 6. Release the freshly acquired blocks in the new checkpoint. | ||
/// | ||
/// This function handles steps 1-4. The caller is responsible for calling | ||
/// FreeSet.checkpoint which handles 5 and 6. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description has moved to grid.zig
.
// the superblock, which doesn't have access to grid. It is set to null by reset, to verify | ||
// that the free set is not used before it is opened during sync. | ||
// the grid, which doesn't have access to a stable grid pointer. It is set to null by | ||
// `reset`, to verify that the free set is not used before it is opened during sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some residual free sets in the comment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this particular case the comment is correct -- the reason for lazy-initializing the grid pointer is specifically due to the free-set/grid relationship, not client sessions.
assert(set.grid.?.free_set.count_reservations() == 0); | ||
const reservation = set.grid.?.free_set.reserve(set.block_count()).?; | ||
defer set.grid.?.free_set.forfeit(reservation); | ||
assert(trailer.grid.?.free_set.count_reservations() == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this is necessary? I think it would be correct to checkpoint a trailer concurrently with something else? Though, it's fine to leave assert as is, just to solidify the current implementation.
However, let's add the same assert to:
- that place in
grid.zig
where we do "include_staging()" trick. There it is crucialt that there are no outstanding reservations - inside
include_staging
itself
src/vsr/grid.zig
Outdated
); | ||
} | ||
|
||
fn free_set_open_callback(free_set_checkpoint: *CheckpointTrailer) void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe open_free_set_callback
, so that is lexicographically nested with open
? I think we have both patterns, and I somewhat consciously chose open_xxx_callback
naming for free set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we should totally be consitent with checkpoint_free_set_callback
in the same file
src/vsr/grid.zig
Outdated
.encoded = free_set_checkpoint.buffer[0..free_set_checkpoint.size], | ||
.block_addresses = free_set_checkpoint.block_addresses[0..free_set_checkpoint.block_count()], | ||
}); | ||
assert((free_set_checkpoint.size > 0) == (grid.free_set.count_acquired() > 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should add free_set.pristine()
? I worry a little that count_acquired
is O(N), while its enough to just find a single set bit (which probably going to be the first one even)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe .empty()
, since we have open_empty()
?
(And to check that I'm on the same page: pristine
/empty
would check count_acquired() == 0
, but is implemented with an early-exit when the first acquired address is found.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a bit hesitant about empty, as I think we might want to check all of blocks, staging, and reservations. But I am probably overthinking this. And yes, the idea is to do an early return. Though I guess if we are checking staging, we’d still be O(N)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I think I'd rather assert each of those separately rather than bundling them up into a single function, to make it obvious what is being checked.
That information is already exposed via free_set.count_*
.
pub fn checkpoint(grid: *Grid, callback: *const fn (*Grid) void) void { | ||
assert(grid.checkpointing == null); | ||
assert(grid.canceling == null); | ||
assert(grid.callback == .none); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about a pattern like
assert(grid.callback == .none);
defere assert(grid.callback == .checkpoint);
to underscore that this is a state transition function. But that's probably more noise than helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do like that pattern when there are multiple branches / return points, though!
src/vsr/grid.zig
Outdated
assert(grid.canceling == null); | ||
assert(grid.checkpointing == null); | ||
// grid.open() is cancellable the same way that read_block()/write_block() are. | ||
assert(grid.callback == .none or grid.callback == .open); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe switch for exhaustiveness here? feels useful to decect missing new cancelation cases at compiletime.
src/vsr/superblock.zig
Outdated
checksum: u128, | ||
last_block_address: u64, | ||
last_block_checksum: u128, | ||
free_set_size: u32, | ||
storage_size: u64, | ||
trailer_size: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For generic trailer, let's future-proof this to u64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should free_set_size
/client_sessions_size
in the CheckpointState
be u64
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d say yes!
.open => {}, | ||
.checkpoint => unreachable, | ||
.cancel => unreachable, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason to avoid
.none, .open => {}
.checkpoint, .cancel => unreachable,
grouping here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I think we have one extra invariant here that we aren't actually checking perhaps? For states which are not cancelable (so, checkpoint and cancel), we must also ensure that we don't read any blocks, as that could deadlock.
In particular, we shouldn't be reading blocks during checkpoint. So, assert(grid.callback != .cancel);
in read_block
should also check callback != chcekpoint
, and, really, it should be the same as current assert.
so maybe add gird.callback.can_cancel()
helper for assertions here and on the read path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In grid.read_block
we already have:
switch (callback) {
.from_local_storage => {
maybe(grid.callback == .checkpoint);
...
},
.from_local_or_global_storage => {
assert(grid.callback != .checkpoint);
...
},
}
Which I think is correct -- reads to satisfy grid repair requests are safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason to avoid
Just personal preference -- I like all of the variants aligned, so if I was grouping them I would do
.none,
.open,
=> {},
.checkpoint,
.cancel,
=> unreachable,
But in this case that is longer than just repeating the rhs.
(Follow-up for #1351). No more trailers in the superblock! Client sessions is encoded with `CheckpointTrailer`, like the free set. Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`. This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers. This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
(Follow-up for #1351). No more trailers in the superblock! Client sessions is encoded with `CheckpointTrailer`, like the free set. Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`. This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers. This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
(Follow-up for #1351). No more trailers in the superblock! Client sessions is encoded with `CheckpointTrailer`, like the free set. Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`. This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers. This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
In preparation for moving client sessions data from the superblock trailer into the grid, generalize
FreeSetEncoded
intoCheckpointTrailer
.Note that this commit doesn't touch the
schema.FreeSetNode
block type; that will be part of the next PR.