Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VSR: Generalize trailer #1351

Merged
merged 4 commits into from
Dec 13, 2023
Merged

VSR: Generalize trailer #1351

merged 4 commits into from
Dec 13, 2023

Conversation

sentientwaffle
Copy link
Member

In preparation for moving client sessions data from the superblock trailer into the grid, generalize FreeSetEncoded into CheckpointTrailer.

Note that this commit doesn't touch the schema.FreeSetNode block type; that will be part of the next PR.

@sentientwaffle sentientwaffle marked this pull request as ready for review December 11, 2023 17:43
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a git mv from free_set_encoded.zig; not sure why it is showing it as a new file. 😞

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've recently learned, git mv is literally mv && git add, it doesn't actually record the fact that this was a move anywhere, move vs insert+delete is still based on diff heuristic. I think the only reliable way to record moves in git is to commit just moves separately, and to fixup in a different commit on-top.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 Ah, that explains it, thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll split the commit up, but will hold off on pushing that until I implement other review comments, to avoid messing with any PR comments that you've already written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not much written yet, please push as it is indeed a bit annoying to review without finer-grained diff :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! 281360c

In preparation for moving client sessions data from the superblock trailer into the grid, generalize `FreeSetEncoded` into `CheckpointTrailer`.

Note that this commit doesn't touch the `schema.FreeSetNode` block type; that will be part of the next PR.
Comment on lines 320 to 330
/// Checkpoint process is delicate:
/// 1. Encode free set.
/// 2. Derive the number of blocks required to store the encoding.
/// 3. Allocate blocks for the encoding (in the old checkpoint).
/// 4. Write the blocks to disk.
/// --------------------------------------------------------------
/// 5. Mark currently released blocks as free.
/// 6. Release the freshly acquired blocks in the new checkpoint.
///
/// This function handles steps 1-4. The caller is responsible for calling
/// FreeSet.checkpoint which handles 5 and 6.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description has moved to grid.zig.

// the superblock, which doesn't have access to grid. It is set to null by reset, to verify
// that the free set is not used before it is opened during sync.
// the grid, which doesn't have access to a stable grid pointer. It is set to null by
// `reset`, to verify that the free set is not used before it is opened during sync.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some residual free sets in the comment here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this particular case the comment is correct -- the reason for lazy-initializing the grid pointer is specifically due to the free-set/grid relationship, not client sessions.

src/vsr/checkpoint_trailer.zig Show resolved Hide resolved
assert(set.grid.?.free_set.count_reservations() == 0);
const reservation = set.grid.?.free_set.reserve(set.block_count()).?;
defer set.grid.?.free_set.forfeit(reservation);
assert(trailer.grid.?.free_set.count_reservations() == 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this is necessary? I think it would be correct to checkpoint a trailer concurrently with something else? Though, it's fine to leave assert as is, just to solidify the current implementation.

However, let's add the same assert to:

  • that place in grid.zig where we do "include_staging()" trick. There it is crucialt that there are no outstanding reservations
  • inside include_staging itself

src/vsr/grid.zig Outdated
);
}

fn free_set_open_callback(free_set_checkpoint: *CheckpointTrailer) void {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe open_free_set_callback, so that is lexicographically nested with open? I think we have both patterns, and I somewhat consciously chose open_xxx_callback naming for free set

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we should totally be consitent with checkpoint_free_set_callback in the same file

src/vsr/grid.zig Outdated
.encoded = free_set_checkpoint.buffer[0..free_set_checkpoint.size],
.block_addresses = free_set_checkpoint.block_addresses[0..free_set_checkpoint.block_count()],
});
assert((free_set_checkpoint.size > 0) == (grid.free_set.count_acquired() > 0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should add free_set.pristine()? I worry a little that count_acquired is O(N), while its enough to just find a single set bit (which probably going to be the first one even)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe .empty(), since we have open_empty()?
(And to check that I'm on the same page: pristine/empty would check count_acquired() == 0, but is implemented with an early-exit when the first acquired address is found.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit hesitant about empty, as I think we might want to check all of blocks, staging, and reservations. But I am probably overthinking this. And yes, the idea is to do an early return. Though I guess if we are checking staging, we’d still be O(N)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think I'd rather assert each of those separately rather than bundling them up into a single function, to make it obvious what is being checked.
That information is already exposed via free_set.count_*.

pub fn checkpoint(grid: *Grid, callback: *const fn (*Grid) void) void {
assert(grid.checkpointing == null);
assert(grid.canceling == null);
assert(grid.callback == .none);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about a pattern like

assert(grid.callback == .none);
defere assert(grid.callback == .checkpoint);

to underscore that this is a state transition function. But that's probably more noise than helpful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like that pattern when there are multiple branches / return points, though!

src/vsr/grid.zig Outdated
assert(grid.canceling == null);
assert(grid.checkpointing == null);
// grid.open() is cancellable the same way that read_block()/write_block() are.
assert(grid.callback == .none or grid.callback == .open);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe switch for exhaustiveness here? feels useful to decect missing new cancelation cases at compiletime.

checksum: u128,
last_block_address: u64,
last_block_checksum: u128,
free_set_size: u32,
storage_size: u64,
trailer_size: u32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For generic trailer, let's future-proof this to u64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should free_set_size/client_sessions_size in the CheckpointState be u64 too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d say yes!

.open => {},
.checkpoint => unreachable,
.cancel => unreachable,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to avoid

.none, .open => {}
.checkpoint, .cancel => unreachable,

grouping here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think we have one extra invariant here that we aren't actually checking perhaps? For states which are not cancelable (so, checkpoint and cancel), we must also ensure that we don't read any blocks, as that could deadlock.

In particular, we shouldn't be reading blocks during checkpoint. So, assert(grid.callback != .cancel); in read_block should also check callback != chcekpoint, and, really, it should be the same as current assert.

so maybe add gird.callback.can_cancel() helper for assertions here and on the read path?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In grid.read_block we already have:

            switch (callback) {
                .from_local_storage => {
                    maybe(grid.callback == .checkpoint);
                    ...
                },
                .from_local_or_global_storage => {
                    assert(grid.callback != .checkpoint);
                    ...
                },
            }

Which I think is correct -- reads to satisfy grid repair requests are safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to avoid

Just personal preference -- I like all of the variants aligned, so if I was grouping them I would do

.none,
.open,
=> {},
.checkpoint,
.cancel,
=> unreachable,

But in this case that is longer than just repeating the rhs.

@sentientwaffle sentientwaffle added this pull request to the merge queue Dec 13, 2023
Merged via the queue into main with commit 106b117 Dec 13, 2023
27 checks passed
@sentientwaffle sentientwaffle deleted the dj-vsr-generalize-trailer branch December 13, 2023 14:15
sentientwaffle added a commit that referenced this pull request Dec 13, 2023
(Follow-up for #1351).

No more trailers in the superblock!

Client sessions is encoded with `CheckpointTrailer`, like the free set.
Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`.

This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers.
This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
sentientwaffle added a commit that referenced this pull request Dec 13, 2023
(Follow-up for #1351).

No more trailers in the superblock!

Client sessions is encoded with `CheckpointTrailer`, like the free set.
Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`.

This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers.
This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
sentientwaffle added a commit that referenced this pull request Dec 13, 2023
(Follow-up for #1351).

No more trailers in the superblock!

Client sessions is encoded with `CheckpointTrailer`, like the free set.
Unlike the free set, client sessions encoding/decoding is handled directly by `replica.zig`, not `grid.zig`.

This simplifies the superblock checkpoint/sync control flow, which now just needs to care about the superblock headers.
This simplifies state sync, since the superblock sync is now just a single `sync_checkpoint` message.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants