-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: move ghost batch generation into log reader #16721
storage: move ghost batch generation into log reader #16721
Conversation
I have an upcoming change that will iterate through a randomized complete list (i.e. all numbers between A and B, in random order). This adds a method to generate such a list.
9411cc8
to
6da7eb8
Compare
Ghost batches are batches that don't contain data whose purpose is to ensure: - the caller (recovery_stm) sees a contiguous set of offsets - the terms of the returned batches match with the original terms provided at append time, regardless of whether compactions have happened These requirements are currently expected to be honored by the Raft layer at recovery time, where it currently generates batches in between batches read from storage. Since the storage layer is the current owner of term metadata and compactions, the fact that these ghost batches are generated by the Raft layer makes it trickier to evolve the underlying storage implementation, since the guarantees would need to be tested / validated at multiple layers. So, to support development and testing of future log implementations, this patch pulls out ghost batch generation into the storage layer as an option to the log_reader.
6da7eb8
to
9a8d255
Compare
/ci-repeat |
new failures in https://buildkite.com/redpanda/redpanda/builds/45411#018deb99-4dff-4dcf-8fb3-1e3c6ba0cf90:
|
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45411#018deb99-4dfc-497b-92ec-6b6b2caeb0a0 |
CI failure: #16764 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good, but there is something i don't understand yet.
the motivation here seems to be that storage contains all the term info/compaction info etc... so that makes it better suited to deal with ghost batches as storage implementation evolves. additionally, it seems like there is a simplification and potential performance benefit since raft doesn't need to do the read and then go back and fill in the gaps as a second pass.
however, i don't see why storage needs to do this from the point of view of making it easier to evolve the implementation because currently it seems like raft depends on information in batches that will be be the same both before and after storage evolution project.
maybe there is a specific change in the evolution project that fills in this knowledge gap (yep)?
You're right that this isn't needed for the goal of an MVCC-versioned log. The aspect of the storage format evolution this helps with is when we get to implementing compaction that can remove entire terms' worth of batches -- once we get to that, the owner of term metadata should be the one to create ghost batches |
Ghost batches are batches that don't contain data whose purpose is to
ensure:
provided at append time, regardless of whether compactions have
happened
These requirements are currently expected to be honored by the Raft
layer at recovery time, where it currently generates batches in between
batches read from storage. Since the storage layer is the current owner
of term metadata and compactions, the fact that these ghost batches are
generated by the Raft layer makes it trickier to evolve the underlying
storage implementation, since the guarantees would need to be tested /
validated at multiple layers.
So, to support development and testing of future log implementations,
this patch pulls out ghost batch generation into the storage layer as an
option to the log_reader.
Backports Required
Release Notes