Refactor snapshots to single BucketListSnapshot class#5069
Refactor snapshots to single BucketListSnapshot class#5069SirTyson merged 3 commits intostellar:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors the BucketList snapshot infrastructure to simplify the class hierarchy and improve thread safety. The key change is separating immutable snapshot data (BucketListSnapshotData) from thread-local state (SearchableBucketListSnapshot), eliminating multiple layers of abstraction that were no longer needed after previous refactoring work.
Key Changes:
- Introduced
BucketListSnapshotDatato hold immutable, thread-safe bucket references and ledger headers - Consolidated
BucketSnapshot,BucketListSnapshotBase, andSearchableBucketListclasses into a singleSearchableBucketListSnapshotclass hierarchy - Updated
BucketSnapshotManagerto work with shared pointers to immutable snapshot data instead of managing full snapshot copies
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/bucket/BucketListSnapshot.h | New unified header consolidating snapshot classes; defines BucketListSnapshotData and SearchableBucketListSnapshot |
| src/bucket/BucketListSnapshot.cpp | New implementation file with all snapshot functionality consolidated from deleted files |
| src/bucket/SearchableBucketList.{h,cpp} | Removed - functionality merged into BucketListSnapshot |
| src/bucket/BucketSnapshot.{h,cpp} | Removed - functionality merged into BucketListSnapshot |
| src/bucket/BucketListSnapshotBase.{h,cpp} | Removed - functionality merged into BucketListSnapshot |
| src/bucket/BucketSnapshotManager.{h,cpp} | Updated to work with BucketListSnapshotData shared pointers and simplified snapshot creation |
| src/bucket/BucketManager.cpp | Updated initialize() to pass bucket lists directly instead of pre-constructed snapshots |
| src/bucket/BucketUtils.h | Removed SnapshotPtrT type alias, kept using declarations |
| src/bucket/LiveBucket.h | Updated friend declaration from LiveBucketSnapshot to SearchableLiveBucketListSnapshot |
| src/bucket/HotArchiveBucket.h | Updated friend declaration from HotArchiveBucketSnapshot to SearchableHotArchiveBucketListSnapshot |
| src/bucket/BucketBase.h | Updated friend declaration to use new SearchableBucketListSnapshot template |
| src/ledger/LedgerManagerImpl.cpp | Updated copySearchableLiveBucketListSnapshot call to pass metrics parameter |
| src/bucket/test/BucketTestUtils.cpp | Simplified test helper to pass bucket lists directly instead of creating snapshots |
| src/simulation/ApplyLoad.cpp | Updated includes to use BucketListSnapshot.h |
| Multiple other files | Updated includes from SearchableBucketList.h to BucketListSnapshot.h and added BucketManager.h where needed |
| Builds/VisualStudio/* | Updated project files to reflect renamed source files |
b612e62 to
31320f7
Compare
dmkozh
left a comment
There was a problem hiding this comment.
I haven't looked too deeply into BucketListSnapshot beyond BucketListSnapshotData and diffs with the removed files. Please let me know if something else is worth of more thorough review.
BucketListSnapshot.cpp is basically just a copy-paste from all the Snapshot files that got deleted. I think things worth reviewing are BucketListSnapshotData, the layout off the new snapshot class in BucketListSnapshot.h, and the changes to BucketSnapshotManager.cpp, since it's fairly sensitive. The diff in the 2nd commit would also be good to review, as it's small and addresses a few race conditions that are actual bugs. |
31320f7 to
837e8d0
Compare
Yeah, I've already looked at most of these. Looks fine to me. |
aef930f to
b0253aa
Compare
|
This has been rebased and should be in a mergable state. This is the first part of the refactor discussed here. Our end goal is to centralize all immutable ledger state (LedgerData in the doc) to a single struct that is exclusively managed by the LedgerManager. Snapshots will then be lightweight objects that point to this data and maintain small amounts of thread local state (such as file stream and metrics). With the current SearchableBucketList design, this would be quite challenging. There are many layers of abstraction, and both mutable and immutable state are stored per-bucket via |
graydon
left a comment
There was a problem hiding this comment.
Handful of nits to address, but basically fine. You have to rebase also. Ping me when you're ready and I'll re-rubber-stamp.
Thanks for doing this! Looks much better. And looking forward to the followups.
b0253aa to
5132a0b
Compare
Description
Refactor to simplify the BucketListSnapshot interface.
This is the first of a few refactors I want to make around the BucketList snapshots. After reviewing the thread safety bug addressed in #5068, I realized the snapshot interface has gotten fairly difficult to reason about with lots of levels of abstraction. Initially, this complexity was necessary for snapshots to be able to "self update" themselves without access to
app. We no longer do this and have a much simplified interface where the caller just routinely callsBucketSnapshotManagerto make sure it's snapshot is up to date.This PR is a no-op refactor that flattens that snapshot class hierarchy. The key observation is that a snapshot really just maintains 2 pieces of data:
Bucketobjects (their file and index), ledgerSeq, and ledger header. For a given snapshot, these are immutable and thread safe, captured in the newBucketListSnapshotDataclass.In this refactor, the
BucketSnapshotManageris moving away from managing full snapshots and towards just being a record keeper forBucketListSnapshotData. This is much simpler and safer, sinceBucketListSnapshotDatacan just be astd::shared_ptr<const>that we hand off to whoever asks, and we don't have to worry about copy semantics or thread safety at the manager level.This distinction between the thread-safe
BucketListSnapshotDataand thread-local state is setting up for the next refactor, which will more strongly enforce that each thread maintains it's own BucketListSnapshot. We'll do this by no longer having the canonical BucketListSnapshot type be a pointer and override copy operators. There's also some opportunity for simplification around managing historical snapshots. I think this may also help with other refactors, like centralizing network config settings viaBucketListSnapshotData, though I haven't looked too much into it yet.Finally, I found a few places where we were racing on multiple threads accessing the same snapshot. The current manual copy is not very elegant and is still error prone, but the follow up PR to this will clean this up shortly.
Checklist
clang-formatv8.0.0 (viamake formator the Visual Studio extension)