-
Notifications
You must be signed in to change notification settings - Fork 25
Fix: remove heavy rocksdb deletes #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# Conflicts: # magicblock-ledger/src/ledger_truncator.rs
Manual Deploy AvailableYou can trigger a manual deploy of this PR branch to testnet: Alternative: Comment
Comment updated automatically when the PR is synchronized. |
📝 WalkthroughWalkthroughThis pull request adds range-based deletion to the ledger: Possibly related PRs
Suggested reviewers
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (1)📓 Common learnings🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
magicblock-ledger/src/ledger_truncator.rs (3)
132-143: Potential underflow whennum_slots_to_truncateis zero.If
slot_size > excess, thennum_slots_to_truncatewill be 0, andlowest_slot + 0 - 1will underflow foru64, wrapping tou64::MAX. This would cause incorrect truncation behavior.Add a guard before the subtraction:
let num_slots_to_truncate = excess / slot_size; + if num_slots_to_truncate == 0 { + info!("Fat truncation: excess size too small for even one slot, skipping"); + return Ok(()); + } // Calculating up to which slot we're truncating let truncate_to_slot = lowest_slot + num_slots_to_truncate - 1;
127-130: Replace informal debug message with descriptive warning."Nani3?" provides no context for operators reviewing logs. Consider a descriptive message.
if lowest_slot == highest_slot { - warn!("Nani3?"); + warn!("Fat truncation skipped: only one slot in ledger (slot {})", lowest_slot); return Ok(()); }
316-319: Replace informal debug message with descriptive warning.if to_slot < from_slot { - warn!("LedgerTruncator: Nani2?"); + warn!("LedgerTruncator: skipping compaction, invalid range to_slot ({}) < from_slot ({})", to_slot, from_slot); return; }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.locktest-integration/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
magicblock-ledger/Cargo.toml(1 hunks)magicblock-ledger/src/database/ledger_column.rs(1 hunks)magicblock-ledger/src/database/rocks_db.rs(1 hunks)magicblock-ledger/src/ledger_truncator.rs(9 hunks)magicblock-ledger/src/store/api.rs(3 hunks)magicblock-metrics/src/metrics/mod.rs(3 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 596
File: magicblock-processor/src/scheduler.rs:1-1
Timestamp: 2025-10-28T13:15:42.706Z
Learning: In magicblock-processor, transaction indexes were always set to 0 even before the changes in PR #596. The proper transaction indexing within slots will be addressed during the planned ledger rewrite.
📚 Learning: 2025-11-24T14:21:00.996Z
Learnt from: Dodecahedr0x
Repo: magicblock-labs/magicblock-validator PR: 639
File: Cargo.toml:58-58
Timestamp: 2025-11-24T14:21:00.996Z
Learning: In the magicblock-validator codebase, magicblock-api/Cargo.toml intentionally uses borsh = "1.5.3" (instead of the workspace version 0.10.4) because it needs to deserialize types from the magic-domain-program external dependency, which requires borsh 1.5.x compatibility. This is an intentional exception for interoperability with the magic domain program.
Applied to files:
magicblock-ledger/Cargo.toml
📚 Learning: 2025-10-21T14:00:54.642Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 578
File: magicblock-aperture/src/requests/websocket/account_subscribe.rs:18-27
Timestamp: 2025-10-21T14:00:54.642Z
Learning: In magicblock-aperture account_subscribe handler (src/requests/websocket/account_subscribe.rs), the RpcAccountInfoConfig fields data_slice, commitment, and min_context_slot are currently ignored—only encoding is applied. This is tracked as technical debt in issue #579: https://github.com/magicblock-labs/magicblock-validator/issues/579
Applied to files:
magicblock-ledger/Cargo.toml
📚 Learning: 2025-10-28T13:15:42.706Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 596
File: magicblock-processor/src/scheduler.rs:1-1
Timestamp: 2025-10-28T13:15:42.706Z
Learning: In magicblock-processor, transaction indexes were always set to 0 even before the changes in PR #596. The proper transaction indexing within slots will be addressed during the planned ledger rewrite.
Applied to files:
magicblock-ledger/src/ledger_truncator.rsmagicblock-ledger/src/store/api.rs
📚 Learning: 2025-11-21T10:22:07.520Z
Learnt from: taco-paco
Repo: magicblock-labs/magicblock-validator PR: 661
File: magicblock-committor-service/src/intent_executor/single_stage_executor.rs:20-28
Timestamp: 2025-11-21T10:22:07.520Z
Learning: In magicblock-committor-service's SingleStageExecutor and TwoStageExecutor (single_stage_executor.rs and two_stage_executor.rs), the fields transaction_strategy, junk, and patched_errors are intentionally public because these executors are designed to be used independently outside of the IntentExecutor scope, and callers need access to these execution reports for cleanup and error handling.
Applied to files:
magicblock-ledger/src/ledger_truncator.rs
📚 Learning: 2025-11-07T13:20:13.793Z
Learnt from: bmuddha
Repo: magicblock-labs/magicblock-validator PR: 589
File: magicblock-processor/src/scheduler/coordinator.rs:227-238
Timestamp: 2025-11-07T13:20:13.793Z
Learning: In magicblock-processor's ExecutionCoordinator (scheduler/coordinator.rs), the `account_contention` HashMap intentionally does not call `shrink_to_fit()`. Maintaining slack capacity is beneficial for performance by avoiding frequent reallocations during high transaction throughput. As long as empty entries are removed from the map (which `clear_account_contention` does), the capacity overhead is acceptable.
Applied to files:
magicblock-ledger/src/ledger_truncator.rs
🧬 Code graph analysis (3)
magicblock-ledger/src/database/rocks_db.rs (3)
magicblock-ledger/src/store/api.rs (1)
delete_range_cf(1220-1230)magicblock-ledger/src/database/db.rs (1)
delete_range_cf(130-147)magicblock-ledger/src/database/write_batch.rs (1)
delete_range_cf(57-65)
magicblock-ledger/src/database/ledger_column.rs (1)
magicblock-ledger/src/database/columns.rs (8)
key(125-125)key(170-174)key(234-243)key(318-323)key(393-398)key(495-497)key(566-571)key(641-643)
magicblock-ledger/src/store/api.rs (4)
magicblock-ledger/src/ledger_truncator.rs (4)
ledger(173-173)ledger(181-181)ledger(189-189)ledger(200-200)magicblock-ledger/src/database/rocks_db.rs (1)
delete_range_cf(124-132)magicblock-ledger/src/database/db.rs (1)
delete_range_cf(130-147)magicblock-ledger/src/database/write_batch.rs (1)
delete_range_cf(57-65)
🔇 Additional comments (8)
magicblock-ledger/Cargo.toml (1)
23-23: LGTM!The workspace dependency addition is appropriate for integrating the new ledger truncator metrics.
magicblock-ledger/src/database/rocks_db.rs (1)
124-132: LGTM!Clean low-level wrapper for RocksDB's range delete. The method correctly delegates range semantics to the caller, consistent with the existing API pattern.
magicblock-ledger/src/database/ledger_column.rs (1)
257-264: LGTM!The
delete_rangemethod follows the established patterns inLedgerColumnfor key conversion and backend delegation.magicblock-metrics/src/metrics/mod.rs (1)
478-484: LGTM!The helper functions follow the established patterns for histogram timers and closure-based observation.
magicblock-ledger/src/ledger_truncator.rs (3)
162-227: LGTM - well-structured range deletion logic.The implementation correctly:
- Uses
to_slot + 1for RocksDB's exclusive end semantics- Decreases counters by the exact slot count for slot-indexed columns
- Resets counters to
DIRTY_COUNTfor columns where element count is unknown- Uses composite key boundaries correctly for
SlotSignaturesThe macro for resetting entry counters is a clean pattern.
329-344: LGTM!Clean RAII pattern combining the existing
Measurewith the newHistogramTimer. Both record on drop for consistent timing measurement.
275-304: LGTM!The refactored flow correctly sequences: set cleanup boundary → insert tombstones → flush → compact. This aligns with the PR objective of using range deletes with CompactionFilter cleanup.
magicblock-ledger/src/store/api.rs (1)
174-179: Initializelowest_cleanup_sloton open – behavior change to be aware ofCalling
initialize_lowest_cleanup_slot()duringdo_openmeans that, for a populated ledger,lowest_cleanup_slot(and thus theoldest_slotbeacon used by the compaction filter) is now derived from the current lowestBlockhashslot instead of always starting at 0. That seems aligned with the goal of resuming cleanup across restarts; just make sure there are no callers/tests that implicitly relied onget_lowest_cleanup_slot()returning 0 immediately afterLedger::openon a non-empty store.
bmuddha
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with few minor nits
| LEDGER_COLUMNS_COUNT_DURATION_SECONDS.observe_closure_duration(f) | ||
| } | ||
|
|
||
| pub fn start_ledger_truncator_compaction_timer() -> HistogramTimer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: cleaner approach is to use scoped timer guards, using closure based measuring makes the invocation site quite cumbersome to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still use scoped guard on call site
let _measure = CompactionMeasure {
measure: Measure::start("Manual compaction"),
_histogram_timer: start_ledger_truncator_compaction_timer(),
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just more code imo
| } | ||
| } | ||
|
|
||
| pub trait HasColumn<C> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Better name would be TruncateableColumn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree, this isn't only for columns on which we delete, this is implemented for every column we have - just return a corresponding &LedgerColumn.
In some cases we want to map passed generic C to its column, but without this trait there's no good way to do it. Maybe only matching C::Name to expected set of names, but this will result in function returning Option<LedgerColumn> which is ugly imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against using the trait, just pointing out that the name is confusing
Prior
Ledger::delete_slot_rangewas a very write/read intensive operation where we were traversing the whole db to delete all desired items in range.This likely was a reason for response time degradation and stalling of concurrent writes to DB from TX processing.
This PR proposes an alternative approach:
CompactionFilterto do maintanance for us, eg delete slots earlier thanoldest_slotSummary by CodeRabbit
New Features
Improvements
Chores
✏️ Tip: You can customize this high-level summary in your review settings.