fix: remove excess vec capacity in caches#5935
Conversation
WalkthroughAdds shrink_to_fit calls before caching or storing vectors in multiple components and introduces verify_block_headers in Tipset::new to validate header invariants before constructing a Tipset. Minor method parameter mutability updates to accommodate shrinking. One cache insert method’s signature now takes a mutable Vec. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Tipset
participant Verifier as verify_block_headers
Caller->>Tipset: new(headers_in)
Tipset->>Tipset: build NonEmpty<CachingBlockHeader>
Tipset->>Tipset: headers.shrink_to_fit()
Tipset->>Verifier: verify_block_headers(&headers)
Verifier-->>Tipset: Ok() or Error
Tipset-->>Caller: Ok(Tipset { headers, key })
sequenceDiagram
participant Code as Caller
participant Cache as MsgsInTipsetCache
participant Store as Internal Cache
Code->>Cache: insert(key, value: Vec<ChainMessage>)
Cache->>Cache: value.shrink_to_fit()
Cache->>Store: push(key, value)
Store-->>Cache: ack
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/blocks/tipset.rs (2)
244-253: Verify before shrinking to avoid wasted work on invalid inputCall
verify_block_headersbeforeshrink_to_fit()so we fail fast and avoid an unnecessary reallocation if the tipset is invalid.- headers.shrink_to_fit(); - verify_block_headers(&headers)?; + verify_block_headers(&headers)?; + headers.shrink_to_fit();
551-573: Avoid allocation in verify_block_headers by iterating without collecting
verify_block_headerscurrently collects into aNonEmpty<&CachingBlockHeader>(allocates) just to perform simple “all equal to first” checks. You can iterate the provided iterator directly, capture the first element, and check invariants without intermediate allocation. This keeps the function zero-alloc and still enforces non-emptiness via a single peek.For example (illustrative, not a diff):
fn verify_block_headers<'a>( headers: impl IntoIterator<Item = &'a CachingBlockHeader>, ) -> Result<(), CreateTipsetError> { let mut iter = headers.into_iter(); let first = match iter.next() { Some(h) => h, None => return Err(CreateTipsetError::Empty), }; if !iter.clone().all(|it| it.parents == first.parents) { return Err(CreateTipsetError::BadParents); } if !iter.clone().all(|it| it.state_root == first.state_root) { return Err(CreateTipsetError::BadStateRoot); } if !iter.clone().all(|it| it.epoch == first.epoch) { return Err(CreateTipsetError::BadEpoch); } // miner uniqueness if iter.map(|it| it.miner_address).chain(std::iter::once(first.miner_address)).all_unique() == false { return Err(CreateTipsetError::DuplicateMiner); } Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (3)
src/blocks/tipset.rs(1 hunks)src/chain/store/chain_store.rs(1 hunks)src/state_manager/cache.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code Graph Analysis (2)
src/chain/store/chain_store.rs (2)
src/state_manager/cache.rs (1)
insert(154-159)src/message/chain_message.rs (1)
value(65-70)
src/state_manager/cache.rs (2)
src/blocks/tipset.rs (2)
key(335-338)key(529-532)src/shim/executor.rs (1)
key(305-310)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: cargo-publish-dry-run
- GitHub Check: All lint checks
- GitHub Check: Build Ubuntu
- GitHub Check: Build MacOS
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (rust)
🔇 Additional comments (1)
src/chain/store/chain_store.rs (1)
586-589: LGTM: shrink cached Vec to reduce retained capacityShrinking before pushing into the LRU is consistent with the PR goal and helps the size-tracking be more accurate.
| fn insert_receipt(&self, key: &TipsetKey, mut receipts: Vec<Receipt>) { | ||
| if !receipts.is_empty() { | ||
| receipts.shrink_to_fit(); | ||
| self.receipt_cache.insert(key.clone(), receipts); | ||
| } | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Don’t skip caching empty receipts; it causes repeated recomputation and changes semantics
Gating insertion on non-empty vectors changes behavior: callers that legitimately produce empty receipts will no longer be cached and may recompute on subsequent lookups. Empty results are still results and should be cached (negative caching).
Apply this change to always cache, while still shrinking:
- fn insert_receipt(&self, key: &TipsetKey, mut receipts: Vec<Receipt>) {
- if !receipts.is_empty() {
- receipts.shrink_to_fit();
- self.receipt_cache.insert(key.clone(), receipts);
- }
- }
+ fn insert_receipt(&self, key: &TipsetKey, mut receipts: Vec<Receipt>) {
+ receipts.shrink_to_fit();
+ self.receipt_cache.insert(key.clone(), receipts);
+ }To check current usage patterns and potential reliance on empty caching, run:
🏁 Script executed:
#!/bin/bash
rg -n "insert_receipt\(|get_receipt_or_else\(" -A 3 -B 3Length of output: 4422
Always cache empty receipts for proper negative caching
Currently both the handler and its caller short-circuit empty vectors, so empty receipt results are never stored—causing repeated recomputation. We need to:
• In src/state_manager/cache.rs, remove the if !receipts.is_empty() guard in insert_receipt so it always calls insert(...) after shrink_to_fit().
• In src/state_manager/mod.rs, remove the surrounding if !receipts.is_empty() check before calling insert_receipt so empty vectors get cached.
Proposed diffs:
--- a/src/state_manager/cache.rs
+++ b/src/state_manager/cache.rs
@@ -206,7 +206,6 @@ impl TipsetReceiptEventCacheHandler for EnabledTipsetDataCache {
- fn insert_receipt(&self, key: &TipsetKey, mut receipts: Vec<Receipt>) {
- if !receipts.is_empty() {
- receipts.shrink_to_fit();
- self.receipt_cache.insert(key.clone(), receipts);
- }
- }
+ fn insert_receipt(&self, key: &TipsetKey, mut receipts: Vec<Receipt>) {
+ receipts.shrink_to_fit();
+ self.receipt_cache.insert(key.clone(), receipts);
+ }--- a/src/state_manager/mod.rs
+++ b/src/state_manager/mod.rs
@@ -462,5 +462,4 @@ impl StateManager {
- && !receipts.is_empty()
- {
- self.receipt_event_cache_handler
- .insert_receipt(key, receipts);
- }
+ // always cache receipts, even if empty
+ {
+ self.receipt_event_cache_handler.insert_receipt(key, receipts);
+ }This ensures that even an empty Vec<Receipt> is cached once and won’t trigger redundant recomputations in get_receipt_or_else.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/state_manager/cache.rs around lines 206-211, remove the `if
!receipts.is_empty()` guard so that after `receipts.shrink_to_fit()` the
function always calls `self.receipt_cache.insert(key.clone(), receipts);`; also
update src/state_manager/mod.rs at the call site (remove the surrounding `if
!receipts.is_empty()` check) so `insert_receipt` is invoked even for empty
Vec<Receipt> values, ensuring empty receipts are cached for negative caching.
There was a problem hiding this comment.
I believe empty vec indicates value computation is skipped, thus should not be cached. cc @akaladarshi to confirm
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
| fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) { | ||
| if !events_data.events.is_empty() { | ||
| events_data.events.shrink_to_fit(); | ||
| events_data.roots.shrink_to_fit(); | ||
| self.events_cache.insert(key.clone(), events_data); | ||
| } | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Same here: cache empty event sets to avoid unnecessary re-computation
As with receipts, events can legitimately be empty. Skipping insertion changes cache semantics and may increase CPU churn.
- fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) {
- if !events_data.events.is_empty() {
- events_data.events.shrink_to_fit();
- events_data.roots.shrink_to_fit();
- self.events_cache.insert(key.clone(), events_data);
- }
- }
+ fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) {
+ events_data.events.shrink_to_fit();
+ events_data.roots.shrink_to_fit();
+ self.events_cache.insert(key.clone(), events_data);
+ }You can also scan for call sites:
🏁 Script executed:
#!/bin/bash
rg -n "insert_events\(|get_events_or_else\(" -A 3 -B 3Length of output: 3584
🏁 Script executed:
#!/bin/bash
rg -n "events_cache" -A5 -B5 src/state_manager/cache.rsLength of output: 1649
🏁 Script executed:
#!/bin/bash
rg -n "fn insert_receipt" -A5 -B5 src/state_manager/cache.rsLength of output: 1390
🏁 Script executed:
#!/bin/bash
rg -n "struct TipsetStateCache" -A5 -B5 .Length of output: 1434
Always cache empty StateEvents to avoid redundant recomputation
Skipping insertion when events_data.events is empty means callers of get_events_or_else will recompute identical (empty) results on every miss. Align with insert_receipt’s behavior by always caching—even empty—event sets.
Points of attention:
- File:
src/state_manager/cache.rs - Method:
EnabledTipsetDataCache::insert_events(lines 213–219)
Suggested diff:
- fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) {
- if !events_data.events.is_empty() {
- events_data.events.shrink_to_fit();
- events_data.roots.shrink_to_fit();
- self.events_cache.insert(key.clone(), events_data);
- }
- }
+ fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) {
+ events_data.events.shrink_to_fit();
+ events_data.roots.shrink_to_fit();
+ self.events_cache.insert(key.clone(), events_data);
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) { | |
| if !events_data.events.is_empty() { | |
| events_data.events.shrink_to_fit(); | |
| events_data.roots.shrink_to_fit(); | |
| self.events_cache.insert(key.clone(), events_data); | |
| } | |
| } | |
| fn insert_events(&self, key: &TipsetKey, mut events_data: StateEvents) { | |
| events_data.events.shrink_to_fit(); | |
| events_data.roots.shrink_to_fit(); | |
| self.events_cache.insert(key.clone(), events_data); | |
| } |
🤖 Prompt for AI Agents
In src/state_manager/cache.rs around lines 213 to 219, change
EnabledTipsetDataCache::insert_events to always cache the provided StateEvents
(including when events is empty) instead of skipping insertion; remove the if
!events_data.events.is_empty() guard, keep the shrink_to_fit() calls on events
and roots, then always call self.events_cache.insert(key.clone(), events_data)
so empty results are cached like insert_receipt does.
There was a problem hiding this comment.
I believe empty vec indicates value computation is skipped, thus should not be cached. cc @aatifsyed to confirm
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
I don't think I have enough context to confirm or deny! Is this a Rust question, a protocol question, or a library question? :))
There was a problem hiding this comment.
Hey @aatifsyed thanks for your reply! I actually meant to tag @akaladarshi on this sorry about it and have a great day!
Summary of changes
This PR tries to reduce potential excess Vec capacity in caches for lower memory usage
Changes introduced in this pull request:
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit