Skip to content

Refactor tipset_by_height and load_child_tipset to distinguish error from miss#7005

Merged
sudo-shashank merged 6 commits intomainfrom
shashank/refactor-tipset-lookup
May 7, 2026
Merged

Refactor tipset_by_height and load_child_tipset to distinguish error from miss#7005
sudo-shashank merged 6 commits intomainfrom
shashank/refactor-tipset-lookup

Conversation

@sudo-shashank
Copy link
Copy Markdown
Contributor

@sudo-shashank sudo-shashank commented May 6, 2026

Summary of changes

Changes introduced in this pull request:

  • tipset_by_height and load_child_tipset now return Result<Option<Tipset>, Error> (Ok(None) = clean cache miss, Err = actual storage/IO failure)
  • Propagate error as a genuine storage/IO failure, enabling fast-fail behavior

Reference issue to close (if applicable)

Closes #6755

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Commands, RPCs, and tools now surface missing-tipset conditions and propagate lookup failures with clearer, contextual errors.
  • Improvements

    • Tipset/state loading and finality logic tightened to avoid silently swallowed failures; export, diff, snapshot, and event flows hardened.
  • API Changes

    • Tipset lookup behavior refined: optional "not found" results are distinguished from hard errors; some finality-related handlers now return fallible results.

@sudo-shashank sudo-shashank added the RPC requires calibnet RPC checks to run on CI label May 6, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • ✅ Review completed - (🔄 Check again to review again)

Walkthrough

Refactors tipset lookup APIs: tipset_by_height and load_child_tipset now return Result<Option<Tipset>, Error> (Ok(None) for a known miss, Err for storage/blockstore errors). Call sites updated to use load_required_tipset_by_height where lookups are required and to surface contextual errors.

Changes

Tipset lookup & call-site propagation

Layer / File(s) Summary
Core API Signatures / Behavior
src/chain/store/index.rs, src/chain/store/chain_store.rs
tipset_by_height and load_child_tipset signatures changed to Result<Option<Tipset>, Error>; docs updated to describe Ok(Some)/Ok(None)/Err semantics.
Core ChainStore logic
src/chain/store/chain_store.rs
load_child_tipset reimplemented to return Ok(Some(child)) when a matching child is found, Ok(None) when absent, and Err for blockstore/index failures.
Index resolution logic & tests
src/chain/store/index.rs
Ancestor-walk, genesis, exact epoch and null-tipset branches return Ok(Some(...)); unresolved ancestor walks return Ok(None); load_required_tipset_by_height maps Ok(None)Error::NotFound(...). Tests updated and a new test asserts broken ancestor chains return Ok(None).
Call-site conversions to required semantics
src/rpc/methods/*, src/dev/subcommands/*, src/tool/subcommands/*, src/interpreter/*, src/state_manager/*, src/message_pool/msgpool/provider.rs
Numerous call sites switched from tipset_by_height to load_required_tipset_by_height for required lookups; several callers added anyhow::Context/.with_context(...) or ?-based propagation to surface height/epoch-specific diagnostics.
State manager & execution wiring
src/state_manager/mod.rs, src/state_manager/chain_rand.rs
Rewind, state-loading, and executed-tipset flows now use load_child_tipset return Option and propagate errors (replacing prior .ok() suppression). Receipt/event recomputation branches updated to honor optional child tipset presence.
Finality & RPC adjustments
src/rpc/methods/chain.rs, src/rpc/methods/eth/tipset_resolver.rs, src/rpc/methods/eth.rs, src/rpc/methods/f3.rs
Finality helper and RPC endpoints updated to propagate errors (get_finality_status now returns anyhow::Result); many RPC height lookups now use required-by-height loader and attach contextual errors.
Tools: export/diff/snapshot/index subcommands
src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/{benchmark_cmd,index_cmd,snapshot_cmd}.rs, src/dev/subcommands/{export_state_tree_cmd,state_cmd}.rs
Subcommands updated to use required-by-height API or add error context; diff/export paths validate diff epochs and compute streaming diffs using resolved tipsets.

Sequence Diagram(s)

(Skipped — changes are API return-shape refactors and enriched error contexts; no new multi-component sequential control flow requiring visualization.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ChainSafe/forest#6811: Overlaps on finality-resolution and related RPC changes that intersect with this PR's chain finality adjustments.
  • ChainSafe/forest#6880: Modifies tipset_by_height in src/chain/store/index.rs and touches similar signature/semantics changes.
  • ChainSafe/forest#6897: Related changes to finality/tipset resolution in src/rpc/methods/chain.rs.

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: refactoring tipset_by_height and load_child_tipset to distinguish errors from cache misses.
Linked Issues check ✅ Passed The PR fully implements the objectives from issue #6755: both tipset_by_height and load_child_tipset now return Result<Option>, enabling distinction between cache misses (Ok(None)) and storage errors (Err(...)). All affected call sites have been updated accordingly.
Out of Scope Changes check ✅ Passed All changes are directly related to refactoring tipset_by_height and load_child_tipset signatures and their call sites to support Result<Option> semantics, with no unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch shashank/refactor-tipset-lookup
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch shashank/refactor-tipset-lookup

Comment @coderabbitai help to get the list of available commands and usage tips.

@sudo-shashank sudo-shashank changed the title refactor tipset_by_height and load_child_tipset to distinguish error from miss Refactor tipset_by_height and load_child_tipset to distinguish error from miss May 6, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
src/state_manager/mod.rs (1)

472-478: ⚡ Quick win

Avoid the second load_child_tipset on the None path.

When load_child_tipset(ts) returns None, load_tipset_state falls through to load_executed_tipset(ts), which immediately calls load_child_tipset(ts) again inside the cache fill. That adds an extra chain-store lookup on head-tipset cache misses in a hot path. Thread the already-resolved child tipset into the executed-tipset load instead of re-querying it.

Also applies to: 497-503

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/state_manager/mod.rs` around lines 472 - 478, The match in
load_tipset_state currently calls self.chain_store().load_child_tipset(ts) and,
on None, falls through to load_executed_tipset(ts) which re-queries
load_child_tipset inside its cache fill; to avoid the duplicate lookup, change
load_executed_tipset to accept an Option<ChildTipset> (or add a helper like
load_executed_tipset_with_child) and pass the already-resolved child tipset when
available so the cache-fill path uses the provided child instead of calling
load_child_tipset again; apply the same pattern to the other occurrence
referenced (around lines 497-503) so both sites thread the resolved child tipset
into the executed-tipset loader.
src/tool/subcommands/archive_cmd.rs (1)

844-854: ⚡ Quick win

Mirror the two-stage error context used in do_export.

Here the new lookup only labels the miss case. If tipset_by_height returns a real storage/index error, the CLI loses the requested epoch from the error path.

Proposed pattern
     let tipset = chain_index
-        .tipset_by_height(epoch, heaviest_tipset.clone(), ResolveNullTipset::TakeOlder)?
+        .tipset_by_height(epoch, heaviest_tipset.clone(), ResolveNullTipset::TakeOlder)
+        .with_context(|| format!("failed to resolve tipset at epoch {epoch}"))?
         .context("tipset not found at requested epoch")?;

     let child_tipset = chain_index
         .tipset_by_height(
             epoch + 1,
             heaviest_tipset.clone(),
             ResolveNullTipset::TakeNewer,
-        )?
+        )
+        .with_context(|| format!("failed to resolve child tipset after epoch {epoch}"))?
         .context("child tipset not found at epoch+1")?;

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/tool/subcommands/archive_cmd.rs` around lines 844 - 854, The two tipset
lookups using chain_index.tipset_by_height (for variables tipset and
child_tipset with ResolveNullTipset::TakeOlder/TakeNewer and heaviest_tipset)
only add a final `.context("... not found")` which hides the requested epoch on
storage/index errors; change both calls to use the two-stage pattern used in
do_export: first attempt the call, map or propagate the underlying error, and
then apply `.context(format!("tipset not found at requested epoch {}", epoch))`
(and similarly for epoch+1) so that real storage/index errors retain their
original messages while the missing-tipset case still includes the epoch
context.
src/rpc/methods/chain.rs (1)

397-400: ⚡ Quick win

Preserve call-site context for the Err(_) path too.

These updated lookups mostly convert None into a better error, but they still let storage/index failures escape without height-specific context. The finality helper fallback lookups also propagate raw errors today.

Proposed pattern
         let tss = ctx
             .chain_index()
-            .tipset_by_height(height, ts, ResolveNullTipset::TakeOlder)?
+            .tipset_by_height(height, ts, ResolveNullTipset::TakeOlder)
+            .with_context(|| format!("failed to resolve tipset at height {height}"))?
             .with_context(|| format!("tipset not found at height {height}"))?;

Use the same two-stage pattern in the export call sites and in the finality fallback lookups so both Err(_) and Ok(None) are diagnosable.

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

Also applies to: 596-599, 898-900, 929-930, 1071-1074, 1099-1102, 1237-1254

src/rpc/methods/eth.rs (1)

957-960: ⚡ Quick win

Add context before unwrapping the lookup Result.

These call sites currently only annotate the Ok(None) case. If tipset_by_height fails due to blockstore/index IO, the ? returns early and the queried height never appears in the error.

Proposed pattern
     chain
         .chain_index()
-        .tipset_by_height(height, head, resolve)?
+        .tipset_by_height(height, head, resolve)
+        .with_context(|| format!("failed to resolve tipset at height {height}"))?
         .with_context(|| format!("tipset not found at height {height}"))

Apply the same two-stage pattern to the canonical-height lookup as well.

As per coding guidelines "Use anyhow::Result<T> for most operations and add context with .context() when errors occur".

Also applies to: 973-976

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/eth.rs` around lines 957 - 960, The call to
chain.chain_index().tipset_by_height(...) currently applies ? before adding
context, so IO/index errors lose the height context; change to the two-stage
pattern: first call chain.chain_index().tipset_by_height(height, head,
resolve).with_context(|| format!("failed to lookup tipset at height {height}"))?
to attach context to any Err, then separately handle the Ok(None) case (e.g.,
.ok_or_else(|| anyhow::anyhow!("tipset not found at height {height}"))?) so that
both IO errors and "not found" return messages include the height; apply the
same change for the other occurrence around tipset_by_height at the later call
site.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/chain/store/index.rs`:
- Around line 121-126: The chain walk currently uses
from.chain(&self.db).tuple_windows() which stops silently on missing ancestor
tipsets and causes missing-parent errors to be treated as Ok(None); replace that
iterator usage with an explicit ancestor-load loop that calls the tipset
parent-loading API (e.g. invoke the Tipset/Store method that loads parents from
self.db for each step) and propagate any storage/load errors as Err(...)
immediately, only returning Ok(None) when you have explicitly verified the chain
contains no tipset at the requested epoch; update the same logic used in the
comparable block further down (the code region replacing tuple_windows) so both
places explicitly load parents and fail on missing data instead of downgrading
to Ok(None).

---

Nitpick comments:
In `@src/rpc/methods/eth.rs`:
- Around line 957-960: The call to chain.chain_index().tipset_by_height(...)
currently applies ? before adding context, so IO/index errors lose the height
context; change to the two-stage pattern: first call
chain.chain_index().tipset_by_height(height, head, resolve).with_context(||
format!("failed to lookup tipset at height {height}"))? to attach context to any
Err, then separately handle the Ok(None) case (e.g., .ok_or_else(||
anyhow::anyhow!("tipset not found at height {height}"))?) so that both IO errors
and "not found" return messages include the height; apply the same change for
the other occurrence around tipset_by_height at the later call site.

In `@src/state_manager/mod.rs`:
- Around line 472-478: The match in load_tipset_state currently calls
self.chain_store().load_child_tipset(ts) and, on None, falls through to
load_executed_tipset(ts) which re-queries load_child_tipset inside its cache
fill; to avoid the duplicate lookup, change load_executed_tipset to accept an
Option<ChildTipset> (or add a helper like load_executed_tipset_with_child) and
pass the already-resolved child tipset when available so the cache-fill path
uses the provided child instead of calling load_child_tipset again; apply the
same pattern to the other occurrence referenced (around lines 497-503) so both
sites thread the resolved child tipset into the executed-tipset loader.

In `@src/tool/subcommands/archive_cmd.rs`:
- Around line 844-854: The two tipset lookups using chain_index.tipset_by_height
(for variables tipset and child_tipset with
ResolveNullTipset::TakeOlder/TakeNewer and heaviest_tipset) only add a final
`.context("... not found")` which hides the requested epoch on storage/index
errors; change both calls to use the two-stage pattern used in do_export: first
attempt the call, map or propagate the underlying error, and then apply
`.context(format!("tipset not found at requested epoch {}", epoch))` (and
similarly for epoch+1) so that real storage/index errors retain their original
messages while the missing-tipset case still includes the epoch context.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1d34d777-c49e-410e-86e7-998b738ebfad

📥 Commits

Reviewing files that changed from the base of the PR and between 0042f5b and 487c1a1.

📒 Files selected for processing (19)
  • src/chain/store/chain_store.rs
  • src/chain/store/index.rs
  • src/dev/subcommands/export_state_tree_cmd.rs
  • src/dev/subcommands/state_cmd.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/message_pool/msgpool/provider.rs
  • src/rpc/methods/chain.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/eth/tipset_resolver.rs
  • src/rpc/methods/f3.rs
  • src/rpc/methods/state.rs
  • src/state_manager/chain_rand.rs
  • src/state_manager/mod.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/tool/subcommands/benchmark_cmd.rs
  • src/tool/subcommands/index_cmd.rs
  • src/tool/subcommands/snapshot_cmd.rs

Comment thread src/chain/store/index.rs Outdated
@sudo-shashank sudo-shashank marked this pull request as ready for review May 6, 2026 05:30
@sudo-shashank sudo-shashank requested a review from a team as a code owner May 6, 2026 05:30
@sudo-shashank sudo-shashank requested review from LesnyRumcajs and hanabi1224 and removed request for a team May 6, 2026 05:30
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 57.57576% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.21%. Comparing base (868b7c2) to head (8220ef9).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/rpc/methods/chain.rs 57.50% 13 Missing and 4 partials ⚠️
src/state_manager/mod.rs 26.66% 8 Missing and 3 partials ⚠️
src/dev/subcommands/state_cmd.rs 0.00% 6 Missing ⚠️
src/chain/store/index.rs 91.66% 0 Missing and 3 partials ⚠️
src/tool/subcommands/archive_cmd.rs 25.00% 3 Missing ⚠️
src/chain/store/chain_store.rs 75.00% 2 Missing ⚠️
src/rpc/methods/eth.rs 66.66% 1 Missing and 1 partial ⚠️
src/rpc/methods/eth/tipset_resolver.rs 0.00% 2 Missing ⚠️
src/rpc/methods/state.rs 0.00% 2 Missing ⚠️
src/dev/subcommands/export_state_tree_cmd.rs 0.00% 1 Missing ⚠️
... and 7 more
Additional details and impacted files
Files with missing lines Coverage Δ
src/message_pool/msgpool/provider.rs 65.67% <100.00%> (+1.49%) ⬆️
src/rpc/methods/eth/filter/mod.rs 88.96% <100.00%> (ø)
src/dev/subcommands/export_state_tree_cmd.rs 0.00% <0.00%> (ø)
src/interpreter/fvm3.rs 42.34% <0.00%> (ø)
src/interpreter/fvm4.rs 42.38% <0.00%> (ø)
src/rpc/methods/f3.rs 0.00% <0.00%> (ø)
src/state_manager/chain_rand.rs 71.54% <75.00%> (ø)
src/tool/subcommands/benchmark_cmd.rs 0.00% <0.00%> (ø)
src/tool/subcommands/index_cmd.rs 0.00% <0.00%> (ø)
src/tool/subcommands/snapshot_cmd.rs 0.00% <0.00%> (ø)
... and 9 more

... and 10 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 868b7c2...8220ef9. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sudo-shashank sudo-shashank marked this pull request as draft May 6, 2026 05:50
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/chain/store/index.rs (1)

183-187: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate checkpoint load errors instead of treating them like cache misses.

This fast path currently collapses Err(_) and Ok(None) into the same branch. If a cached checkpoint key hits a real blockstore/IO failure, tipset_by_height silently retries from the original from tipset instead of fast-failing.

Suggested fix
         let mut checkpoint_from_epoch = to;
         while checkpoint_from_epoch < from_epoch {
-            if let Some(checkpoint_from_key) =
-                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
-                && let Ok(Some(checkpoint_from)) = self.load_tipset(&checkpoint_from_key)
-            {
-                from = checkpoint_from;
-                break;
+            if let Some(checkpoint_from_key) =
+                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
+            {
+                match self.load_tipset(&checkpoint_from_key) {
+                    Ok(Some(checkpoint_from)) => {
+                        from = checkpoint_from;
+                        break;
+                    }
+                    Ok(None) => {}
+                    Err(err) => return Err(err),
+                }
             }
             checkpoint_from_epoch = next_checkpoint(checkpoint_from_epoch);
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chain/store/index.rs` around lines 183 - 187, The fast-path inside
tipset_by_height currently treats load_tipset errors the same as cache misses;
change the logic around checkpoint_from_epoch so that after retrieving
checkpoint_from_key via self.ts_height_cache.get_cloned(&checkpoint_from_epoch),
you explicitly match the result of self.load_tipset(&checkpoint_from_key): if
load_tipset returns Err(e) propagate/return that Err immediately (fast-fail), if
it returns Ok(Some(checkpoint_from)) continue the fast-path as intended, and if
it returns Ok(None) treat it as a cache miss and fall back to the existing slow
path. Update the block containing checkpoint_from_epoch,
ts_height_cache.get_cloned, and load_tipset to implement this explicit
matching/propagation.
src/rpc/methods/chain.rs (1)

1235-1249: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse EC finality lookup errors into None.

Both branches here still swallow tipset_by_height failures (let Ok(Some(...)) / .ok().flatten()). That makes blockstore corruption look like “no finalized tipset” and defeats the new miss-vs-error split for ChainGetTipSetFinalityStatus.

Suggested fix
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            )
-        {
-            Some(ts)
-        } else {
-            let ec_finality_epoch =
-                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
-            ctx.chain_index()
-                .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
-                .ok()
-                .flatten()
-        };
+        let finalized = if depth >= 0 {
+            let threshold_epoch = (head.epoch() - depth).max(0);
+            match ctx
+                .chain_index()
+                .tipset_by_height(
+                    threshold_epoch,
+                    head.shallow_clone(),
+                    ResolveNullTipset::TakeOlder,
+                )
+                .with_context(|| {
+                    format!("failed to resolve EC finality tipset at height {threshold_epoch}")
+                })?
+            {
+                Some(ts) => Some(ts),
+                None => {
+                    let ec_finality_epoch =
+                        (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+                    ctx.chain_index()
+                        .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
+                        .with_context(|| {
+                            format!(
+                                "failed to resolve fallback EC finality tipset at height {ec_finality_epoch}"
+                            )
+                        })?
+                }
+            }
+        } else {
+            let ec_finality_epoch =
+                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+            ctx.chain_index()
+                .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
+                .with_context(|| {
+                    format!(
+                        "failed to resolve fallback EC finality tipset at height {ec_finality_epoch}"
+                    )
+                })?
+        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1235 - 1249, The code currently
silences errors from ctx.chain_index().tipset_by_height in both branches (the
pattern using let Ok(Some(...)) and .ok().flatten()), turning real errors into a
None "no finalized tipset" result; update the logic in the block that computes
finalized so that you explicitly handle Result from
ctx.chain_index().tipset_by_height: match the Result to propagate Err (return or
map the error into the RPC error used by ChainGetTipSetFinalityStatus), treat
Ok(Some(ts)) as Some(ts), and treat Ok(None) as the intended None/miss case;
locate the computation assigned to finalized and replace the uses of let
Ok(Some(...)) and .ok().flatten() with explicit match/if let handling that
distinguishes Err vs Ok(None) so blockstore/lookup errors are not collapsed into
a missing-finalized-tipset outcome.
🧹 Nitpick comments (1)
src/rpc/methods/chain.rs (1)

397-400: ⚡ Quick win

Add context on the Result layer before unwrapping the Option.

These call sites only annotate the None case right now. If tipset_by_height returns a real storage error, the RPC loses the height-specific context that this PR is trying to preserve.

Example pattern
 let start_ts = ctx
     .chain_index()
-    .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)?
+    .tipset_by_height(epoch, head, ResolveNullTipset::TakeOlder)
+    .with_context(|| format!("couldn't get a tipset at height {epoch}"))?
     .with_context(|| format!("tipset not found at height {epoch}"))?;

As per coding guidelines, "**/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur`."

Also applies to: 596-600, 896-899, 927-930, 1071-1074, 1099-1102

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 397 - 400, The call to
chain_index().tipset_by_height(..., ResolveNullTipset::TakeOlder) currently
attaches the height context only when the Option is None; instead add context on
the Result layer before unwrapping so storage errors also get the height info:
call .context(|| format!("tipset not found at height {epoch}")) (or
.with_context on the Result) immediately after tipset_by_height(...) and before
any ? that converts the Result/Option, e.g. adjust the
chain_index().tipset_by_height(...) usage in the current snippet and the other
similar tipset_by_height call sites in this file so the error from storage is
annotated with the epoch message.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/chain/store/index.rs`:
- Around line 183-187: The fast-path inside tipset_by_height currently treats
load_tipset errors the same as cache misses; change the logic around
checkpoint_from_epoch so that after retrieving checkpoint_from_key via
self.ts_height_cache.get_cloned(&checkpoint_from_epoch), you explicitly match
the result of self.load_tipset(&checkpoint_from_key): if load_tipset returns
Err(e) propagate/return that Err immediately (fast-fail), if it returns
Ok(Some(checkpoint_from)) continue the fast-path as intended, and if it returns
Ok(None) treat it as a cache miss and fall back to the existing slow path.
Update the block containing checkpoint_from_epoch, ts_height_cache.get_cloned,
and load_tipset to implement this explicit matching/propagation.

In `@src/rpc/methods/chain.rs`:
- Around line 1235-1249: The code currently silences errors from
ctx.chain_index().tipset_by_height in both branches (the pattern using let
Ok(Some(...)) and .ok().flatten()), turning real errors into a None "no
finalized tipset" result; update the logic in the block that computes finalized
so that you explicitly handle Result from ctx.chain_index().tipset_by_height:
match the Result to propagate Err (return or map the error into the RPC error
used by ChainGetTipSetFinalityStatus), treat Ok(Some(ts)) as Some(ts), and treat
Ok(None) as the intended None/miss case; locate the computation assigned to
finalized and replace the uses of let Ok(Some(...)) and .ok().flatten() with
explicit match/if let handling that distinguishes Err vs Ok(None) so
blockstore/lookup errors are not collapsed into a missing-finalized-tipset
outcome.

---

Nitpick comments:
In `@src/rpc/methods/chain.rs`:
- Around line 397-400: The call to chain_index().tipset_by_height(...,
ResolveNullTipset::TakeOlder) currently attaches the height context only when
the Option is None; instead add context on the Result layer before unwrapping so
storage errors also get the height info: call .context(|| format!("tipset not
found at height {epoch}")) (or .with_context on the Result) immediately after
tipset_by_height(...) and before any ? that converts the Result/Option, e.g.
adjust the chain_index().tipset_by_height(...) usage in the current snippet and
the other similar tipset_by_height call sites in this file so the error from
storage is annotated with the epoch message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 683bb560-d9e6-4ae9-8e20-70282996de34

📥 Commits

Reviewing files that changed from the base of the PR and between 487c1a1 and f111bb7.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

@sudo-shashank sudo-shashank marked this pull request as ready for review May 6, 2026 08:43
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

1202-1216: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Finality status still suppresses lookup failures.

These branches discard tipset load errors with if let Ok(parent) and .ok().flatten(). On a corrupted or partially-pruned chain, ChainGetTipSetFinalityStatus, safe, and finalized can now return a fallback tipset instead of surfacing the storage failure, which is the same ambiguity this refactor is trying to remove elsewhere.

🛠️ Suggested direction
         while chain.len() < chain_len {
             chain.push(ts.len() as i64);
-            if let Ok(parent) = ctx.chain_index().load_required_tipset(ts.parents()) {
-                // insert 0 for null rounds
-                if let Ok(n_null_tipsets_to_pad) = usize::try_from(ts.epoch() - parent.epoch() - 1)
-                    && n_null_tipsets_to_pad > 0
-                {
-                    let target_len =
-                        (chain.len().saturating_add(n_null_tipsets_to_pad)).min(chain_len);
-                    chain.resize(target_len, 0);
-                }
-                ts = parent;
-            } else {
+            if ts.epoch() == 0 {
                 break;
             }
+            let parent = ctx
+                .chain_index()
+                .load_required_tipset(ts.parents())
+                .context("failed to load ancestor while computing EC finality")?;
+            if let Ok(n_null_tipsets_to_pad) = usize::try_from(ts.epoch() - parent.epoch() - 1)
+                && n_null_tipsets_to_pad > 0
+            {
+                let target_len =
+                    (chain.len().saturating_add(n_null_tipsets_to_pad)).min(chain_len);
+                chain.resize(target_len, 0);
+            }
+            ts = parent;
         }
…
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
+        let finalized = if depth >= 0 {
+            ctx.chain_index()
+                .tipset_by_height(
+                    (head.epoch() - depth).max(0),
+                    head.shallow_clone(),
+                    ResolveNullTipset::TakeOlder,
+                )
+                .with_context(|| format!("failed to resolve EC finalized tipset at depth {depth}"))?
         } else {
             let ec_finality_epoch =
                 (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
             ctx.chain_index()
                 .tipset_by_height(ec_finality_epoch, head, ResolveNullTipset::TakeOlder)
-                .ok()
-                .flatten()
+                .with_context(|| format!("failed to resolve EC finalized tipset at epoch {ec_finality_epoch}"))?
         };

As per coding guidelines: **/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur.

Also applies to: 1235-1248

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1202 - 1216, The loop currently
swallows storage errors by using "if let Ok(parent)" and .ok().flatten(),
causing ChainGetTipSetFinalityStatus (and helper code paths like safe and
finalized) to return fallback tipsets instead of propagating load failures;
change these to propagate errors with anyhow::Result and add context: replace
the "if let Ok(parent) = ctx.chain_index().load_required_tipset(ts.parents()) {
... } else { break; }" pattern with a fallible load that uses ? (or
map_err(...).context(...)?), e.g. call
ctx.chain_index().load_required_tipset(ts.parents()).context("loading parent
tipset for ChainGetTipSetFinalityStatus")? and handle the result accordingly so
storage failures bubble up; apply the same treatment to the similar block around
lines 1235-1248 (and any .ok().flatten() uses in this function) so errors are
surfaced instead of suppressed.
♻️ Duplicate comments (1)
src/chain/store/index.rs (1)

121-125: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

tipset_by_height still has no clean-miss path.

Line 212 still walks with from.chain(&self.db).tuple_windows(), and the only unresolved exit at Lines 230-232 is Err(...), so Result<Option<Tipset>, _> never actually produces Ok(None). That means callers in this PR still can't observe the new miss-vs-error split from #6755. Either keep the old non-optional signature or switch this to an explicit parent-load loop so missing ancestors stay Err and a proven miss can return Ok(None).

Also applies to: 162-233

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/rpc/methods/chain.rs`:
- Around line 1202-1216: The loop currently swallows storage errors by using "if
let Ok(parent)" and .ok().flatten(), causing ChainGetTipSetFinalityStatus (and
helper code paths like safe and finalized) to return fallback tipsets instead of
propagating load failures; change these to propagate errors with anyhow::Result
and add context: replace the "if let Ok(parent) =
ctx.chain_index().load_required_tipset(ts.parents()) { ... } else { break; }"
pattern with a fallible load that uses ? (or map_err(...).context(...)?), e.g.
call ctx.chain_index().load_required_tipset(ts.parents()).context("loading
parent tipset for ChainGetTipSetFinalityStatus")? and handle the result
accordingly so storage failures bubble up; apply the same treatment to the
similar block around lines 1235-1248 (and any .ok().flatten() uses in this
function) so errors are surfaced instead of suppressed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 14a65015-54f6-456a-a482-9641c7fe7033

📥 Commits

Reviewing files that changed from the base of the PR and between 487c1a1 and f111bb7.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch from f111bb7 to 39c0377 Compare May 6, 2026 09:11
@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch from 39c0377 to 878bb44 Compare May 6, 2026 09:12
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

1235-1250: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Storage/IO errors silently swallowed in first tipset_by_height call.

The let Ok(Some(ts)) = ... pattern treats Err(...) the same as Ok(None), causing genuine storage/IO errors to fall through to the else branch instead of propagating. This contradicts the PR objective of ensuring fast-fail behavior for real errors.

Consider separating the error case from the cache-miss case:

Proposed fix to propagate errors while preserving fallback for cache miss
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
+        let finalized = if depth >= 0 {
+            match ctx.chain_index().tipset_by_height(
                 (head.epoch() - depth).max(0),
                 head.shallow_clone(),
                 ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
+            )? {
+                Some(ts) => Some(ts),
+                None => {
+                    let ec_finality_epoch =
+                        (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+                    ctx.chain_index().tipset_by_height(
+                        ec_finality_epoch,
+                        head,
+                        ResolveNullTipset::TakeOlder,
+                    )?
+                }
+            }
         } else {
             let ec_finality_epoch =
                 (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
             ctx.chain_index().tipset_by_height(
                 ec_finality_epoch,
                 head,
                 ResolveNullTipset::TakeOlder,
             )?
         };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1235 - 1250, The current let
Ok(Some(ts)) = ctx.chain_index().tipset_by_height(...) pattern silently treats
Err as Ok(None) and lets storage/IO errors fall through; change this to
explicitly match the Result from ctx.chain_index().tipset_by_height (or use let
tmp = ...? if you want to propagate immediately) so that Err is propagated
(using ? or returning the error) while still distinguishing Ok(Some(ts)) (use
that for finalized) and Ok(None) (perform the fallback that computes
ec_finality_epoch and calls ctx.chain_index().tipset_by_height(...) with
ResolveNullTipset::TakeOlder); specifically update the logic around finalized,
head, depth, ctx.chain_index().tipset_by_height and ResolveNullTipset::TakeOlder
to handle Err vs Ok(None) separately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/rpc/methods/chain.rs`:
- Around line 1235-1250: The current let Ok(Some(ts)) =
ctx.chain_index().tipset_by_height(...) pattern silently treats Err as Ok(None)
and lets storage/IO errors fall through; change this to explicitly match the
Result from ctx.chain_index().tipset_by_height (or use let tmp = ...? if you
want to propagate immediately) so that Err is propagated (using ? or returning
the error) while still distinguishing Ok(Some(ts)) (use that for finalized) and
Ok(None) (perform the fallback that computes ec_finality_epoch and calls
ctx.chain_index().tipset_by_height(...) with ResolveNullTipset::TakeOlder);
specifically update the logic around finalized, head, depth,
ctx.chain_index().tipset_by_height and ResolveNullTipset::TakeOlder to handle
Err vs Ok(None) separately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 40b2a2e1-f45f-4091-9918-b92aeb1a0ca5

📥 Commits

Reviewing files that changed from the base of the PR and between f111bb7 and 878bb44.

📒 Files selected for processing (2)
  • src/chain/store/index.rs
  • src/rpc/methods/chain.rs

Comment thread src/chain/store/index.rs
Comment thread src/interpreter/fvm3.rs Outdated
@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch 2 times, most recently from 3f8d1dd to 1cf95d2 Compare May 7, 2026 00:55
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/chain/store/index.rs (1)

123-125: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't downgrade missing ancestors to a clean miss.

from.chain(&self.db).tuple_windows() can still stop the walk when a parent tipset can't be loaded, and the new Ok(None)/NotFound path plus test now treat that as an ordinary miss. That reintroduces the ambiguity this PR is trying to remove: broken ancestor links or blockstore gaps should fast-fail as Err(...), not be surfaced as "tipset not found".

tipset_by_height needs an explicit parent-loading loop that propagates ancestor load failures immediately, and this test should assert that error instead of None.

Also applies to: 212-230, 233-241, 370-386

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chain/store/index.rs` around lines 123 - 125, The current
tipset_by_height implementation uses from.chain(&self.db).tuple_windows() which
can stop when a parent tipset fails to load and then returns Ok(None)/NotFound;
instead change tipset_by_height (in src/chain/store/index.rs) to perform an
explicit parent-loading loop that attempts to load each parent tipset via the
BlockStore/db and immediately propagate load failures as Err(...) (rather than
treating them as a clean miss), ensuring any missing ancestor or blockstore gap
returns an error; update analogous logic in the other affected blocks (around
the ranges you noted, e.g., the other tipset-by-height-like handlers at
~212-230, 233-241, 370-386) to use the same parent-loading-and-propagate pattern
and update the related test to expect an error when a parent cannot be loaded
instead of Ok(None).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/chain/store/index.rs`:
- Around line 123-125: The current tipset_by_height implementation uses
from.chain(&self.db).tuple_windows() which can stop when a parent tipset fails
to load and then returns Ok(None)/NotFound; instead change tipset_by_height (in
src/chain/store/index.rs) to perform an explicit parent-loading loop that
attempts to load each parent tipset via the BlockStore/db and immediately
propagate load failures as Err(...) (rather than treating them as a clean miss),
ensuring any missing ancestor or blockstore gap returns an error; update
analogous logic in the other affected blocks (around the ranges you noted, e.g.,
the other tipset-by-height-like handlers at ~212-230, 233-241, 370-386) to use
the same parent-loading-and-propagate pattern and update the related test to
expect an error when a parent cannot be loaded instead of Ok(None).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 20b8280d-6706-4045-9833-bf22d55764ab

📥 Commits

Reviewing files that changed from the base of the PR and between 878bb44 and 1cf95d2.

📒 Files selected for processing (8)
  • src/chain/store/index.rs
  • src/dev/subcommands/export_state_tree_cmd.rs
  • src/dev/subcommands/state_cmd.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/rpc/methods/f3.rs
  • src/rpc/methods/state.rs
  • src/tool/subcommands/index_cmd.rs

@sudo-shashank sudo-shashank force-pushed the shashank/refactor-tipset-lookup branch from 1cf95d2 to 9e202ff Compare May 7, 2026 01:27
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

1241-1256: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't swallow tipset_by_height errors in EC finality resolution.

The if let Ok(Some(ts)) = ... check still collapses genuine lookup failures into the fallback path. That defeats this PR's Ok(None) vs Err(_) contract and can return a stale EC-finalized tipset when the preferred lookup actually hit a storage/IO error.

Proposed fix
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
+        let finalized = if depth >= 0 {
+            Some(ctx.chain_index().load_required_tipset_by_height(
+                (head.epoch() - depth).max(0),
+                head.shallow_clone(),
+                ResolveNullTipset::TakeOlder,
+            )?)
         } else {
             let ec_finality_epoch =
                 (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
             ctx.chain_index().tipset_by_height(
                 ec_finality_epoch,
                 head,
                 ResolveNullTipset::TakeOlder,
             )?
         };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1241 - 1256, The current `if let
Ok(Some(ts)) = ctx.chain_index().tipset_by_height(...)` hides genuine errors by
treating Err(_) the same as Ok(None), causing storage/IO failures to fall
through to the EC-finality fallback; change that logic to match on the first
`tipset_by_height` call (the one using `(head.epoch() - depth).max(0)` and
`ResolveNullTipset::TakeOlder`) and: return Err on Err(e) (propagate the error),
use Some(ts) on Ok(Some(ts)), and only use the EC-finality fallback when you get
Ok(None); refer to `ctx.chain_index().tipset_by_height`,
`ResolveNullTipset::TakeOlder`, `ec_finality_epoch`, and the `finalized`
assignment when making this change.
🧹 Nitpick comments (3)
src/dev/subcommands/state_cmd.rs (1)

95-100: ⚡ Quick win

Add context to load_child_tipset failures too.

The new with_context(...) only decorates the None case after ? unwraps the Result. A real storage/IO failure from load_child_tipset still bubbles up without the epoch-specific context that this command needs.

Proposed fix
-            let ts_next = chain_store.load_child_tipset(&ts)?.with_context(|| {
+            let ts_next = chain_store
+                .load_child_tipset(&ts)
+                .with_context(|| format!("failed to load child tipset for epoch {}", ts.epoch()))?
+                .with_context(|| {
                 format!(
                     "no child tipset for epoch {} (may be chain head)",
                     ts.epoch()
                 )
-            })?;
+            })?;

As per coding guidelines **/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/dev/subcommands/state_cmd.rs` around lines 95 - 100, The call to
chain_store.load_child_tipset currently only wraps the None case with
with_context after the Result is unwrapped, so actual IO/storage errors lack
epoch-specific context; change the expression to add context on the Result
itself (use .context(...) from anyhow) before the ? so any Err from
chain_store.load_child_tipset includes the formatted message referencing
ts.epoch(), i.e., call chain_store.load_child_tipset(&ts).context(format!(...))?
and assign to ts_next as before to ensure both Err and None cases carry the
epoch context.
src/state_manager/mod.rs (2)

259-263: ⚡ Quick win

Add rewind-specific context to the required height lookup.

This now surfaces genuine lookup failures, but the bare ? drops the actor-bundle rewind context that explains which target height failed. validate_range below already uses the stronger pattern.

Suggested change
-                let target_head = self.chain_index().load_required_tipset_by_height(
-                    (expected_height_info.epoch - 1).max(0),
-                    head,
-                    ResolveNullTipset::TakeOlder,
-                )?;
+                let rewind_epoch = (expected_height_info.epoch - 1).max(0);
+                let target_head = self
+                    .chain_index()
+                    .load_required_tipset_by_height(
+                        rewind_epoch,
+                        head,
+                        ResolveNullTipset::TakeOlder,
+                    )
+                    .with_context(|| {
+                        format!(
+                            "failed to resolve rewind target at height {rewind_epoch} from head height {current_epoch}"
+                        )
+                    })?;

As per coding guidelines, "Use anyhow::Result for most operations and add context with .context() when errors occur".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/state_manager/mod.rs` around lines 259 - 263, The call to
chain_index().load_required_tipset_by_height that assigns target_head can fail
but currently uses `?`, losing rewind-specific context; update the call in the
target_head assignment (the load_required_tipset_by_height invocation where
expected_height_info is used) to append .context(...) before the ? to include a
clear rewind-specific message (mentioning the target epoch/height from
expected_height_info and why the rewind failed), mirroring the stronger pattern
used by validate_range so errors surface with the desired actor-bundle rewind
context.

458-459: ⚡ Quick win

Wrap load_child_tipset failures with tipset context before propagating them.

These calls now correctly preserve Err(_), but the raw error will not say which tipset was being resolved or whether it came from load_tipset_state vs load_executed_tipset. Since this PR is making those failures user-visible, adding with_context(...) here will make fast-fail diagnostics much clearer.

Suggested change
-            match self.chain_store().load_child_tipset(ts)? {
+            match self
+                .chain_store()
+                .load_child_tipset(ts)
+                .with_context(|| {
+                    format!(
+                        "failed to load child tipset while loading tipset state for {}@{}",
+                        ts.key(),
+                        ts.epoch()
+                    )
+                })? {
                 Some(receipt_ts) => Ok(TipsetState {
                     state_root: *receipt_ts.parent_state(),
                     receipt_root: *receipt_ts.parent_message_receipts(),
                 }),
                 None => Ok(self.load_executed_tipset(ts).await?.into()),
@@
-                let receipt_ts = self.chain_store().load_child_tipset(ts)?;
+                let receipt_ts = self
+                    .chain_store()
+                    .load_child_tipset(ts)
+                    .with_context(|| {
+                        format!(
+                            "failed to load child tipset while loading executed tipset for {}@{}",
+                            ts.key(),
+                            ts.epoch()
+                        )
+                    })?;

As per coding guidelines, "Use anyhow::Result for most operations and add context with .context() when errors occur".

Also applies to: 485-486

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/state_manager/mod.rs` around lines 458 - 459, Wrap failures from
chain_store().load_child_tipset(ts) (and the similar call around
load_executed_tipset/load_tipset_state) with contextual information before
returning so the propagated anyhow::Error includes which tipset and which
resolution path failed; specifically, call .context(...) (or with_context(...))
on the Result returned by load_child_tipset(ts) and the related calls and
include the tipset identifier (ts) and the operation name (e.g., "resolving
child tipset for", "loading executed tipset", or "loading tipset state") in the
message to make diagnostics explicit (refer to load_child_tipset, TipsetState,
load_executed_tipset, load_tipset_state).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/tool/subcommands/archive_cmd.rs`:
- Around line 587-590: Reject diff values that are greater than or equal to the
target tipset epoch before attempting to resolve the diff tipset: add a check
just before the call to index.load_required_tipset_by_height(...) that compares
diff with ts.epoch() (or the target epoch variable) and return an error (using
.context("diff epoch must be smaller than target epoch")? or equivalent) if diff
>= ts.epoch(); this prevents relying on ResolveNullTipset::TakeOlder to silently
return the anchor tipset and ensures the validation message is surfaced.

---

Outside diff comments:
In `@src/rpc/methods/chain.rs`:
- Around line 1241-1256: The current `if let Ok(Some(ts)) =
ctx.chain_index().tipset_by_height(...)` hides genuine errors by treating Err(_)
the same as Ok(None), causing storage/IO failures to fall through to the
EC-finality fallback; change that logic to match on the first `tipset_by_height`
call (the one using `(head.epoch() - depth).max(0)` and
`ResolveNullTipset::TakeOlder`) and: return Err on Err(e) (propagate the error),
use Some(ts) on Ok(Some(ts)), and only use the EC-finality fallback when you get
Ok(None); refer to `ctx.chain_index().tipset_by_height`,
`ResolveNullTipset::TakeOlder`, `ec_finality_epoch`, and the `finalized`
assignment when making this change.

---

Nitpick comments:
In `@src/dev/subcommands/state_cmd.rs`:
- Around line 95-100: The call to chain_store.load_child_tipset currently only
wraps the None case with with_context after the Result is unwrapped, so actual
IO/storage errors lack epoch-specific context; change the expression to add
context on the Result itself (use .context(...) from anyhow) before the ? so any
Err from chain_store.load_child_tipset includes the formatted message
referencing ts.epoch(), i.e., call
chain_store.load_child_tipset(&ts).context(format!(...))? and assign to ts_next
as before to ensure both Err and None cases carry the epoch context.

In `@src/state_manager/mod.rs`:
- Around line 259-263: The call to chain_index().load_required_tipset_by_height
that assigns target_head can fail but currently uses `?`, losing rewind-specific
context; update the call in the target_head assignment (the
load_required_tipset_by_height invocation where expected_height_info is used) to
append .context(...) before the ? to include a clear rewind-specific message
(mentioning the target epoch/height from expected_height_info and why the rewind
failed), mirroring the stronger pattern used by validate_range so errors surface
with the desired actor-bundle rewind context.
- Around line 458-459: Wrap failures from chain_store().load_child_tipset(ts)
(and the similar call around load_executed_tipset/load_tipset_state) with
contextual information before returning so the propagated anyhow::Error includes
which tipset and which resolution path failed; specifically, call .context(...)
(or with_context(...)) on the Result returned by load_child_tipset(ts) and the
related calls and include the tipset identifier (ts) and the operation name
(e.g., "resolving child tipset for", "loading executed tipset", or "loading
tipset state") in the message to make diagnostics explicit (refer to
load_child_tipset, TipsetState, load_executed_tipset, load_tipset_state).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 397d1550-fa58-492c-bcef-355e148f5533

📥 Commits

Reviewing files that changed from the base of the PR and between 1cf95d2 and 9e202ff.

📒 Files selected for processing (18)
  • src/chain/store/index.rs
  • src/dev/subcommands/export_state_tree_cmd.rs
  • src/dev/subcommands/state_cmd.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/message_pool/msgpool/provider.rs
  • src/rpc/methods/chain.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/eth/tipset_resolver.rs
  • src/rpc/methods/f3.rs
  • src/rpc/methods/state.rs
  • src/state_manager/chain_rand.rs
  • src/state_manager/mod.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/tool/subcommands/benchmark_cmd.rs
  • src/tool/subcommands/index_cmd.rs
  • src/tool/subcommands/snapshot_cmd.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/rpc/methods/f3.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/tipset_resolver.rs
  • src/chain/store/index.rs

Comment thread src/tool/subcommands/archive_cmd.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/chain/store/index.rs (1)

184-186: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't ignore checkpoint tipset load failures.

A hit in ts_height_cache should not silently fall back to a slower scan when load_tipset returns Err or Ok(None). That turns blockstore corruption/IO failures into a fake cache miss and can later surface as a misleading None/NotFound from this lookup path. Use a required load here and propagate the failure instead of skipping it.

Suggested fix
-            if let Some(checkpoint_from_key) =
-                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
-                && let Ok(Some(checkpoint_from)) = self.load_tipset(&checkpoint_from_key)
-            {
-                from = checkpoint_from;
-                break;
-            }
+            if let Some(checkpoint_from_key) =
+                self.ts_height_cache.get_cloned(&checkpoint_from_epoch)
+            {
+                from = self.load_required_tipset(&checkpoint_from_key)?;
+                break;
+            }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chain/store/index.rs` around lines 184 - 186, The current conditional
treats an Err/Ok(None) from load_tipset as a cache miss; instead make loading
the checkpoint required and propagate failures: after finding
checkpoint_from_key in ts_height_cache, call
self.load_tipset(&checkpoint_from_key) and if it returns Err or Ok(None)
return/propagate that error (don’t fall through to the slow scan), otherwise use
the loaded checkpoint_from; update the logic around checkpoint_from_epoch,
checkpoint_from_key and load_tipset to ensure load failures are surfaced rather
than ignored.
src/rpc/methods/chain.rs (1)

1242-1255: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not suppress lookup errors in EC finalized tipset resolution.

At Line 1242, let Ok(Some(ts)) = ... turns Err(...) into a silent fallback path. That reintroduces miss/error conflation and can hide real blockstore/IO failures instead of fast-failing.

Suggested fix
-        let finalized = if depth >= 0
-            && let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(
-                (head.epoch() - depth).max(0),
-                head.shallow_clone(),
-                ResolveNullTipset::TakeOlder,
-            ) {
-            Some(ts)
-        } else {
-            let ec_finality_epoch =
-                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
-            ctx.chain_index().tipset_by_height(
-                ec_finality_epoch,
-                head,
-                ResolveNullTipset::TakeOlder,
-            )?
-        };
+        let finalized = if depth >= 0 {
+            match ctx.chain_index().tipset_by_height(
+                (head.epoch() - depth).max(0),
+                head.shallow_clone(),
+                ResolveNullTipset::TakeOlder,
+            )? {
+                Some(ts) => Some(ts),
+                None => {
+                    let ec_finality_epoch =
+                        (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+                    ctx.chain_index().tipset_by_height(
+                        ec_finality_epoch,
+                        head,
+                        ResolveNullTipset::TakeOlder,
+                    )?
+                }
+            }
+        } else {
+            let ec_finality_epoch =
+                (head.epoch() - ctx.chain_config().policy.chain_finality).max(0);
+            ctx.chain_index().tipset_by_height(
+                ec_finality_epoch,
+                head,
+                ResolveNullTipset::TakeOlder,
+            )?
+        };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 1242 - 1255, The current code swallows
errors by using `let Ok(Some(ts)) = ctx.chain_index().tipset_by_height(...)`,
converting any `Err` into the fallback path; instead call
`ctx.chain_index().tipset_by_height(...)` and propagate errors with `?`, then
branch on the Option result: if `Some(ts)` use it, otherwise compute
`ec_finality_epoch` and call
`ctx.chain_index().tipset_by_height(ec_finality_epoch, head,
ResolveNullTipset::TakeOlder)?` to also propagate errors; reference
`tipset_by_height`, `ctx.chain_index()`, `ResolveNullTipset::TakeOlder`,
`ec_finality_epoch`, `head`, and `depth`.
♻️ Duplicate comments (1)
src/chain/store/index.rs (1)

212-230: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

tuple_windows() still downgrades missing parents into Ok(None).

from.chain(&self.db) stops when an ancestor can't be loaded, so this loop can end on broken ancestry and return Ok(None) instead of the underlying storage/load error. That reintroduces the ambiguity this PR is meant to remove, and it also makes load_required_tipset_by_height report NotFound for corruption/pruned-parent failures. This needs an explicit parent-loading walk that propagates load errors immediately; None should stay reserved for a proven logical miss.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/chain/store/index.rs` around lines 212 - 230, The loop using
from.chain(&self.db).tuple_windows() can silently stop on a missing/broken
parent and return Ok(None), masking storage/load errors; change the logic in
load_required_tipset_by_height to perform an explicit parent-loading walk
(iteratively load parent tipset via db loads) instead of relying on
tuple_windows so you can immediately propagate any Err from the DB/load
routines; ensure the code paths that inspect epochs (child.epoch(),
parent.epoch()) still update ts_height_cache when is_checkpoint/is_finalized,
handle ResolveNullTipset (TakeOlder/TakeNewer) the same way, and only return
Ok(None) for a proven logical miss, never when a storage load returned an error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/chain/store/index.rs`:
- Around line 184-186: The current conditional treats an Err/Ok(None) from
load_tipset as a cache miss; instead make loading the checkpoint required and
propagate failures: after finding checkpoint_from_key in ts_height_cache, call
self.load_tipset(&checkpoint_from_key) and if it returns Err or Ok(None)
return/propagate that error (don’t fall through to the slow scan), otherwise use
the loaded checkpoint_from; update the logic around checkpoint_from_epoch,
checkpoint_from_key and load_tipset to ensure load failures are surfaced rather
than ignored.

In `@src/rpc/methods/chain.rs`:
- Around line 1242-1255: The current code swallows errors by using `let
Ok(Some(ts)) = ctx.chain_index().tipset_by_height(...)`, converting any `Err`
into the fallback path; instead call `ctx.chain_index().tipset_by_height(...)`
and propagate errors with `?`, then branch on the Option result: if `Some(ts)`
use it, otherwise compute `ec_finality_epoch` and call
`ctx.chain_index().tipset_by_height(ec_finality_epoch, head,
ResolveNullTipset::TakeOlder)?` to also propagate errors; reference
`tipset_by_height`, `ctx.chain_index()`, `ResolveNullTipset::TakeOlder`,
`ec_finality_epoch`, `head`, and `depth`.

---

Duplicate comments:
In `@src/chain/store/index.rs`:
- Around line 212-230: The loop using from.chain(&self.db).tuple_windows() can
silently stop on a missing/broken parent and return Ok(None), masking
storage/load errors; change the logic in load_required_tipset_by_height to
perform an explicit parent-loading walk (iteratively load parent tipset via db
loads) instead of relying on tuple_windows so you can immediately propagate any
Err from the DB/load routines; ensure the code paths that inspect epochs
(child.epoch(), parent.epoch()) still update ts_height_cache when
is_checkpoint/is_finalized, handle ResolveNullTipset (TakeOlder/TakeNewer) the
same way, and only return Ok(None) for a proven logical miss, never when a
storage load returned an error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c8754371-c75c-4648-ab80-5346fd6bc616

📥 Commits

Reviewing files that changed from the base of the PR and between 1cf95d2 and 75f8ba3.

📒 Files selected for processing (19)
  • src/chain/store/chain_store.rs
  • src/chain/store/index.rs
  • src/dev/subcommands/export_state_tree_cmd.rs
  • src/dev/subcommands/state_cmd.rs
  • src/interpreter/fvm3.rs
  • src/interpreter/fvm4.rs
  • src/message_pool/msgpool/provider.rs
  • src/rpc/methods/chain.rs
  • src/rpc/methods/eth.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/rpc/methods/eth/tipset_resolver.rs
  • src/rpc/methods/f3.rs
  • src/rpc/methods/state.rs
  • src/state_manager/chain_rand.rs
  • src/state_manager/mod.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/tool/subcommands/benchmark_cmd.rs
  • src/tool/subcommands/index_cmd.rs
  • src/tool/subcommands/snapshot_cmd.rs
✅ Files skipped from review due to trivial changes (1)
  • src/dev/subcommands/export_state_tree_cmd.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/interpreter/fvm3.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/state_manager/mod.rs

@sudo-shashank sudo-shashank requested a review from hanabi1224 May 7, 2026 02:09
@sudo-shashank sudo-shashank added this pull request to the merge queue May 7, 2026
Merged via the queue into main with commit 4cea4c5 May 7, 2026
34 checks passed
@sudo-shashank sudo-shashank deleted the shashank/refactor-tipset-lookup branch May 7, 2026 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor: make tipset_by_height and load_child_tipset return anyhow::Result<Option<Tipset>> to distinguish error from miss

2 participants