Single-pass body analysis with AllocMap coherence checks by cds-amal · Pull Request #121 · runtimeverification/stable-mir-json

cds-amal · 2026-02-20T04:46:59Z

What's this about?

So, #120 fixed the immediate alloc-id mismatch by carrying collected items forward instead of re-fetching them. That was the right call. But it left the underlying structure intact: three separate phases (mk_item, collect_unevaluated_constant_items, collect_interned_values), each with full access to TyCtxt, each free to call inst.body() or any other side-effecting rustc query whenever it felt like it. Nothing in the types prevented that, and the bug was a direct consequence: one phase called inst.body() a second time, rustc minted fresh AllocIds, and suddenly the alloc map had ids that didn't correspond to anything in the stored bodies.

The question is: how do we make that class of bug structurally impossible, rather than just fixed for the one case we caught?

The full decision record is in ADR-002.

The restructuring

The fix is to split the pipeline into phases with type signatures that enforce the boundary:

collect_and_analyze_items(HashMap<String, Item>)
  -> (CollectedCrate, DerivedInfo)

assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson

CollectedCrate holds items and unevaluated consts (the output of talking to rustc). DerivedInfo holds calls, allocs, types, and spans (the output of walking bodies). assemble_smir takes both by value and does pure data transformation; it structurally cannot call inst.body() because it has no MonoItem or Instance to call it on. That's the whole point: if you can't reach the query, you can't accidentally call it.
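To make the "can't reach the query" argument concrete, here is a minimal sketch of the same shape using toy types. Everything in it (the fields, the `Body` stand-in, the sorting) is an assumption made for illustration and not the actual stable-mir-json code; the real functions operate on rustc data. The point is only that `assemble_smir` receives plain data by value and has no handle from which a body could be re-fetched.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real types; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Body(Vec<u64>); // alloc ids referenced by this body

struct Item {
    name: String,
    body: Body, // collected once, then carried forward
}

struct CollectedCrate {
    items: Vec<Item>,
}

struct DerivedInfo {
    allocs: Vec<u64>,
}

#[derive(Debug, PartialEq)]
struct SmirJson {
    item_names: Vec<String>,
    allocs: Vec<u64>,
}

/// Collection and analysis: the only phase that sees the raw item map.
fn collect_and_analyze_items(items: HashMap<String, Item>) -> (CollectedCrate, DerivedInfo) {
    let mut items: Vec<Item> = items.into_values().collect();
    items.sort_by(|a, b| a.name.cmp(&b.name));

    // Walk the stored bodies (never a re-fetched copy) to derive the alloc list.
    let mut allocs: Vec<u64> = items
        .iter()
        .flat_map(|item| item.body.0.iter().copied())
        .collect();
    allocs.sort_unstable();
    allocs.dedup();

    (CollectedCrate { items }, DerivedInfo { allocs })
}

/// Assembly: takes both inputs by value and only rearranges data.
/// No compiler handle is in scope, so no query can be issued from here.
fn assemble_smir(collected: CollectedCrate, derived: DerivedInfo) -> SmirJson {
    SmirJson {
        item_names: collected.items.into_iter().map(|item| item.name).collect(),
        allocs: derived.allocs,
    }
}
```

Because the boundary lives in the signatures, a later change that tries to re-query the compiler from the assembly step would have to change those signatures first, which is easy to spot in review.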

The two body-walking visitors (InternedValueCollector and UnevaluatedConstCollector) are merged into a single BodyAnalyzer that walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: when BodyAnalyzer finds an unevaluated const, it records it; the outer loop creates the new Item (the one place inst.body() is called) and enqueues it.
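The integrated fixpoint loop can be pictured as an ordinary worklist. The sketch below is a toy model under stated assumptions: `fetch_body` is a hypothetical stand-in for the single place a body is obtained (inst.body() in the PR), and items are plain names rather than MonoItems. It only illustrates the invariant that each body is fetched and walked once while transitively discovered constants are enqueued.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a MIR body: just the unevaluated consts it mentions.
struct Body {
    unevaluated: Vec<&'static str>,
}

/// Hypothetical stand-in for the one place a body is fetched.
fn fetch_body(name: &str) -> Body {
    match name {
        "main" => Body { unevaluated: vec!["A", "B"] },
        "A" => Body { unevaluated: vec!["B", "C"] },
        _ => Body { unevaluated: vec![] },
    }
}

/// Process every reachable item exactly once.
/// Returns the processing order and the number of body fetches.
fn analyze(roots: &[&'static str]) -> (Vec<&'static str>, usize) {
    let mut seen: HashSet<&'static str> = roots.iter().copied().collect();
    let mut queue: VecDeque<&'static str> = roots.iter().copied().collect();
    let mut order = Vec::new();
    let mut fetches = 0;

    while let Some(name) = queue.pop_front() {
        let body = fetch_body(name); // the single fetch for this item
        fetches += 1;
        order.push(name);

        // The analyzer only records discoveries; the outer loop enqueues them.
        for constant in body.unevaluated {
            if seen.insert(constant) {
                queue.push_back(constant);
            }
        }
    }
    (order, fetches)
}
```

In this model `analyze(&["main"])` visits main, A, B, and C with four fetches, even though B is mentioned by two different bodies.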

But what about catching regressions?

Turns out the existing integration tests normalize away alloc_ids (via the jq filter), so they literally cannot catch this class of bug. The golden files don't contain alloc ids at all; you could scramble every id in the output and the tests would still pass.

AllocMap replaces the bare HashMap<AllocId, ...> with a newtype that, under #[cfg(debug_assertions)], tracks every insertion and flags duplicates. After the collect/analyze phase completes, verify_coherence walks every stored Item body with an AllocIdCollector visitor and asserts that each referenced AllocId exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time; zero cost in release builds.
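A rough sketch of the newtype idea follows. It is not the real AllocMap: `AllocId` is a plain integer here, the stored value is a `String`, and the `missing` and `track` helpers are names invented for this example. What it is meant to show is the pattern of keeping the duplicate-tracking state and the assertions behind #[cfg(debug_assertions)] so that release builds carry neither the extra field nor the check.

```rust
use std::collections::HashMap;
#[cfg(debug_assertions)]
use std::collections::HashSet;

type AllocId = u64; // stand-in for the real AllocId type

#[derive(Default)]
struct AllocMap {
    inner: HashMap<AllocId, String>,
    // Only present in debug builds: ids that were inserted more than once.
    #[cfg(debug_assertions)]
    duplicates: HashSet<AllocId>,
}

impl AllocMap {
    fn insert(&mut self, id: AllocId, value: String) {
        let replaced = self.inner.insert(id, value).is_some();
        self.track(id, replaced);
    }

    #[cfg(debug_assertions)]
    fn track(&mut self, id: AllocId, replaced: bool) {
        if replaced {
            self.duplicates.insert(id);
        }
    }

    #[cfg(not(debug_assertions))]
    fn track(&mut self, _id: AllocId, _replaced: bool) {}

    /// Ids referenced by stored bodies that are absent from the map.
    fn missing(&self, referenced: &[AllocId]) -> Vec<AllocId> {
        let mut missing: Vec<AllocId> = referenced
            .iter()
            .copied()
            .filter(|id| !self.inner.contains_key(id))
            .collect();
        missing.sort_unstable();
        missing.dedup();
        missing
    }

    /// Debug builds assert on duplicates and missing ids.
    #[cfg(debug_assertions)]
    fn verify_coherence(&self, referenced: &[AllocId]) {
        assert!(
            self.duplicates.is_empty(),
            "duplicate insertions: {:?}",
            self.duplicates
        );
        let missing = self.missing(referenced);
        assert!(missing.is_empty(), "missing alloc ids: {:?}", missing);
    }

    /// Release builds compile the check away.
    #[cfg(not(debug_assertions))]
    fn verify_coherence(&self, _referenced: &[AllocId]) {}
}
```

In the PR the referenced ids come from walking each stored Item body with the AllocIdCollector visitor; here they are simply passed in as a slice.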

Other things that fell out of this

Static items now store their body in MonoItemKind::MonoItemStatic (collected once in mk_item), so the analysis phase never goes back to rustc for static bodies
get_item_details takes the pre-collected body as a parameter instead of calling inst.body() independently
The items_clone full HashMap clone is replaced by a HashSet of original item names (which is all the static fixup actually needed)
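We uncovered and fixed a very old bug: an early return in BodyAnalyzer::visit_terminator that skipped super_terminator for calls through const-evaluated function pointers (see the follow-up comment on visit_terminator)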

What's deleted

InternedValueCollector, UnevaluatedConstCollector, collect_interned_values, collect_unevaluated_constant_items, the InternedValues type alias, and items_clone. Good riddance.

Downstream impact

The tighter allocs representation has already shown positive downstream effects in KMIR: the proof engine can now decode allocations inline (resolving to concrete values like StringVal("123")) instead of deferring them as opaque thunks. @dkcumming's offset-u8 test went from thunking through #decodeConstant(constantKindAllo...) to directly producing toAlloc(allocId(0)), StringVal("123"). The test's expected output needed updating, but the new failure mode is semantically grounded in actual data rather than deferred interpretation.

Test plan

cargo build compiles
cargo clippy clean
cargo fmt clean
make integration-test passes (all 28 tests, identical output)
KMIR downstream: test_prove_rs[offset-u8-fail] expected output updated

jberthold

Great refactoring, makes a lot of sense.
We have to find out where the tests from the rustc suite are going wrong and why, but this is a good direction.

src/printer.rs

cds-amal · 2026-02-25T05:43:20Z

The early-return bug in `visit_terminator`

FYI: @jberthold, @dkcumming :)

The symptom

Three UI tests (issue-58435-ice-with-assoc-const.rs, closure-to-fn-coercion.rs, ufcs-polymorphic-paths.rs) hit the verify_coherence assertion: alloc IDs referenced in stored bodies were missing from the alloc map. These failures were pre-existing (they occur on every commit since verify_coherence was introduced), not regressions from removing the static-item fixup.

The pattern all three tests share

Each test stores a function pointer in a constant:

// issue-58435
const ID: fn(&S<T>) -> &S<T> = |s| s;

// closure-to-fn-coercion
const FOO: fn(u8) -> u8 = |v: u8| { v };
const BAR: [fn(&mut u32); 5] = [ |_| {}, |v| *v += 1, ... ];

// ufcs-polymorphic-paths
// (many function constants used as call targets)

When rustc evaluates these constants, the resulting value is an Allocated constant: a memory allocation containing a pointer (with provenance) to the actual closure or function. In MIR, when one of these constants appears as the func operand of a Call terminator, the constant's kind is ConstantKind::Allocated, not ConstantKind::ZeroSized. (Regular direct calls, like foo() where foo is a named function, use ZeroSized constants: the function identity is encoded entirely in the type, and the value is zero-sized. But calling through a const that holds a function pointer produces an Allocated constant because the pointer value is actual data.)
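For readers who want to reproduce the pattern, here is a small standalone program fragment showing the two call forms side by side. The function names are made up for this example. At the source level both calls look alike; the distinction described above (ZeroSized versus Allocated call targets) only shows up in the MIR, which this snippet does not inspect.

```rust
// A named function: calling it directly is the common case.
fn double(v: u8) -> u8 {
    v * 2
}

// A const holding a function pointer: the non-capturing closure
// is coerced to `fn(u8) -> u8`, so the const's value is a pointer.
const FOO: fn(u8) -> u8 = |v: u8| v;

// Direct call to a named function.
fn direct_call() -> u8 {
    double(21)
}

// Call through the function pointer stored in the const.
fn call_through_const() -> u8 {
    FOO(42)
}
```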

The bug

BodyAnalyzer::visit_terminator (printer.rs, formerly line 1046) had this logic for Call terminators:

Call { func: Constant(ConstOperand { const_: cnst, .. }), .. } => {
    if *cnst.kind() != stable_mir::ty::ConstantKind::ZeroSized {
        return;  // <-- the bug: exits visit_terminator before super_terminator runs
    }
    let inst = fn_inst_for_ty(cnst.ty(), true)
        .expect("Direct calls to functions must resolve to an instance");
    fn_inst_sym(self.tcx, Some(cnst.ty()), Some(&inst))
}

The intent was clear: if the call target isn't a ZeroSized function constant, skip the link-map resolution (you can't resolve an indirect call to a specific symbol name). The problem is that return exits visit_terminator entirely, which means self.super_terminator(term, loc) on line 1070 never runs.

super_terminator is the MIR visitor's default recursion method. It walks the terminator's operands, which is how visit_mir_const gets called on constants nested inside the terminator. By skipping it, the early return made the entire terminator subtree invisible to every collector: alloc collection, type collection, and span collection all missed everything inside that terminator's operands and arguments.

The AllocIdCollector used by verify_coherence doesn't override visit_terminator at all, so it uses the default implementation, which always calls super_terminator. It therefore recurses into the operands normally, finds the Allocated constant's provenance, and reports the alloc IDs. BodyAnalyzer never sees them because it bailed out before recursing. Hence the coherence violation: IDs in the stored body that aren't in the alloc map.

Why only these three tests?

The pattern requires a Call terminator whose function operand is a non-ZeroSized constant. This means: (a) a function pointer stored in a const item or associated const, (b) used as the direct call target in MIR. Most function calls in Rust use ZeroSized constants (the function's type alone identifies it); you only get Allocated call targets when the callee is computed from a const-evaluated value. This is relatively uncommon, which is why only 3 out of 75+ UI tests triggered the bug.

The fix

Replace return with None:

Call { func: Constant(ConstOperand { const_: cnst, .. }), .. } => {
    if *cnst.kind() != stable_mir::ty::ConstantKind::ZeroSized {
        None
    } else {
        let inst = fn_inst_for_ty(cnst.ty(), true)
            .expect("Direct calls to functions must resolve to an instance");
        fn_inst_sym(self.tcx, Some(cnst.ty()), Some(&inst))
    }
}

Now the match arm produces None for the link-map entry (no symbol to record for an indirect call), but execution falls through to update_link_map (which is a no-op for None) and then to self.super_terminator(term, loc), which recurses normally into the terminator's operands. Alloc collection, type collection, and span collection all proceed as expected.
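The control-flow difference is easy to see in a toy visitor. The trait and types below are invented for illustration and are far simpler than the stable MIR visitor; only the shape matters: an overridden visit method that returns early never reaches the default recursion, while one that evaluates to None and carries on does.

```rust
/// Toy terminator: whether the call target is direct, plus the
/// alloc ids reachable from its operands.
struct Terminator {
    direct_call: bool,
    operand_allocs: Vec<u64>,
}

trait Visitor {
    fn visit_alloc(&mut self, id: u64);

    /// Default recursion, playing the role of super_terminator.
    fn super_terminator(&mut self, term: &Terminator) {
        for &id in &term.operand_allocs {
            self.visit_alloc(id);
        }
    }

    fn visit_terminator(&mut self, term: &Terminator) {
        self.super_terminator(term);
    }
}

#[derive(Default)]
struct Buggy {
    allocs: Vec<u64>,
    links: usize,
}

impl Visitor for Buggy {
    fn visit_alloc(&mut self, id: u64) {
        self.allocs.push(id);
    }

    fn visit_terminator(&mut self, term: &Terminator) {
        if !term.direct_call {
            return; // also skips the recursion below
        }
        self.links += 1;
        self.super_terminator(term);
    }
}

#[derive(Default)]
struct Fixed {
    allocs: Vec<u64>,
    links: usize,
}

impl Visitor for Fixed {
    fn visit_alloc(&mut self, id: u64) {
        self.allocs.push(id);
    }

    fn visit_terminator(&mut self, term: &Terminator) {
        // No link entry for indirect calls, but keep going.
        let link = if term.direct_call { Some(()) } else { None };
        if link.is_some() {
            self.links += 1;
        }
        self.super_terminator(term);
    }
}
```

Feeding both visitors an indirect call whose operands reference allocs 7 and 8 leaves the buggy one with an empty alloc list, while the fixed one records both ids and still adds no link entry.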

Why `verify_coherence` was the right tool here

This bug predates the declarative pipeline work; it's been present since the original BodyAnalyzer implementation. It was never caught because the old code didn't have a coherence check, and the missing allocations only affect programs with const-evaluated function pointer calls (an unusual but valid pattern).

The AllocMap coherence check, introduced as part of the pipeline restructuring, made this immediately visible: it walks the stored bodies independently of BodyAnalyzer, compares the alloc IDs it finds against the alloc map, and asserts on any mismatch. The assertion message names the specific missing AllocIds, which pointed directly at the gap. Without coherence checking, these programs would have produced silently incomplete JSON output (missing allocations, missing types, missing spans for everything inside the affected terminators).

dkcumming · 2026-02-26T07:20:23Z

This is honestly fantastic work, and it was great to go over it all and see the improvements! I think the only thing left to do is update the passing.tsv and failing.tsv for the UI test suite.

So, the context: 9a78109 ("Avoid inst.body() duplicate call") fixed the immediate alloc-id mismatch by carrying collected items forward instead of re-fetching them. That was the right call, but it left the three-phase pipeline structure intact (mk_item, then collect_unevaluated_constant_items, then collect_interned_values). Each phase could still freely call inst.body() or other side-effecting rustc queries, and nothing in the types prevented it.

The fix for this is to restructure the pipeline so side-effecting rustc queries are confined to a single function (mk_item), and everything downstream operates on pre-collected data:

collect_and_analyze_items(HashMap<String, Item>)
  -> (CollectedCrate, DerivedInfo)

assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson

CollectedCrate holds items and unevaluated consts (the output of rustc interaction). DerivedInfo holds calls, allocs, types, and spans (the output of body analysis). assemble_smir takes both by value and does pure data transformation; it structurally cannot call inst.body() because it has no MonoItem or Instance to call it on. That's the whole point: if you can't reach the query, you can't accidentally call it.

The two body-walking visitors (InternedValueCollector and UnevaluatedConstCollector) are merged into a single BodyAnalyzer that walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: when BodyAnalyzer finds an unevaluated const, it records it; the outer loop creates the new Item (the one place inst.body() is called) and enqueues it.

But what about catching regressions? Turns out the existing integration tests normalize away alloc_ids (via the jq filter), so they can't catch this class of bug at all. AllocMap replaces the bare HashMap<AllocId, ...> with a newtype that, under #[cfg(debug_assertions)], tracks every insertion and flags duplicates. After collect/analyze completes, verify_coherence walks every stored Item body and asserts that each referenced AllocId exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time; zero cost in release builds.

A few other cleanups that fell out of this: static items now store their body in MonoItemKind::MonoItemStatic (collected once in mk_item), so the analysis phase never goes back to rustc for static bodies. get_item_details takes the pre-collected body as a parameter instead of calling inst.body() independently. The items_clone HashMap is replaced by a HashSet of original item names (which is all the static fixup actually needed).

Deleted: InternedValueCollector, UnevaluatedConstCollector, collect_interned_values, collect_unevaluated_constant_items, the InternedValues type alias, and items_clone. All 28 integration tests produce identical output.

The fixup block added statics discovered through allocation provenance that weren't in the original mono item set. It was broken in two ways: it violated the collection/assembly phase boundary (calling mk_item after verify_coherence had already run), and it misclassified statics as MonoItem::Fn, losing their eval_initializer() data. The block never triggered across the full integration test suite. If a genuine missing-static scenario exists, verify_coherence will now catch it: it walks every stored body, extracts AllocIds from provenance, and asserts each one exists in the alloc map. This produces a clear, actionable assertion (naming the specific missing AllocIds) rather than silently emitting a misclassified item. Also removes the now-dead original_item_names field from CollectedCrate and the unused AllocMap::iter method.

BodyAnalyzer::visit_terminator had an early `return` when a Call terminator's function operand was a non-ZeroSized constant (i.e., an indirect call through a const-evaluated function pointer). The intent was to skip link-map resolution for indirect calls, but `return` exited the entire method, skipping self.super_terminator(). That meant the MIR visitor never recursed into the terminator's operands, so collect_alloc, the type collector, and the span collector all missed everything inside that terminator. The bug has been present since aff2dd0 ("Map function types to names and update output format", July 2024) and affects programs that call through function pointers stored in constants (e.g., `const ID: fn(...) = |s| s;` used as a call target). Three UI tests hit this pattern: issue-58435-ice-with-assoc-const, closure-to-fn-coercion, and ufcs-polymorphic-paths. Fix: replace `return` with `None` so the match arm produces no link-map entry but falls through to super_terminator for normal recursion. Caught by verify_coherence, which walks bodies independently of BodyAnalyzer and found alloc IDs that the analyzer never collected.

dkcumming

Fantastic!

ZEINO2022 · 2026-02-27T06:20:27Z

You have done a wonderful, systematic, and proper job.

…ntimeverification#124, runtimeverification#126, runtimeverification#127

Several merged PRs were missing from the changelog or lacked PR links:

- runtimeverification#127: mutability field on PtrType/RefType in TypeMetadata
- runtimeverification#124: ADR-001 (index-first graph architecture)
- runtimeverification#121: existing entries for 3-phase pipeline, AllocMap coherence, and dead fixup removal now link to the PR
- runtimeverification#126: existing entries for UI test runner rewrite and provenance resolution fixes now link to the PR

dkcumming mentioned this pull request Feb 20, 2026

Avoid inst.body() duplicate call which reallocates AllocIds #120

Merged

cds-amal force-pushed the dc/declarative-spike branch from 69cd6a5 to 6f6c567 Compare February 20, 2026 17:40

cds-amal changed the title ~~Declarative spike~~ Declarative collect/analyze/assemble pipeline with AllocMap coherence Feb 20, 2026

cds-amal marked this pull request as ready for review February 20, 2026 17:45

cds-amal mentioned this pull request Feb 21, 2026

Isolate rustc internals and make toolchain bumps less hurt #123

Draft

cds-amal changed the title ~~Declarative collect/analyze/assemble pipeline with AllocMap coherence~~ Single-pass body analysis with AllocMap coherence checks Feb 22, 2026

jberthold reviewed Feb 23, 2026

cds-amal requested a review from jberthold February 24, 2026 15:55

dkcumming self-assigned this Feb 26, 2026

cds-amal added 10 commits February 26, 2026 06:42

Bump version to 0.2.0 and add CHANGELOG

49caf33

Begin formal version tracking at 0.2.0. The changelog covers all notable changes since the initial commit, with PR references. Also includes a cargo fmt pass on printer.rs.

Add ADR-002: declarative pipeline with AllocMap coherence

8decd61

Rename ADR-002: drop pr- prefix from filename

3201e33

year: off by one

d6ac2c5

Remove duplicate verify_coherence call from collect_smir

b4db58c

The same check already runs as the first step inside assemble_smir, which is the function that actually consumes the data. No mutation happens between the two call sites, so the one in collect_smir was redundant.

chore(changelog): update

fbbd599

Update failing / passing tsvs

306640c

cds-amal force-pushed the dc/declarative-spike branch from c22b1ea to 306640c Compare February 26, 2026 11:43

dkcumming added the automerge label Feb 26, 2026

dkcumming approved these changes Feb 26, 2026

dkcumming merged commit cab07e2 into runtimeverification:master Feb 26, 2026
5 checks passed

cds-amal deleted the dc/declarative-spike branch February 27, 2026 01:29

dkcumming merged 10 commits into runtimeverification:master from cds-rs:dc/declarative-spike
