Single-pass body analysis with AllocMap coherence checks#121
Single-pass body analysis with AllocMap coherence checks#121dkcumming merged 10 commits intoruntimeverification:masterfrom
Conversation
69cd6a5 to
6f6c567
Compare
jberthold
left a comment
There was a problem hiding this comment.
Great refactoring, makes a lot of sense.
We have to find out where the tests from the rustc suite are going wrong and why, but this is a good direction.
The early-return bug in
|
|
This is honestly fantastic work, and it was great to go over it all and see the improvements! I think the only thing left to do is update the |
So, the context: 9a78109 ("Avoid inst.body() duplicate call") fixed the immediate alloc-id mismatch by carrying collected items forward instead of re-fetching them. That was the right call, but it left the three-phase pipeline structure intact (mk_item, then collect_unevaluated_constant_items, then collect_interned_values). Each phase could still freely call inst.body() or other side-effecting rustc queries, and nothing in the types prevented it. The fix for this is to restructure the pipeline so side-effecting rustc queries are confined to a single function (mk_item), and everything downstream operates on pre-collected data: collect_and_analyze_items(HashMap<String, Item>) -> (CollectedCrate, DerivedInfo) assemble_smir(CollectedCrate, DerivedInfo) -> SmirJson CollectedCrate holds items and unevaluated consts (the output of rustc interaction). DerivedInfo holds calls, allocs, types, and spans (the output of body analysis). assemble_smir takes both by value and does pure data transformation; it structurally cannot call inst.body() because it has no MonoItem or Instance to call it on. That's the whole point: if you can't reach the query, you can't accidentally call it. The two body-walking visitors (InternedValueCollector and UnevaluatedConstCollector) are merged into a single BodyAnalyzer that walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: when BodyAnalyzer finds an unevaluated const, it records it; the outer loop creates the new Item (the one place inst.body() is called) and enqueues it. But what about catching regressions? Turns out the existing integration tests normalize away alloc_ids (via the jq filter), so they can't catch this class of bug at all. AllocMap replaces the bare HashMap<AllocId, ...> with a newtype that, under #[cfg(debug_assertions)], tracks every insertion and flags duplicates. After collect/analyze completes, verify_coherence walks every stored Item body and asserts that each referenced AllocId exists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time; zero cost in release builds. A few other cleanups that fell out of this: static items now store their body in MonoItemKind::MonoItemStatic (collected once in mk_item), so the analysis phase never goes back to rustc for static bodies. get_item_details takes the pre-collected body as a parameter instead of calling inst.body() independently. The items_clone HashMap is replaced by a HashSet of original item names (which is all the static fixup actually needed). Deleted: InternedValueCollector, UnevaluatedConstCollector, collect_interned_values, collect_unevaluated_constant_items, the InternedValues type alias, and items_clone. All 28 integration tests produce identical output.
Begin formal version tracking at 0.2.0. The changelog covers all notable changes since the initial commit, with PR references. Also includes a cargo fmt pass on printer.rs.
The same check already runs as the first step inside assemble_smir, which is the function that actually consumes the data. No mutation happens between the two call sites, so the one in collect_smir was redundant.
The fixup block added statics discovered through allocation provenance that weren't in the original mono item set. It was broken in two ways: it violated the collection/assembly phase boundary (calling mk_item after verify_coherence had already run), and it misclassified statics as MonoItem::Fn, losing their eval_initializer() data. The block never triggered across the full integration test suite. If a genuine missing-static scenario exists, verify_coherence will now catch it: it walks every stored body, extracts AllocIds from provenance, and asserts each one exists in the alloc map. This produces a clear, actionable assertion (naming the specific missing AllocIds) rather than silently emitting a misclassified item. Also removes the now-dead original_item_names field from CollectedCrate and the unused AllocMap::iter method.
BodyAnalyzer::visit_terminator had an early `return` when a Call terminator's function operand was a non-ZeroSized constant (i.e., an indirect call through a const-evaluated function pointer). The intent was to skip link-map resolution for indirect calls, but `return` exited the entire method, skipping self.super_terminator(). That meant the MIR visitor never recursed into the terminator's operands, so collect_alloc, the type collector, and the span collector all missed everything inside that terminator. The bug has been present since aff2dd0 ("Map function types to names and update output format", July 2024) and affects programs that call through function pointers stored in constants (e.g., `const ID: fn(...) = |s| s;` used as a call target). Three UI tests hit this pattern: issue-58435-ice-with-assoc-const, closure-to-fn-coercion, and ufcs-polymorphic-paths. Fix: replace `return` with `None` so the match arm produces no link-map entry but falls through to super_terminator for normal recursion. Caught by verify_coherence, which walks bodies independently of BodyAnalyzer and found alloc IDs that the analyzer never collected.
c22b1ea to
306640c
Compare
|
You have done a wonderful, systematic, and proper job. |
…ntimeverification#124, runtimeverification#126, runtimeverification#127 Several merged PRs were missing from the changelog or lacked PR links: - runtimeverification#127: mutability field on PtrType/RefType in TypeMetadata - runtimeverification#124: ADR-001 (index-first graph architecture) - runtimeverification#121: existing entries for 3-phase pipeline, AllocMap coherence, and dead fixup removal now link to the PR - runtimeverification#126: existing entries for UI test runner rewrite and provenance resolution fixes now link to the PR
What's this about?
So, #120 fixed the immediate alloc-id mismatch by carrying collected items forward instead of re-fetching them. That was the right call. But it left the underlying structure intact: three separate phases (mk_item, collect_unevaluated_constant_items, collect_interned_values), each with full access to
TyCtxt, each free to callinst.body()or any other side-effecting rustc query whenever it felt like it. Nothing in the types prevented that, and the bug was a direct consequence: one phase calledinst.body()a second time, rustc minted fresh AllocIds, and suddenly the alloc map had ids that didn't correspond to anything in the stored bodies.The question is: how do we make that class of bug structurally impossible, rather than just fixed for the one case we caught?
The full decision record is in
ADR-002.The restructuring
The fix is to split the pipeline into phases with type signatures that enforce the boundary:
CollectedCrateholds items and unevaluated consts (the output of talking to rustc).DerivedInfoholds calls, allocs, types, and spans (the output of walking bodies).assemble_smirtakes both by value and does pure data transformation; it structurally cannot callinst.body()because it has noMonoItemorInstanceto call it on. That's the whole point: if you can't reach the query, you can't accidentally call it.The two body-walking visitors (
InternedValueCollectorandUnevaluatedConstCollector) are merged into a singleBodyAnalyzerthat walks each body exactly once. The fixpoint loop for transitive unevaluated constant discovery is integrated: whenBodyAnalyzerfinds an unevaluated const, it records it; the outer loop creates the newItem(the one placeinst.body()is called) and enqueues it.But what about catching regressions?
Turns out the existing integration tests normalize away
alloc_ids (via the jq filter), so they literally cannot catch this class of bug. The golden files don't contain alloc ids at all; you could scramble every id in the output and the tests would still pass.AllocMapreplaces the bareHashMap<AllocId, ...>with a newtype that, under#[cfg(debug_assertions)], tracks every insertion and flags duplicates. After the collect/analyze phase completes,verify_coherencewalks every storedItembody with anAllocIdCollectorvisitor and asserts that each referencedAllocIdexists in the map. This catches both "walked a stale body" (missing ids) and "walked the same body twice" (duplicate insertions) at dev time; zero cost in release builds.Other things that fell out of this
MonoItemKind::MonoItemStatic(collected once inmk_item), so the analysis phase never goes back to rustc for static bodiesget_item_detailstakes the pre-collected body as a parameter instead of callinginst.body()independentlyitems_clonefull HashMap clone is replaced by aHashSetof original item names (which is all the static fixup actually needed)What's deleted
InternedValueCollector,UnevaluatedConstCollector,collect_interned_values,collect_unevaluated_constant_items, theInternedValuestype alias, anditems_clone. Good riddance.Downstream impact
The tighter allocs representation has already shown positive downstream effects in KMIR: the proof engine can now decode allocations inline (resolving to concrete values like
StringVal("123")) instead of deferring them as opaque thunks. @dkcumming 'soffset-u8test went from thunking through#decodeConstant(constantKindAllo...)to directly producingtoAlloc(allocId(0)), StringVal("123"). The test's expected output needed updating, but the new failure mode is semantically grounded in actual data rather than deferred interpretation.Test plan
cargo buildcompilescargo clippycleancargo fmtcleanmake integration-testpasses (all 28 tests, identical output)test_prove_rs[offset-u8-fail]expected output updated