feat(rewriter): instruction-level offset map (#143 DWARF Phase 2 inc 2) #202
Second increment of DWARF Phase 2. The rewriter changes operand
values (function/global/table/memory/type indices) whose LEB128
encodings shift byte length, so intra-function byte offsets drift
during rewriting — DWARF .debug_line programs cannot be remapped by
function-base relocation alone. This adds the instruction-level
old->new offset map that captures that drift.
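
The size change behind that drift can be illustrated with a minimal sketch of unsigned LEB128 length measurement. The helper names `uleb128_len` and `call_len` are hypothetical and are not part of the rewriter; the only fact assumed about the binary format is that `call` is one opcode byte followed by a LEB128 function index.

```rust
/// Number of bytes the unsigned LEB128 encoding of `value` occupies.
/// (Hypothetical helper for illustration; not the rewriter's code.)
fn uleb128_len(mut value: u32) -> usize {
    let mut len = 1;
    // Each encoded byte carries 7 payload bits; keep going while
    // bits above the low 7 remain.
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Encoded size of a `call` instruction: one opcode byte plus the
/// LEB128-encoded function index.
fn call_len(func_index: u32) -> usize {
    1 + uleb128_len(func_index)
}
```

Under this sketch, remapping `call 0` to `call 200` grows the instruction from 2 to 3 bytes, so every later instruction in the same function body starts one byte further along than it did in the input.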
- InstrOffset { old, new } + InstrOffsetMap { entries } with a
translate(old) -> Option<new> lookup
- rewrite_function_body_with_offsets: parallel entry point that
returns (Function, InstrOffsetMap); shares a private core with
the existing rewrite_function_body (unchanged signature, zero
caller churn in merger.rs)
- offsets collected via reader.into_iter_with_offsets() for OLD
positions + per-instruction encoded-length measurement for NEW
positions (identical bytes to Function::instruction, via
wasm_encoder::Encode)
- opt-in: plain path pays nothing; output is byte-identical whether
or not offsets are collected (pinned by a test)
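
A minimal sketch of what the map and its lookup could look like. Only the names `InstrOffset`, `InstrOffsetMap`, `entries`, and `translate` come from this PR; the `u32` field types, the derives, and the binary-search lookup over entries sorted by `old` are assumptions made for illustration.

```rust
/// One instruction boundary: its byte offset before and after rewriting.
/// Field types are an assumption; the PR only names the fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InstrOffset {
    pub old: u32,
    pub new: u32,
}

/// Old->new offsets for every instruction in one function body.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct InstrOffsetMap {
    pub entries: Vec<InstrOffset>,
}

impl InstrOffsetMap {
    /// Look up the new offset for an old instruction offset.
    /// Returns `None` when `old` is not an instruction boundary.
    /// Assumes `entries` is sorted by `old`, which holds if entries are
    /// pushed in instruction order.
    pub fn translate(&self, old: u32) -> Option<u32> {
        self.entries
            .binary_search_by_key(&old, |e| e.old)
            .ok()
            .map(|i| self.entries[i].new)
    }
}
```

In this sketch an offset that falls in the middle of an instruction is a miss rather than being rounded to a neighbour, which keeps the lookup from silently returning an address the input never described.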
4 new tests:
- instr_offset_map_tracks_leb_growth_from_index_remap: call 0->200
grows the operand LEB 1->2 bytes; divergence accumulates exactly
[0,1,1,2,2,2] across call/drop/call/drop/const/end
- instr_offset_map_is_identity_when_no_leb_length_change: 0->1 keeps
1-byte LEB, new == old everywhere
- instr_offset_map_translate_hits_and_misses
- with_offsets_emits_identical_function_bytes
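
The [0,1,1,2,2,2] divergence in the first test can be reproduced by hand from instruction lengths alone. The sketch below is not the test itself; `start_offsets` and `drift` are hypothetical helpers, and the 2-byte length used for the const instruction is an assumption (it is the same before and after rewriting, so it does not affect the divergence).

```rust
/// Start offset of each instruction, given the encoded length of each one.
fn start_offsets(lens: &[u32]) -> Vec<u32> {
    let mut pos = 0u32;
    lens.iter()
        .map(|&len| {
            let start = pos;
            pos += len;
            start
        })
        .collect()
}

/// Per-instruction divergence (new start minus old start).
/// Assumes both slices describe the same instructions in the same order
/// and that no instruction shrinks, so the subtraction cannot underflow.
fn drift(old_lens: &[u32], new_lens: &[u32]) -> Vec<u32> {
    start_offsets(new_lens)
        .iter()
        .zip(start_offsets(old_lens))
        .map(|(new, old)| new - old)
        .collect()
}
```

With old lengths `[2, 1, 2, 1, 2, 1]` (call 0, drop, call 0, drop, const, end) and new lengths `[3, 1, 3, 1, 2, 1]` (both calls remapped to 200), the old starts are `[0, 2, 3, 5, 6, 8]`, the new starts are `[0, 3, 4, 7, 8, 10]`, and the divergence is `[0, 1, 1, 2, 2, 2]`: each grown call shifts everything after it by one more byte.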
No new LS-N: pure infrastructure, output byte-identical, no new
hazard surface. The wrong-DWARF-address hazard materializes only when
increment 3 consumes this map; its LS-N lands there.
Increment 3 (gimli .debug_line/.debug_info rewrite) composes this
intra-function map with the per-function base from the v0.16.0
component-provenance v2 code_range.
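
One way that composition might look, as a sketch only: increment 3 is not part of this PR, and `translate_code_address`, its parameters, and the idea of passing the instruction-stream bases in directly are all assumptions. How those bases are derived from the component-provenance v2 code_range is not shown here.

```rust
/// Hypothetical composition of a per-function base with an
/// intra-function offset lookup.
///
/// `old_instr_base` / `new_instr_base`: assumed addresses of the start of
/// the function's instruction stream in the input and fused code.
/// `lookup`: stands in for `InstrOffsetMap::translate`.
fn translate_code_address(
    old_addr: u32,
    old_instr_base: u32,
    new_instr_base: u32,
    lookup: impl Fn(u32) -> Option<u32>,
) -> Option<u32> {
    // Make the address relative to the old instruction stream;
    // addresses below the base do not belong to this function.
    let old_rel = old_addr.checked_sub(old_instr_base)?;
    // Map the intra-function offset; a miss means the address was not
    // on an instruction boundary.
    let new_rel = lookup(old_rel)?;
    // Rebase onto the function's position in the fused output.
    new_instr_base.checked_add(new_rel)
}
```

Returning `Option` at every step means an address that cannot be mapped is surfaced to the caller instead of being emitted as a plausible but wrong DWARF address.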
295 lib tests green, clippy + fmt clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mythos delta-pass required
This PR modifies one or more Tier-5 source files (per Before merge, run the Mythos discover protocol on the
Why this gate exists: LS-A-10 The gate check on this PR will pass once the label is |
LS-N verification gate
Approved Failed LS entries(none) Missing regression tests
Updated automatically by |
Mythos delta-pass (auto)
✅ NO FINDINGS across 1 Tier-5 file(s)
Auto-run via |
DWARF Phase 2 increment 2 (#143, #202): rewriter instruction-level offset map.
rewrite_function_body_with_offsets + InstrOffsetMap capture the intra-function byte drift caused by LEB128 operand-length changes during index remapping. This is the second of the two anchors DWARF address remapping needs (increment 1 / v0.16.0 supplied the per-function base via component-provenance v2 code ranges). Increment 3 (gimli .debug_line/.debug_info rewrite composing both maps) follows in a later release.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Second increment of #143 DWARF Phase 2. meld's rewriter changes operand values (function/global/table/memory/type indices) whose LEB128 encodings shift byte length, so intra-function byte offsets drift during rewriting, and `.debug_line` programs can't be remapped by function-base relocation alone. This adds the instruction-level old→new offset map that captures that drift.
Changes
- `InstrOffset { old, new }` + `InstrOffsetMap { entries }` with `translate(old) -> Option<new>`
- `rewrite_function_body_with_offsets`: parallel entry point returning `(Function, InstrOffsetMap)`, sharing a private core with the existing `rewrite_function_body` (unchanged signature, zero caller churn in merger.rs)
- OLD positions from `reader.into_iter_with_offsets()`; NEW positions from per-instruction encoded-length measurement (identical bytes to `Function::instruction`, both via `wasm_encoder::Encode`)
Why offsets are relative to the instruction stream
Both `old` and `new` are relative to the start of the function body's instruction stream (the byte after the locals vector). Increment 3 composes these intra-function offsets with the per-function base from the v0.16.0 component-provenance v2 `code_range` to translate DWARF code addresses input→fused.
Tests (4 new)
- `instr_offset_map_tracks_leb_growth_from_index_remap`: `call 0` → `call 200` grows the operand LEB 1→2 bytes; divergence accumulates exactly `[0,1,1,2,2,2]` across call/drop/call/drop/const/end
- `instr_offset_map_is_identity_when_no_leb_length_change`: 0→1 keeps a 1-byte LEB; `new == old` everywhere
- `instr_offset_map_translate_hits_and_misses`
- `with_offsets_emits_identical_function_bytes`: offset collection must not perturb emitted code
No new LS-N
Pure infrastructure; output is byte-identical (proven). The wrong-DWARF-address hazard materializes only when increment 3 consumes this map; its LS-N lands there.
Test plan
rewriter.rs (Tier-5)
🤖 Generated with Claude Code