Fix UI test runner flag handling and recursive provenance resolution#126
Merged
dkcumming merged 6 commits intoruntimeverification:masterfrom Feb 28, 2026
Merged
Conversation
dkcumming
approved these changes
Feb 28, 2026
Collaborator
dkcumming
left a comment
There was a problem hiding this comment.
Nice! Awesome improvements to the scripts, and awesome solving of the bugs.
I started to make a new commit that was going to enforce that $RUST_REPO was checked out to the same commit we require for stable_mir_json so the tests are compatible. It happens for docker but not locally. When doing it I got some scope creep when I starting thinking we should have a utils.sh file for shared commands - so instead let's merge this and worry about things like that later!
…llection The `collect_alloc` function asserted that VTable and Static allocations must have a builtin-deref pointer type. This is wrong: vtable pointers carry a raw `*const ()` type that doesn't pass `builtin_deref(true)`, and static allocations can have non-pointer types depending on the static's definition. Removing these assertions fixes 6 UI tests that were previously panicking during alloc collection (4 thin-box tests, const-trait-to-trait, issue-11205).
Rewrite run_ui_tests.sh: flag-based CLI (--verbose, --save-generated-output, --save-debug-output), build once then invoke binary directly (eliminates cargo freshness overhead per test), arch-aware skip logic for cross-platform test lists. Add extract_test_flags() to both runners to parse //@ compile-flags:, //@ edition:, and //@ rustc-env: directives from test files and pass them through to the compiler. Previously these directives were silently ignored, causing tests to fail for the wrong reasons or pass despite being misconfigured. Update Makefile test-ui target for the new CLI.
get_prov_ty previously used field_for_offset (exact byte-offset match) to find the field at a given provenance offset, then returned that field's type directly. This broke on nested structs: a pointer at byte offset 56 inside TestDescAndFn sits inside a nested TestDesc field that starts at offset 0, not at offset 56 itself. Add field_containing_offset (range-based lookup: largest field start <= target) and make the struct and tuple branches recurse into the containing field with a relative offset, walking down through the field hierarchy until reaching the actual pointer type. Replace panicking unwraps/asserts with graceful eprintln + return None for layout failures, non-rigid types, and unexpected non-zero offsets on builtin_deref types. Add debug_log_println! instrumentation at key resolution points (zero cost without --features debug_log). Moves 18 UI tests from failing to passing (compile-flags tests that now work with the runner fix, plus tests that previously panicked in get_prov_ty).
16785a5 to
8c64cb6
Compare
cds-amal
added a commit
that referenced
this pull request
Feb 28, 2026
Add ensure_rustc_commit.sh: a helper sourced by both UI test scripts that reads the expected commit from rust-toolchain.toml [metadata] and ensures the rust checkout (regular or bare+worktree) is at that commit. Both run_ui_tests.sh and remake_ui_tests.sh now use RUST_SRC_DIR (set by ensure_rustc_commit.sh) instead of using the raw directory argument directly. CI workflow updated to use yq for TOML parsing. Cherry-picked from d021298 (spike/hex-rustc), adapted for the PR #126 test script rework on this branch.
cds-amal
added a commit
to cds-rs/stable-mir-json
that referenced
this pull request
Mar 1, 2026
…ntimeverification#124, runtimeverification#126, runtimeverification#127 Several merged PRs were missing from the changelog or lacked PR links: - runtimeverification#127: mutability field on PtrType/RefType in TypeMetadata - runtimeverification#124: ADR-001 (index-first graph architecture) - runtimeverification#121: existing entries for 3-phase pipeline, AllocMap coherence, and dead fixup removal now link to the PR - runtimeverification#126: existing entries for UI test runner rewrite and provenance resolution fixes now link to the PR
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR rewrites the UI test runner to actually respect
//@ compile-flags:directives (previously silently ignored), and fixes a panic in provenance type resolution that crashed on nested struct fields.run_ui_tests.sh: flag-based CLI, build-once-then-invoke (eliminates cargo freshness overhead per test), arch-aware skip logic, optional--save-debug-outputfor failure triageextract_test_flags()to both runners: parses//@ compile-flags:,//@ edition:, and//@ rustc-env:directives from test filesget_prov_tyto recursively walk nested struct/tuple fields instead of requiring an exact byte-offset match at the top levelget_prov_tywith graceful fallbacksUI test results: 19 failing -> 2 failing, 2871 passing -> 2888 passing.
What was wrong
Two independent problems, both surfaced by investigating the 19 UI test failures on the
masterbranch.1. Test runner ignoring compile-flags directives
The UI test runners (
run_ui_tests.sh,remake_ui_tests.sh) never extracted//@ compile-flags:,//@ edition:, or//@ rustc-env:directives from test files. Every test ran with bare-Zno-codegenregardless of what the test required. 17 of the 19 failures were compilation errors caused by missing flags (e.g.,--edition 2021,--test,-Zunstable-options,-Ccodegen-units=1).These tests were "failing" for the wrong reason: they weren't testing stable-mir-json at all, just failing to compile.
2. Provenance type resolution panicking on nested structs
get_prov_tyusedfield_for_offset(exact byte-offset match) to find which field a provenance pointer belongs to, then returned that field's type directly. This works when the pointer sits at the start of a top-level field, but breaks on nested structs.Concrete example:
TestDescAndFnhas fields at offsets 0 (TestDesc) and 128 (TestFn). A pointer at byte offset 56 sits insideTestDesc(which itself has a&strfield at relative offset 56). The old code looked for a field starting at exactly offset 56 in the top-level struct, found nothing, and panicked.The fix adds
field_containing_offset(range-based: finds the field with the largest start offset <= target) and makes the struct and tuple branches recurse into the containing field with a relative offset, walking down through the field hierarchy until reaching the actual pointer type.Test runner changes
The rewritten
run_ui_tests.sh:--verbose,--save-generated-output,--save-debug-outputreplace positional args and env varscargo buildupfront, then invokes the binary directly with properDYLD_LIBRARY_PATH/LD_LIBRARY_PATHsetup (cuts ~2-3s of cargo freshness checking per test)//@ only-<arch>directive parsingextract_test_flags()parses//@ compile-flags:,//@ edition:, and//@ rustc-env:directives--save-debug-outputcaptures stderr on failure totests/ui/debug/<test_name>.stderrand prints a 4-line snippet inlineremake_ui_tests.shgets the sameextract_test_flags()function so regeneration respects directives too.Remaining failures
tests/ui/issues/issue-26641.rstests/ui/sanitizer/cfi/drop-in-place.rs-Ccodegen-units=1in the test file's//@ compile-flags:directive; every sibling CFI test has it, this one doesn't. Upstream test bug, not ours.Test plan
cargo buildcleanmake integration-test