Skip to content

compare-output-by-lines diff of stdout from //@ check-run-results is unclear #148442

@WaffleLapkin

Description

@WaffleLapkin

Summary

When in compare_output_by_lines "mode" (either because of //@ compare-output-by-lines or a runner being set for the target) the diff of stdout produced by a failing //@ check-run-results is misleading and confusing.

Command used

tests/ui/tmp.rs:

//@ compare-output-by-lines
//@ check-run-results
//@ run-pass

fn main() {
    for i in 0..5 {
        println!("{i}");
    }
}

tests/ui/tmp.run.stdout:

0
1
a
3
4

command:

; x t tests/ui/tmp.rs
Building bootstrap
    Finished `dev` profile [unoptimized] target(s) in 0.08s
/chonky/wffl/build/rust-r/x86_64-unknown-linux-gnu/ci-llvm/bin/llvm-strip does not exist; skipping copy
Building stage1 compiler artifacts (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.17s
Creating a sysroot for stage1 compiler (use `rustup toolchain link 'name' build/host/stage1`)
Building stage1 lld-wrapper (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.09s
Building stage1 library artifacts (stage1 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.03s
Building stage1 compiletest (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.11s
Testing stage1 with compiletest suite=ui mode=ui (x86_64-unknown-linux-gnu)

running 1 tests

[ui] tests/ui/tmp.rs ... F


failures:

---- [ui] tests/ui/tmp.rs stdout ----
Saved the actual run.stdout to `/home/wffl/prog/rust-r/build/x86_64-unknown-linux-gnu/test/ui/tmp/tmp.run.stdout`
diff of run.stdout:

1	0
2	1
-	a
4	3
5	4


The actual run.stdout differed from the expected run.stdout

error: 1 errors occurred comparing run output.
status: exit status: 0
command: cd "/home/wffl/prog/rust-r/build/x86_64-unknown-linux-gnu/test/ui/tmp" && RUSTC="/home/wffl/prog/rust-r/build/x86_64-unknown-linux-gnu/stage1/bin/rustc" RUST_TEST_THREADS="16" "/home/wffl/prog/rust-r/build/x86_64-unknown-linux-gnu/test/ui/tmp/a"
--- stdout -------------------------------
0
1
2
3
4
------------------------------------------
stderr: none

---- [ui] tests/ui/tmp.rs stdout end ----

failures:
    [ui] tests/ui/tmp.rs

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 19983 filtered out; finished in 94.76ms

Some tests failed in compiletest suite=ui mode=ui host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu
Build completed unsuccessfully in 0:00:03

blessing:

; x t tests/ui/tmp.rs --bless
Building bootstrap
    Finished `dev` profile [unoptimized] target(s) in 0.08s
/chonky/wffl/build/rust-r/x86_64-unknown-linux-gnu/ci-llvm/bin/llvm-strip does not exist; skipping copy
Building stage1 compiler artifacts (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.16s
Creating a sysroot for stage1 compiler (use `rustup toolchain link 'name' build/host/stage1`)
Building stage1 lld-wrapper (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.09s
Building stage1 library artifacts (stage1 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.02s
Building stage1 compiletest (stage0 -> stage1, x86_64-unknown-linux-gnu)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.10s
Testing stage1 with compiletest suite=ui mode=ui (x86_64-unknown-linux-gnu)

running 1 tests
.

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 19983 filtered out; finished in 97.04ms

Build completed successfully in 0:00:04
: rust-r (master+); bat tests/ui/tmp.run.stdout -p
0
1
3
4

Expected behaviour

I expected the diff of stdout to show that 2 was replaced by a. And when blessing I expected the 2 to be present in the resulting stdout.

Actual behaviour

Diff only shows the deletion of a, but not the addition of 2. Blessing simply applies the diff, making the test less precise (it doesn't test for the presence of 2).

Bootstrap configuration (bootstrap.toml)

config

change-id = 147157

# Includes one of the default files in src/bootstrap/defaults
profile = "compiler"

[llvm]
# LLVM assertions may help debugging issues with LLVM/codegen
#assertions = true

# Use CI llvm instead of building one locally
download-ci-llvm = true

[rust]
# Make debugging of the compiler possible
debug = true
backtrace = true

# Incremental generally makes compilation faster (?)
incremental = true

# I don't know how people work on rustc without this.
# While developing it is inevitable that there will be warnings,
# but I still want to be able to test my changes!!
deny-warnings = false

[build]
# Silly funny thing :3
description = "wffl cmplr-r"

low-priority = true

# A couple[^1] tests use absolute paths to the build dirtectory.
# To make them runnable, the test strips the absolute path to build dir
# ... Which does not work if it's a symlink.
#
# I need the build directory to be a symlink, because I want it to be
# outside the default partition, as to not be backupped.
#
# As a compromise I keep `./build` as a symlink, but set `build-dir`
# to its *target*. That way I can find things in the build dir
# with `./build/...`, the tests work, and build dir is not backupped.
#
# [^1]: 4 in total, `tests/ui/crate-loading/crateresolve{1,2}.rs`
#       and `tests/ui/error-codes/E{0464,0523}.rs`.
# build-dir = "/chonky/wffl/build/rust-r"

Operating system

irrelevant

HEAD

c9537a9

Additional context

I found this at work (tasking.com). We had a failing test that I for the longest time couldn't fix or understand why it fails. In the end turned out that one of the lines had a different line number on it. And by-line comparison was implied by a runner existing (we are targeting an embedded platform emulator for some tests):

// Wrapper tools set by `runner` might provide extra output on failure,
// for example a WebAssembly runtime might print the stack trace of an
// `unreachable` instruction by default.
//
// Also, some tests like `ui/parallel-rustc` have non-deterministic
// orders of output, so we need to compare by lines.
let compare_output_by_lines =
self.props.compare_output_by_lines || self.config.runner.is_some();

I think this option is quite footgun-y -- it allows you to easily make tests which ignore parts of the output that you might care about. Especially with the way bless works.

I feel like the two use cases require different modes, neither of which is the current implementation:

  • for parallel rustc we should have the check that expected lines are both a subset and a superset of actual lines (i.e. that the sets of lines are equal) (i.e. order independent line comparison)
  • when runners it might be enough to skip inconsistencies at the start/end, i.e. actual.contains(full_expected_stdout)
    • I'm not aware of the full wasm context, but at least in my case the runner only adds to the end of execution
    • --bless should write the full output, unnecessary bits should be removed by hand (or via normalization rules)

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-compiletestArea: The compiletest test runnerA-compiletest-compare-modesArea: compiletest compare-modesC-bugCategory: This is a bug.T-bootstrapRelevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)needs-triageThis issue may need triage. Remove it if it has been sufficiently triaged.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions