Description
I have a program that crashes when it causes a panic in core.
Scope and reproducer
- This is reproducible starting with nightly-2024-07-30. (yes, a year and a half ago!)
- Most of this report is based on tests with the x86_64-unknown-linux-gnu target, but we did get a quick positive result on an Arm Mac. (which I think would be aarch64-apple-darwin?)
- We've only figured out how to make it happen with build-std using publicly available tools, but it also reproduces with Google's internal non-Cargo bootstrapping process.
To reproduce it, clone https://github.com/mvanbem-goog/rust-sigill-repro.git and run cargo run -Z build-std=std.
The program is really tiny. The linked repo also provides a Cargo.toml without any deps and a rust-toolchain.toml pinned to a nightly that exhibits the behavior.
fn main() {
    println!("Panicking locally");
    std::panic::catch_unwind(|| {
        panic!("panic locally");
    })
    .unwrap_err();
    println!("Panicking in core");
    std::panic::catch_unwind(|| {
        let mut buf = [0u8; 1];
        buf.copy_from_slice(&[1, 2]);
    })
    .unwrap_err();
}
You can also phrase this as a test with #[should_panic], which is the form we first detected.
#[test]
#[should_panic]
fn never_crashes() {
    panic!("panic locally");
}

#[test]
#[should_panic]
fn might_crash() {
    let mut buf = [0u8; 1];
    buf.copy_from_slice(&[1, 2]);
}
Expected behavior
The program should run and exit successfully. It should panic twice and catch both of the panics.
Panicking locally
thread 'main' (277763) panicked at src/main.rs:4:9:
panic locally
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Panicking in core
thread 'main' (277763) panicked at src/main.rs:11:13:
copy_from_slice: source slice length (2) does not match destination slice length (1)
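The expected behavior can also be checked programmatically instead of by reading stderr. Below is a minimal sketch, assuming a hypothetical helper `panic_message` (not part of the linked repro) that runs a closure under `catch_unwind` and extracts the payload text. On an affected build the "core" case would presumably still die with SIGILL before returning anything, so this only tightens the check on builds where the panic really happens.

```rust
use std::any::Any;
use std::panic::{self, UnwindSafe};

/// Hypothetical helper: runs `f` and returns the panic message if it panicked.
fn panic_message<F: FnOnce() + UnwindSafe>(f: F) -> Option<String> {
    let payload: Box<dyn Any + Send> = panic::catch_unwind(f).err()?;
    // `panic!` with a plain literal carries a `&'static str`;
    // formatted panics (like the one in core) carry a `String`.
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        Some((*s).to_string())
    } else if let Some(s) = payload.downcast_ref::<String>() {
        Some(s.clone())
    } else {
        Some(String::from("<non-string payload>"))
    }
}

/// The "panic locally" case from the reproducer.
fn local_panic_message() -> Option<String> {
    panic_message(|| panic!("panic locally"))
}

/// The "panic in core" case from the reproducer.
fn core_panic_message() -> Option<String> {
    panic_message(|| {
        let mut buf = [0u8; 1];
        buf.copy_from_slice(&[1, 2]);
    })
}
```

Calling both functions from `main` and asserting that each returns `Some(..)` gives the same pass/fail signal as the `unwrap_err()` calls, plus the message text.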
Actual behavior
The first panic is caught as expected, but the second panic site doesn't actually perform a panic. Instead, it executes an illegal instruction, raising SIGILL and ending the process.
Panicking locally
thread 'main' (149293) panicked at src/main.rs:4:5:
panic locally
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Panicking in core
Illegal instruction (core dumped) cargo run -Z build-std=std
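When scripting this, it helps to tell a SIGILL apart from an ordinary nonzero exit. The following is a rough sketch for Unix targets only; the `Outcome` enum and `classify` function are hypothetical names introduced here, and the signal number 4 for SIGILL is an assumption that holds on Linux and macOS.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

/// SIGILL's number on Linux and macOS (assumed; check your platform's signal.h).
const SIGILL: i32 = 4;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Ok,
    IllegalInstruction,
    OtherSignal(i32),
    ExitCode(i32),
}

/// Classifies how a child process ended.
fn classify(status: ExitStatus) -> Outcome {
    if status.success() {
        return Outcome::Ok;
    }
    match status.signal() {
        Some(SIGILL) => Outcome::IllegalInstruction,
        Some(sig) => Outcome::OtherSignal(sig),
        // No signal: the process exited on its own with a nonzero code.
        None => Outcome::ExitCode(status.code().unwrap_or(-1)),
    }
}
```

The status would come from something like `std::process::Command::new(path_to_built_binary).status()`, where the binary path depends on your target directory layout.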
This goes way back
The case presented so far reproduces on a seemingly random subset of nightlies going back a year and a half. Try editing rust-toolchain.toml (or just run cargo +nightly-${specific_date} if you already installed the associated rust-src component). If you go back far enough, you will have to drop the crate to edition 2021 and also specify a --target to make build-std work.
A few examples:
- nightly-2025-12-19: crashes
- nightly-2025-12-18: OK
- nightly-2025-12-17: crashes
- nightly-2025-12-16: crashes
- nightly-2025-12-15: OK
- nightly-2025-12-14: OK
- nightly-2025-12-13: crashes
Skipping into the distant past:
- nightly-2024-07-31: crashes
- nightly-2024-07-30: OK
This seems to be tied to StableCrateId and/or name mangling
With one more step, this can be made to reproduce on every nightly build we've tried from nightly-2024-07-30 onward, but on none before it.
The instructions are the same, but set the -C metadata rustc flag through the RUSTFLAGS environment variable to any unique string. Each different string seems to take a fresh sample for whether the built binary will succeed or crash. The behavior is chaotic, but deterministic. It's totally consistent for any (nightly compiler, metadata) pair, but metadata strings have different effects with different compilers. Any nightly compiler in range can be made to pass or fail this test with the right metadata string.
The one use (at least as far as I could tell) of the -C metadata flag is as an extra entropy source for assigning StableCrateIds, which end up appearing as a disambiguator in v0 mangled symbols. We had initially been using the RUSTC_FORCE_RUSTC_VERSION environment variable to exercise this (which also feeds into StableCrateIds), but -C metadata seems to be a much smaller hammer. Note that if you are playing with RUSTC_FORCE_RUSTC_VERSION, cargo clean seems to be required to actually trigger a build-std rebuild.
Here's my shell one-liner to try ten different metadata strings:
$ rm -f results; for metadata in foo_{1..10}; do RUSTFLAGS=-Cmetadata=$metadata cargo run -Z build-std=std --target=x86_64-unknown-linux-gnu && echo OK >>results || echo failed >>results; done; cat results
With nightly-2025-12-19 (which crashes otherwise): failed, OK, OK, OK, OK, failed, OK, failed, OK, OK
With nightly-2025-12-18 (which succeeds otherwise): failed, OK, failed, failed, OK, OK, OK, failed, OK, failed
And this still happens all the way back at nightly-2024-07-30: failed, failed, OK, OK, failed, failed, failed, OK, OK, failed
But it doesn't happen at nightly-2024-07-29: OK, OK, OK, OK, OK, OK, OK, OK, OK, OK
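For anyone who would rather drive the sweep from Rust than from the shell, the one-liner above can be sketched roughly as follows. The function names are hypothetical, and the sketch assumes a rustup-managed `cargo` on PATH (so that the `+toolchain` argument works) and that it is run from the repro's directory.

```rust
use std::process::Command;

/// Generates `foo_1` ..= `foo_n`, mirroring the shell brace expansion `foo_{1..n}`.
fn metadata_strings(prefix: &str, n: usize) -> Vec<String> {
    (1..=n).map(|i| format!("{}_{}", prefix, i)).collect()
}

/// Builds (but does not run) the `cargo run` invocation for one metadata string.
fn sweep_command(toolchain: &str, metadata: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg(format!("+{}", toolchain))
        .args(["run", "-Z", "build-std=std", "--target=x86_64-unknown-linux-gnu"])
        .env("RUSTFLAGS", format!("-Cmetadata={}", metadata));
    cmd
}

/// Runs the sweep and records "OK" or "failed" per metadata string,
/// like the `results` file in the shell version.
fn run_sweep(toolchain: &str, prefix: &str, n: usize) -> Vec<(String, &'static str)> {
    metadata_strings(prefix, n)
        .into_iter()
        .map(|m| {
            let ok = sweep_command(toolchain, &m)
                .status()
                .map(|s| s.success())
                .unwrap_or(false);
            (m, if ok { "OK" } else { "failed" })
        })
        .collect()
}
```

For example, `run_sweep("nightly-2025-12-19", "foo", 10)` would correspond to the first result row above, assuming that toolchain and its rust-src component are installed.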
A little bit of debugging
Running a bad binary under gdb:
Program received signal SIGILL, Illegal instruction.
core::slice::copy_from_slice_impl<u8> (dest=..., src=...)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/mod.rs:5128
5128 len_mismatch_fail(dest.len(), src.len());
(gdb) bt
#0 core::slice::copy_from_slice_impl<u8> (dest=..., src=...)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/mod.rs:5128
#1 0x0000555555750e2d in core::slice::{impl#0}::copy_from_slice<u8> (self=..., src=...)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/mod.rs:3897
#2 0x00005555555f2b59 in rust_sigill_repro::main::{closure#1} () at src/main.rs:10
#3 0x00005555555f2d0a in std::panicking::catch_unwind::do_call<rust_sigill_repro::main::{closure_env#1}, ()> (data=0x7fffffffd270)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:581
#4 0x00005555555f2edb in __rust_try ()
#5 0x00005555555f2bf6 in std::panicking::catch_unwind<(), rust_sigill_repro::main::{closure_env#1}> (f=...)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:544
#6 0x00005555555f3166 in std::panic::catch_unwind<rust_sigill_repro::main::{closure_env#1}, ()> (f=...)
at /usr/local/google/home/mvanbem/.rustup/toolchains/nightly-2025-12-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:359
#7 0x00005555555f30ac in rust_sigill_repro::main () at src/main.rs:8
I would expect execution to reach that point, but the call to len_mismatch_fail is supposed to panic there, not raise SIGILL.
Disassembly snippet from objdump -d on a bad binary piped through rustfilt:
00000000001fcd40 <core::slice::copy_from_slice_impl::<u8>>:
1fcd40: 48 83 ec 78 sub $0x78,%rsp
1fcd44: 48 89 3c 24 mov %rdi,(%rsp)
1fcd48: 48 89 74 24 08 mov %rsi,0x8(%rsp)
1fcd4d: 48 89 54 24 10 mov %rdx,0x10(%rsp)
1fcd52: 48 89 4c 24 18 mov %rcx,0x18(%rsp)
1fcd57: 48 89 7c 24 20 mov %rdi,0x20(%rsp)
1fcd5c: 48 89 74 24 28 mov %rsi,0x28(%rsp)
1fcd61: 48 89 54 24 30 mov %rdx,0x30(%rsp)
1fcd66: 48 89 4c 24 38 mov %rcx,0x38(%rsp)
1fcd6b: 48 39 ce cmp %rcx,%rsi
1fcd6e: 75 7b jne 1fcdeb <core::slice::copy_from_slice_impl::<u8>+0xab>
1fcd70: 48 8b 44 24 08 mov 0x8(%rsp),%rax
1fcd75: 48 8b 0c 24 mov (%rsp),%rcx
1fcd79: 48 8b 54 24 10 mov 0x10(%rsp),%rdx
1fcd7e: 48 8b 74 24 18 mov 0x18(%rsp),%rsi
1fcd83: 48 89 54 24 68 mov %rdx,0x68(%rsp)
1fcd88: 48 89 74 24 70 mov %rsi,0x70(%rsp)
1fcd8d: 48 89 4c 24 58 mov %rcx,0x58(%rsp)
1fcd92: 48 89 44 24 60 mov %rax,0x60(%rsp)
1fcd97: 48 89 54 24 40 mov %rdx,0x40(%rsp)
1fcd9c: 48 89 4c 24 48 mov %rcx,0x48(%rsp)
1fcda1: 48 89 44 24 50 mov %rax,0x50(%rsp)
1fcda6: e8 a5 ef ff ff call 1fbd50 <core::ub_checks::check_language_ub>
1fcdab: a8 01 test $0x1,%al
1fcdad: 75 02 jne 1fcdb1 <core::slice::copy_from_slice_impl::<u8>+0x71>
1fcdaf: eb 22 jmp 1fcdd3 <core::slice::copy_from_slice_impl::<u8>+0x93>
1fcdb1: 4c 8b 44 24 08 mov 0x8(%rsp),%r8
1fcdb6: 48 8b 34 24 mov (%rsp),%rsi
1fcdba: 48 8b 7c 24 10 mov 0x10(%rsp),%rdi
1fcdbf: b9 01 00 00 00 mov $0x1,%ecx
1fcdc4: 4c 8d 0d 15 2e 01 00 lea 0x12e15(%rip),%r9 # 20fbe0 <core::fmt::num::DECIMAL_PAIRS+0x1da0>
1fcdcb: 48 89 ca mov %rcx,%rdx
1fcdce: e8 dd f6 ff ff call 1fc4b0 <core::ptr::copy_nonoverlapping::precondition_check>
1fcdd3: 48 8b 54 24 08 mov 0x8(%rsp),%rdx
1fcdd8: 48 8b 74 24 10 mov 0x10(%rsp),%rsi
1fcddd: 48 8b 3c 24 mov (%rsp),%rdi
1fcde1: e8 aa 00 00 00 call 1fce90 <memcpy@plt>
1fcde6: 48 83 c4 78 add $0x78,%rsp
1fcdea: c3 ret
1fcdeb: 0f 0b ud2
1fcded: cc int3
1fcdee: cc int3
1fcdef: cc int3
The branch at 0x1fcd6e looks like the length check that we expect to be tripping (the one guarding len_mismatch_fail). Its target is 0x1fcdeb, the ud2 instruction. It looks like we've managed to miscompile libcore, or at least this specific generic instantiation of this function.
Here's the same thing from a good binary for comparison:
00000000001e9890 <core::slice::copy_from_slice_impl::<u8>>:
1e9890: 48 81 ec 88 00 00 00 sub $0x88,%rsp
1e9897: 48 89 7c 24 08 mov %rdi,0x8(%rsp)
1e989c: 48 89 74 24 10 mov %rsi,0x10(%rsp)
1e98a1: 48 89 54 24 18 mov %rdx,0x18(%rsp)
1e98a6: 48 89 4c 24 20 mov %rcx,0x20(%rsp)
1e98ab: 4c 89 44 24 28 mov %r8,0x28(%rsp)
1e98b0: 48 89 7c 24 30 mov %rdi,0x30(%rsp)
1e98b5: 48 89 74 24 38 mov %rsi,0x38(%rsp)
1e98ba: 48 89 54 24 40 mov %rdx,0x40(%rsp)
1e98bf: 48 89 4c 24 48 mov %rcx,0x48(%rsp)
1e98c4: 48 39 ce cmp %rcx,%rsi
1e98c7: 0f 85 84 00 00 00 jne 1e9951 <core::slice::copy_from_slice_impl::<u8>+0xc1>
1e98cd: 48 8b 44 24 10 mov 0x10(%rsp),%rax
1e98d2: 48 8b 4c 24 08 mov 0x8(%rsp),%rcx
1e98d7: 48 8b 54 24 18 mov 0x18(%rsp),%rdx
1e98dc: 48 8b 74 24 20 mov 0x20(%rsp),%rsi
1e98e1: 48 89 54 24 78 mov %rdx,0x78(%rsp)
1e98e6: 48 89 b4 24 80 00 00 mov %rsi,0x80(%rsp)
1e98ed: 00
1e98ee: 48 89 4c 24 68 mov %rcx,0x68(%rsp)
1e98f3: 48 89 44 24 70 mov %rax,0x70(%rsp)
1e98f8: 48 89 54 24 50 mov %rdx,0x50(%rsp)
1e98fd: 48 89 4c 24 58 mov %rcx,0x58(%rsp)
1e9902: 48 89 44 24 60 mov %rax,0x60(%rsp)
1e9907: e8 d4 7b ff ff call 1e14e0 <core::ub_checks::check_language_ub>
1e990c: a8 01 test $0x1,%al
1e990e: 75 02 jne 1e9912 <core::slice::copy_from_slice_impl::<u8>+0x82>
1e9910: eb 23 jmp 1e9935 <core::slice::copy_from_slice_impl::<u8>+0xa5>
1e9912: 4c 8b 44 24 10 mov 0x10(%rsp),%r8
1e9917: 48 8b 74 24 08 mov 0x8(%rsp),%rsi
1e991c: 48 8b 7c 24 18 mov 0x18(%rsp),%rdi
1e9921: b9 01 00 00 00 mov $0x1,%ecx
1e9926: 4c 8d 0d 43 37 02 00 lea 0x23743(%rip),%r9 # 20d070 <core::fmt::num::DECIMAL_PAIRS+0xc90>
1e992d: 48 89 ca mov %rcx,%rdx
1e9930: e8 4b 8e ff ff call 1e2780 <core::ptr::copy_nonoverlapping::precondition_check>
1e9935: 48 8b 54 24 10 mov 0x10(%rsp),%rdx
1e993a: 48 8b 74 24 18 mov 0x18(%rsp),%rsi
1e993f: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
1e9944: e8 47 1b 01 00 call 1fb490 <memcpy@plt>
1e9949: 48 81 c4 88 00 00 00 add $0x88,%rsp
1e9950: c3 ret
1e9951: 48 8b 54 24 28 mov 0x28(%rsp),%rdx
1e9956: 48 8b 74 24 20 mov 0x20(%rsp),%rsi
1e995b: 48 8b 7c 24 10 mov 0x10(%rsp),%rdi
1e9960: ff 15 ca 0c 03 00 call *0x30cca(%rip) # 21a630 <_DYNAMIC+0xc4b0>
1e9966: cc int3
1e9967: cc int3
1e9968: cc int3
1e9969: cc int3
1e996a: cc int3
1e996b: cc int3
1e996c: cc int3
1e996d: cc int3
1e996e: cc int3
1e996f: cc int3
Same branch, but here the target loads arguments and makes a call instead of hitting ud2.
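The branch targets in both listings can be double-checked by hand from the encoded bytes. Here is a small sketch of that arithmetic; the helper names are hypothetical, and the decoder only handles the two `jne` encodings that appear above (`75 rel8` and `0f 85 rel32`), not x86-64 in general.

```rust
/// Decodes the target of an x86-64 `jne` located at `addr`, given its bytes.
/// Target = address of the next instruction + sign-extended displacement.
fn jne_target(addr: u64, bytes: &[u8]) -> Option<u64> {
    match bytes {
        // Short form: opcode 0x75, then a signed 8-bit displacement (2 bytes total).
        [0x75, rel] => Some((addr + 2).wrapping_add(*rel as i8 as i64 as u64)),
        // Near form: opcode 0x0f 0x85, then a little-endian signed 32-bit
        // displacement (6 bytes total).
        [0x0f, 0x85, a, b, c, d] => {
            let rel = i32::from_le_bytes([*a, *b, *c, *d]);
            Some((addr + 6).wrapping_add(rel as i64 as u64))
        }
        _ => None,
    }
}

/// True if `bytes` begins with `ud2` (0f 0b), which is what sits at the
/// branch target in the bad binary.
fn starts_with_ud2(bytes: &[u8]) -> bool {
    bytes.starts_with(&[0x0f, 0x0b])
}
```

With the bytes from the listings, `jne_target(0x1fcd6e, &[0x75, 0x7b])` gives 0x1fcdeb (the `ud2`) in the bad binary, and `jne_target(0x1e98c7, &[0x0f, 0x85, 0x84, 0x00, 0x00, 0x00])` gives 0x1e9951 (the argument setup before the call) in the good one.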