Miscompilation with target-cpu=znver1 (AMD Ryzen 1000/2000 series) on Windows + LLVM 9. #63959
Here is the verbose build output:
The errors are the same with a clean build, but I used a subsequent attempt to cut down on the log size.
A simplified test case is simply adding the following:

```toml
# Cargo.toml
[package]
name = "cpu-bug"
version = "0.1.0"
authors = ["novacrazy <novacrazy@gmail.com>"]
edition = "2018"

[dependencies]
cargo_metadata = "0.8.2"

[profile.release] # My release profile
opt-level = 3
lto = 'fat'
incremental = false
debug-assertions = false
codegen-units = 1
```

```rust
// src/main.rs
extern crate cargo_metadata;

fn main() {
    println!("Hello, world!");
}
```

```powershell
$env:RUSTFLAGS = "-C target-cpu=znver1"
cargo run --release
```
I can also trigger this with the following dev profile:

```toml
[profile.dev]
opt-level = 3
codegen-units = 1
```
Tagging as P-high for now. Not sure who to assign to.
(Should this be labeled as I-unsound?)
Is it possible to get a backtrace for the segfault? I don't have a Windows system (or a Zen system, for that matter) to reproduce this on. I don't know whether Windows has assertion-enabled builds, but if it does, it might be worth using https://github.com/kennytm/rustup-toolchain-install-master with the
Probably not. The crash is in the compiler process, not in code it output, and presumably in C++ code at that, so there's no reason to expect anything is going wrong with any safe Rust code. While it's the scary sort of crash that sounds like it could hypothetically also result in completely bogus machine code being generated, there's no evidence of this actually happening or being possible. I mean, I guess we could decide to tag stuff I-unsound merely because "something's gone really wrong in the C++ and that could have arbitrarily bad consequences", but if we do that we should also blanket-tag all LLVM assertion failures as I-unsound; we don't currently do that, nor do I think it would be useful.
Could not reproduce with #63959 (comment) or #63959 (comment) on a Zen 2000-based system using Linux GNU and by cross-compiling to Windows GNU; I'll check the native Windows GNU toolchain later.
Happens with both MSVC and GNU builds on Windows for me. How would I go about enabling backtraces on a Windows build of rustc?

EDIT: Would rustc even produce a backtrace on a segfault? I know I've seen proper backtraces with ICEs, but this is different.
It's not
On Windows
Make sure you're running the real
Hmm, I could swear I could debug
I'll return with a debug Rust build if I somehow manage to build LLVM in debug mode on a PC with 16 GiB of RAM...
What led you to the conclusion of a corrupt stack? It looks fairly reasonable to me.
I think the wrapper switched to I don't think anything like this is possible on Windows (without manually loading the executable you want to run in your address space, of course).
You don't need to; this is not in LLVM, it's in
The process gave me the exit code for stack corruption before; I took a quick look at the trace and it didn't make any sense to me.
Yeah, it hit me later. Running In case you find it useful, here is the trace from a debug build and the disassembly: https://gist.github.com/mati865/e93d3bf12408df00ecf47327fa196af7 Assembly for
@mati865 Since AFAICT the bug happens during the macro expansion in Not even names need to be resolvable, other than invoking So, for example, you can remove all dependencies of
EDIT: wait, no, it must be code compiled with Could you try to run
I've been struggling for the past 2 days to build Rust because of #61561, so I'm unable to progress on this issue.
Any luck with this? I'm still stuck on 07e0c36 because of it.
I think I'll stop editing my previous comment (#63959 (comment)). In the final version I've numbered the Initially I attributed the sensitivity to scoping to MIR drop order, but with So this could be a stack layout overlap issue? Maybe there's something different about stack frames on Windows which can cause this? Not sure where to go from here. I likely won't spend more time on this myself, but if someone wants me to run some testcases or dump some data using the Ryzen laptop I have access to, I can help with that.
@rustbot ping icebreakers-llvm
Hey LLVM ICE-breakers! This bug has been identified as a good cc @hdhoang @heyrutvik @jryans @mmilenko @nagisa @nikic @rkruppe @SiavoshZarrasvand @spastorino @vertexclique @vgxbj
I'm happy to take this one on, although I lack the target CPU myself. Can I still make progress using cross-compilation, or is the real hardware a requirement?
@SiavoshZarrasvand I'd suggest trying my reduced testcase with If you can get the assertion failure and not a SIGILL or some other crash, then I assume that's enough to work with that testcase and dive into the LLVM internals responsible for it, etc.
If that doesn't work, I have a server with the needed hardware, and I could probably spin up a VM if that helps.
@ethanhs That would definitely do the trick. Pretty sure I should be able to get it to repeat on my hardware, though. Let me try during the weekend and confirm on Monday.
I built the initial example on an ASUS ROG with Ubuntu 18.04. The commands I used to build and run under Wine were:

What should I change to force the error? Somehow it compiles and runs fine for me.
@SiavoshZarrasvand Those testcases aren't useful for reproduction (especially outside of Windows on AMD Ryzen CPUs), as they crash You should only use #63959 (comment), which relies on an assertion failure instead of a crash (so if you get a crash, that likely means your CPU doesn't support certain
That worked. I needed to switch to nightly and used the following to compile Running it in Wine produces the following error (which I believe is what would be expected)
I dropped the ball while initially looking at the ASM diff, but then @Speedy37 pointed out that Both use
@nagisa confirmed that Also, this finally explains the relevancy of the size!
I've just updated #63959 (comment) (I know, I said I wouldn't) with a small change to make this reproduce on a Linux Intel IvyBridge i7 laptop (i.e. it has AVX). The only reason Windows was relevant was the fact that it has callee-saved XMM registers
@eddyb It is good that you posted here, as it reminds me to update my code on my next debugging session with this issue.
With this setup I was able to get:

```llvm
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define win64cc void @opaque() #0 {
  ret void
}

define i32 @main() #0 {
start:
  %dummy0 = alloca [22 x i64], align 8
  %dummy1 = alloca [22 x i64], align 8
  %dummy2 = alloca [22 x i64], align 8
  %data = alloca <2 x i64>, align 8
  br label %fake-loop

fake-loop:                                        ; preds = %fake-loop, %start
  %dummy0.cast = bitcast [22 x i64]* %dummy0 to i8*
  %dummy1.cast = bitcast [22 x i64]* %dummy1 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy1.cast, i8* nonnull align 8 %dummy0.cast, i64 176, i1 false)
  %dummy1.cast.copy = bitcast [22 x i64]* %dummy1 to i8*
  %dummy2.cast = bitcast [22 x i64]* %dummy2 to i8*
  call void @llvm.lifetime.start.p0i8(i64 176, i8* nonnull %dummy2.cast)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy2.cast, i8* nonnull align 8 %dummy1.cast.copy, i64 176, i1 false)
  call win64cc void @opaque()
  store <2 x i64> <i64 1010101010101010101, i64 2020202020202020202>, <2 x i64>* %data, align 8
  %opaque-false = icmp eq i8 0, 1
  br i1 %opaque-false, label %fake-loop, label %exit

exit:                                             ; preds = %fake-loop
  %data.cast = bitcast <2 x i64>* %data to i64*
  %0 = load i64, i64* %data.cast, align 8
  %1 = icmp eq i64 %0, 1010101010101010101
  %2 = select i1 %1, i32 0, i32 -1
  ret i32 %2
}

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1 immarg) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

attributes #0 = { "target-cpu"="znver1" }
attributes #1 = { argmemonly nounwind }
```

Assembly output:

```asm
        .text
        .file   "bad.ll"
        .globl  opaque                  # -- Begin function opaque
        .p2align        4, 0x90
        .type   opaque,@function
opaque:                                 # @opaque
        .cfi_startproc
# %bb.0:
        retq
.Lfunc_end0:
        .size   opaque, .Lfunc_end0-opaque
        .cfi_endproc
                                        # -- End function
        .section        .rodata.cst16,"aM",@progbits,16
        .p2align        4               # -- Begin function main
.LCPI1_0:
        .quad   1010101010101010101     # 0xe04998456557eb5
        .quad   2020202020202020202     # 0x1c093308acaafd6a
        .text
        .globl  main
        .p2align        4, 0x90
        .type   main,@function
main:                                   # @main
        .cfi_startproc
# %bb.0:                                # %start
        subq    $584, %rsp              # imm = 0x248
        .cfi_def_cfa_offset 592
        vmovaps .LCPI1_0(%rip), %xmm6   # xmm6 = [1010101010101010101,2020202020202020202]
        xorl    %esi, %esi
        .p2align        4, 0x90
.LBB1_1:                                # %fake-loop
                                        # =>This Inner Loop Header: Depth=1
        vmovups 552(%rsp), %ymm0
        vmovups 536(%rsp), %ymm1
        vmovups 408(%rsp), %ymm6
        vmovups 472(%rsp), %ymm2
        vmovups 504(%rsp), %ymm3
        vmovups %ymm0, 192(%rsp)
        vmovups %ymm1, 176(%rsp)
        vmovups 440(%rsp), %ymm1
        vmovups %ymm3, 144(%rsp)
        vmovups %ymm2, 112(%rsp)
        vmovups %ymm6, 48(%rsp)
        vmovups %ymm3, 320(%rsp)
        vmovups %ymm2, 288(%rsp)
        vmovups %ymm6, 224(%rsp)
        vmovups %ymm1, 80(%rsp)
        vmovups %ymm1, 256(%rsp)
        vmovups 192(%rsp), %ymm5
        vmovups 176(%rsp), %ymm4
        vmovups %ymm5, 368(%rsp)
        vmovups %ymm4, 352(%rsp)
        vzeroupper
        callq   opaque
        vmovaps %xmm6, 32(%rsp)
        testb   %sil, %sil
        jne     .LBB1_1
# %bb.2:                                # %exit
        movabsq $1010101010101010101, %rcx # imm = 0xE04998456557EB5
        xorl    %eax, %eax
        cmpq    %rcx, 32(%rsp)
        sete    %al
        decl    %eax
        addq    $584, %rsp              # imm = 0x248
        .cfi_def_cfa_offset 8
        retq
.Lfunc_end1:
        .size   main, .Lfunc_end1-main
        .cfi_endproc
                                        # -- End function
        .section        ".note.GNU-stack","",@progbits
```
EDIT: reported as https://bugs.llvm.org/show_bug.cgi?id=44140
Did you ever learn why this only popped up on AMD Zen targets? The linked LLVM patch doesn't seem to touch target-specific code (that I can see). I'm mostly asking out of curiosity.
@novacrazy IIRC someone (@mati865?) speculated a while back that the cost tables for AIUI, the bug relies on If LLVM uses only 128-bit (
Yeah, we talked about it on Discord.
https://reviews.llvm.org/D70699 has a single test, and it uses
https://reviews.llvm.org/D70699 has landed in llvm-project: llvm/llvm-project@9283681, but not yet in rust-lang's fork of llvm-project.
🎉 Thank you all for your hard work on fixing this!
The fix will be available in the next nightly.
On any recent MSVC nightly, compiling with the `release` profile with `RUSTFLAGS = "-C target-cpu=native"` results in either `STATUS_ACCESS_VIOLATION` or `STATUS_HEAP_CORRUPTION`, depending on the crate. Many crates work, but others don't. Among those that fail, some use SIMD. `target-cpu=native` resolves to `target-cpu=znver1` on my machine.

This seems to be related to #63361 and the LLVM upgrade, again. It does not happen when `target-cpu` is not set.

Everything works on 07e0c36 but fails after 38798c6, same as the aforementioned issue. I am not sure how to reproduce it in a single crate, but I will look into it.

LLVM 9 just doesn't like AMD.

Although, another issue of mine, bytecodealliance/cranelift#900, also fails in a similar manner before the LLVM upgrade, so it's worth noting.