Miscompilation with target-cpu=znver1 (AMD Ryzen 1000/2000 series) on Windows + LLVM 9. #63959

Closed
novacrazy opened this issue Aug 27, 2019 · 74 comments · Fixed by #66882
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-bug Category: This is a bug.
I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics.
ICEBreaker-LLVM Bugs identified for the LLVM ICE-breaker group
P-high High priority
regression-from-stable-to-stable Performance or correctness regression from one stable version to another.
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@novacrazy

novacrazy commented Aug 27, 2019

On any recent MSVC nightly, compiling with release profile with RUSTFLAGS = "-C target-cpu=native" results in either STATUS_ACCESS_VIOLATION or STATUS_HEAP_CORRUPTION depending on the crate. Many crates work, but others don't. Among those that fail some use SIMD. target-cpu=native resolves to target-cpu=znver1 on my machine.
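A minimal sketch, not part of the original report, for checking which SIMD features a given build was actually compiled with. It relies only on the standard `cfg!(target_feature = "...")` check; the function name and the short feature list are illustrative assumptions, and the result depends on the `-C target-cpu` / `-C target-feature` flags in effect.

```rust
/// Return the subset of a few x86 SIMD features that were enabled at
/// compile time (for example through `-C target-cpu=native`).
/// The list of features checked here is an arbitrary illustration.
fn compile_time_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "avx") {
        enabled.push("avx");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    if cfg!(target_feature = "fma") {
        enabled.push("fma");
    }
    enabled
}

fn main() {
    // The output differs between a default build and one built with
    // RUSTFLAGS="-C target-cpu=native".
    println!("compile-time features: {:?}", compile_time_features());
}
```

Building this once with and once without the `RUSTFLAGS` setting shows whether the flag changes the enabled feature set on a given machine.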

This seems to be related to #63361 and the LLVM upgrade, again. It does not happen when target-cpu is not set.

Everything works on 07e0c36 but fails after 38798c6, same as the aforementioned issue. I am not sure how to reproduce it in a single crate, but I will look into it.

LLVM 9 just doesn't like AMD.

Worth noting, though: another issue of mine, bytecodealliance/cranelift#900, also fails in a similar manner before the LLVM upgrade.

@novacrazy
Author

Here is the verbose build output:

PS F:\code\projects\active\raygon\private\raygon-test> cargo +nightly-msvc build --verbose --release
       Fresh unicode-xid v0.2.0
       Fresh semver-parser v0.7.0
       Fresh cc v1.0.40
       Fresh autocfg v0.1.6
       Fresh lazy_static v1.3.0
       Fresh nodrop v0.1.13
       Fresh unicode-xid v0.1.0
       Fresh cfg-if v0.1.9
       Fresh scopeguard v1.0.0
       Fresh version_check v0.1.5
       Fresh rustc-demangle v0.1.16
       Fresh ppv-lite86 v0.2.5
       Fresh ieee754 v0.2.6
       Fresh itoa v0.4.4
       Fresh either v1.5.2
       Fresh adler32 v1.0.3
       Fresh rand_core v0.4.2
       Fresh copyless v0.1.4
       Fresh bytecount v0.4.0
       Fresh color_quant v1.0.1
       Fresh lzw v0.10.0
       Fresh quote v0.3.15
       Fresh glob v0.2.11
       Fresh scoped_threadpool v0.1.9
       Fresh inflections v1.1.1
       Fresh take_mut v0.2.2
       Fresh rle-decode-fast v1.0.1
       Fresh arc-swap v0.3.11
       Fresh bytesize v1.0.0
       Fresh linked-hash-map v0.5.2
       Fresh regex-syntax v0.6.11
       Fresh float-ord v0.2.0
       Fresh crossbeam v0.2.12
       Fresh tobj v0.1.10
       Fresh crossbeam-utils v0.6.6
       Fresh c2-chacha v0.2.2
       Fresh fast-math v0.1.1
       Fresh inflate v0.3.4
       Fresh inflate v0.4.5
       Fresh lock_api v0.3.1
       Fresh thread_local v0.3.6
       Fresh proc-macro2 v1.0.1
       Fresh libc v0.2.62
       Fresh arrayvec v0.4.11
       Fresh proc-macro2 v0.4.30
       Fresh winapi v0.3.7
       Fresh getrandom v0.1.11
       Fresh rand_core v0.3.1
       Fresh peg v0.5.7
       Fresh gif v0.9.2
       Fresh gif v0.10.2
       Fresh quote v1.0.2
       Fresh quote v0.6.13
       Fresh backtrace-sys v0.1.31
       Fresh ryu v1.0.0
       Fresh winapi-util v0.1.2
       Fresh num_cpus v1.10.1
       Fresh crossbeam-queue v0.1.2
       Fresh rand_core v0.5.0
       Fresh byteorder v1.3.2
       Fresh bitflags v1.1.0
       Fresh rand v0.4.6
       Fresh packed_simd v0.3.3
       Fresh remove_dir_all v0.5.2
       Fresh typenum v1.10.0
       Fresh rand_os v0.1.3
       Fresh rand_jitter v0.1.4
       Fresh crc32fast v1.2.0
       Fresh time v0.1.42
       Fresh crossbeam-channel v0.3.9
       Fresh dirs v1.0.5
       Fresh clocksource v0.5.0
       Fresh atty v0.2.13
       Fresh num-format v0.4.0
       Fresh syn v1.0.4
       Fresh num-traits v0.2.8
       Fresh backtrace v0.3.35
       Fresh syn v0.15.44
       Fresh same-file v1.0.5
       Fresh rand_chacha v0.2.1
       Fresh pulldown-cmark v0.2.0
       Fresh tempdir v0.3.7
       Fresh deflate v0.7.20
       Fresh rand_chacha v0.1.1
       Fresh rand_hc v0.1.0
       Fresh rand_xorshift v0.1.1
       Fresh rand_xoshiro v0.3.1
       Fresh rand_isaac v0.1.1
       Fresh rand_pcg v0.1.2
       Fresh memchr v2.2.1
       Fresh slog v2.5.2
       Fresh rand v0.3.23
       Fresh log v0.4.8
       Fresh base64 v0.10.1
       Fresh libflate v0.1.27
       Fresh serde_derive v1.0.99
       Fresh proc-macro-hack v0.5.9
       Fresh deepsize_derive v0.1.1 (F:\code\projects\active\raygon\private\deps\deepsize\deepsize_derive)
       Fresh error-chain v0.12.1
       Fresh num-integer v0.1.41
       Fresh num-traits v0.1.43
       Fresh raygon-core v0.1.0 (F:\code\projects\active\raygon\private\raygon-core)
       Fresh walkdir v2.2.9
       Fresh rand v0.7.0
       Fresh num-derive v0.2.5
       Fresh png v0.15.0
       Fresh rand v0.6.5
       Fresh approx v0.3.2
       Fresh gltf-derive v0.12.0
       Fresh lifecycle-derive v0.1.0 (F:\code\projects\active\raygon\private\deps\lifecycle-derive)
       Fresh term v0.5.2
       Fresh slog-scope v4.1.2 (F:\code\projects\active\raygon\private\deps\slog-scope)
       Fresh aho-corasick v0.7.6
       Fresh fbxcel v0.4.4
       Fresh slog-async v2.3.0
       Fresh log v0.3.9
       Fresh random_color v0.4.4
       Fresh serde v1.0.99
       Fresh paste-impl v0.1.6
       Fresh num-iter v0.1.39
       Fresh enum_primitive v0.1.1
       Fresh rand_distr v0.2.1
       Fresh num-rational v0.1.42
       Fresh tiff v0.3.1
       Fresh num-rational v0.2.2
       Fresh lifecycle v0.1.0 (F:\code\projects\active\raygon\private\deps\lifecycle)
       Fresh chrono v0.4.7
       Fresh cgmath v0.17.0
       Fresh regex v1.2.1
       Fresh slog-stdlog v3.0.5
       Fresh semver v0.9.0
       Fresh half v1.3.0
       Fresh paste v0.1.6
       Fresh serde_json v1.0.40
       Fresh smallvec v0.6.10
       Fresh generic-array v0.13.2
       Fresh png v0.11.0
       Fresh bitflags_serde_shim v0.2.1
       Fresh png v0.14.1
       Fresh slog-term v2.4.1
       Fresh rustc_version v0.2.3
       Fresh deepsize v0.1.2 (F:\code\projects\active\raygon\private\deps\deepsize)
   Compiling cargo_metadata v0.6.4
       Fresh expr v0.1.0 (F:\code\projects\active\expr)
       Fresh numeric-array v0.4.1
   Compiling gltf-json v0.12.0
       Fresh serde_shims v0.2.1
     Running `rustc --crate-name cargo_metadata C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\cargo_metadata-0.6.4\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"backtrace\"" --cfg "feature=\"default\"" -C metadata=ef5f1281a4361110 -C extra-filename=-ef5f1281a4361110 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern error_chain=F:\code\projects\active\raygon\private\target\release\deps\liberror_chain-f51a2bcf3c86d64a.rmeta --extern semver=F:\code\projects\active\raygon\private\target\release\deps\libsemver-c876c4f33d575b41.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native`
     Running `rustc --edition=2018 --crate-name gltf_json C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\gltf-json-0.12.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"extras\"" --cfg "feature=\"names\"" -C metadata=9ff3e20903dd6c8c -C extra-filename=-9ff3e20903dd6c8c --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern gltf_derive=F:\code\projects\active\raygon\private\target\release\deps\gltf_derive-4645197b1e4f4fc9.dll --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native`
   Compiling raygon-geometry v0.1.0 (F:\code\projects\active\raygon\private\raygon-geometry)
       Fresh memoffset v0.5.1
       Fresh parking_lot_core v0.6.2
     Running `rustc --edition=2018 --crate-name raygon_geometry raygon-geometry\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=39ac77e6fba3b64e -C extra-filename=-39ac77e6fba3b64e --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern copyless=F:\code\projects\active\raygon\private\target\release\deps\libcopyless-dd1647a18b2d8872.rmeta --extern deepsize=F:\code\projects\active\raygon\private\target\release\deps\libdeepsize-feae989679bc200b.rmeta --extern expr=F:\code\projects\active\raygon\private\target\release\deps\libexpr-803b5b1429934623.rmeta --extern fast_math=F:\code\projects\active\raygon\private\target\release\deps\libfast_math-63589e7c60336843.rmeta --extern half=F:\code\projects\active\raygon\private\target\release\deps\libhalf-6ff4dff974c834ba.rmeta --extern ieee754=F:\code\projects\active\raygon\private\target\release\deps\libieee754-ce756a49f0860dff.rmeta --extern num_traits=F:\code\projects\active\raygon\private\target\release\deps\libnum_traits-7a18e0f1ba17eb3e.rmeta --extern packed_simd=F:\code\projects\active\raygon\private\target\release\deps\libpacked_simd-67697b4169867fe9.rmeta --extern raygon_core=F:\code\projects\active\raygon\private\target\release\deps\libraygon_core-89ce21278a0834d2.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta -C target-cpu=native`
       Fresh crossbeam-epoch v0.7.2
   Compiling parking_lot v0.9.0
   Compiling crossbeam-deque v0.6.3
   Compiling crossbeam-deque v0.7.1
     Running `rustc --edition=2018 --crate-name parking_lot C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\parking_lot-0.9.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"nightly\"" -C metadata=f5c96506dcb815d8 -C extra-filename=-f5c96506dcb815d8 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern lock_api=F:\code\projects\active\raygon\private\target\release\deps\liblock_api-759d6fb025c0d123.rmeta --extern parking_lot_core=F:\code\projects\active\raygon\private\target\release\deps\libparking_lot_core-065b8a044fbf420f.rmeta --cap-lints allow -C target-cpu=native --cfg has_sized_atomics --cfg has_checked_instant`
     Running `rustc --crate-name crossbeam_deque C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-deque-0.6.3\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=7ee3d9f5a4b9f93d -C extra-filename=-7ee3d9f5a4b9f93d --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
     Running `rustc --crate-name crossbeam_deque C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-deque-0.7.1\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=5980eed31e4953ff -C extra-filename=-5980eed31e4953ff --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
   Compiling rayon-core v1.5.0
     Running `rustc --crate-name rayon_core C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\rayon-core-1.5.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=25c3622dd04447e5 -C extra-filename=-25c3622dd04447e5 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-7ee3d9f5a4b9f93d.rmeta --extern crossbeam_queue=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_queue-ef5b77e6d85fb6eb.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --extern lazy_static=F:\code\projects\active\raygon\private\target\release\deps\liblazy_static-edeb315d8a13eb54.rmeta --extern num_cpus=F:\code\projects\active\raygon\private\target\release\deps\libnum_cpus-0a772d2f62861421.rmeta --cap-lints allow -C target-cpu=native`
   Compiling crossbeam v0.7.2
     Running `rustc --crate-name crossbeam C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\crossbeam-0.7.2\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"crossbeam-channel\"" --cfg "feature=\"crossbeam-deque\"" --cfg "feature=\"crossbeam-queue\"" --cfg "feature=\"default\"" --cfg "feature=\"nightly\"" --cfg "feature=\"std\"" -C metadata=a425dcdcd1cdbe61 -C extra-filename=-a425dcdcd1cdbe61 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern cfg_if=F:\code\projects\active\raygon\private\target\release\deps\libcfg_if-d89f1e8e289aff1a.rmeta --extern crossbeam_channel=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_channel-fa0e9e7c0ad3946c.rmeta --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-5980eed31e4953ff.rmeta --extern crossbeam_epoch=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_epoch-27a78e0dea75f2cf.rmeta --extern crossbeam_queue=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_queue-ef5b77e6d85fb6eb.rmeta --extern crossbeam_utils=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_utils-74ccfb0ef90abe3b.rmeta --cap-lints allow -C target-cpu=native`
   Compiling slog-stdlog v4.0.0
     Running `rustc --edition=2018 --crate-name slog_stdlog C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\slog-stdlog-4.0.0\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=8819e32849620aaf -C extra-filename=-8819e32849620aaf --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam-a425dcdcd1cdbe61.rmeta --extern log=F:\code\projects\active\raygon\private\target\release\deps\liblog-a5cb25dad3acaed3.rmeta --extern slog=F:\code\projects\active\raygon\private\target\release\deps\libslog-9dd0303d4e15c44a.rmeta --extern slog_scope=F:\code\projects\active\raygon\private\target\release\deps\libslog_scope-db6d0649eb15794a.rmeta --cap-lints allow -C target-cpu=native`
   Compiling slog-envlogger v2.2.0
     Running `rustc --crate-name slog_envlogger C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\slog-envlogger-2.2.0\src/lib.rs --color always --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"regex\"" -C metadata=e27a333ddc8f0523 -C extra-filename=-e27a333ddc8f0523 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern log=F:\code\projects\active\raygon\private\target\release\deps\liblog-a5cb25dad3acaed3.rmeta --extern regex=F:\code\projects\active\raygon\private\target\release\deps\libregex-6e0be316675b61d9.rmeta --extern slog=F:\code\projects\active\raygon\private\target\release\deps\libslog-9dd0303d4e15c44a.rmeta --extern slog_async=F:\code\projects\active\raygon\private\target\release\deps\libslog_async-9f929658b0d160d5.rmeta --extern slog_scope=F:\code\projects\active\raygon\private\target\release\deps\libslog_scope-db6d0649eb15794a.rmeta --extern slog_stdlog=F:\code\projects\active\raygon\private\target\release\deps\libslog_stdlog-8819e32849620aaf.rmeta --extern slog_term=F:\code\projects\active\raygon\private\target\release\deps\libslog_term-4a83c400864e5d10.rmeta --cap-lints allow -C target-cpu=native`
   Compiling rayon v1.1.0
     Running `rustc --crate-name rayon C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\rayon-1.1.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=4fd697185ba64dd3 -C extra-filename=-4fd697185ba64dd3 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern crossbeam_deque=F:\code\projects\active\raygon\private\target\release\deps\libcrossbeam_deque-7ee3d9f5a4b9f93d.rmeta --extern either=F:\code\projects\active\raygon\private\target\release\deps\libeither-91f2bccb9927ab72.rmeta --extern rayon_core=F:\code\projects\active\raygon\private\target\release\deps\librayon_core-25c3622dd04447e5.rmeta --cap-lints allow -C target-cpu=native`
error: Could not compile `cargo_metadata`.

Caused by:
  process didn't exit successfully: `rustc --crate-name cargo_metadata C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\cargo_metadata-0.6.4\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"backtrace\"" --cfg "feature=\"default\"" -C metadata=ef5f1281a4361110 -C extra-filename=-ef5f1281a4361110 --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern error_chain=F:\code\projects\active\raygon\private\target\release\deps\liberror_chain-f51a2bcf3c86d64a.rmeta --extern semver=F:\code\projects\active\raygon\private\target\release\deps\libsemver-c876c4f33d575b41.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
warning: build failed, waiting for other jobs to finish...
error: Could not compile `gltf-json`.

Caused by:
  process didn't exit successfully: `rustc --edition=2018 --crate-name gltf_json C:\Users\novacrazy\.cargo\registry\src\github.com-1ecc6299db9ec823\gltf-json-0.12.0\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 --cfg "feature=\"default\"" --cfg "feature=\"extras\"" --cfg "feature=\"names\"" -C metadata=9ff3e20903dd6c8c -C extra-filename=-9ff3e20903dd6c8c --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern gltf_derive=F:\code\projects\active\raygon\private\target\release\deps\gltf_derive-4645197b1e4f4fc9.dll --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta --extern serde_derive=F:\code\projects\active\raygon\private\target\release\deps\serde_derive-27a652a44d0131a6.dll --extern serde_json=F:\code\projects\active\raygon\private\target\release\deps\libserde_json-08490cd048b5ef30.rmeta --cap-lints allow -C target-cpu=native` (exit code: 0xc0000005, STATUS_ACCESS_VIOLATION)
warning: build failed, waiting for other jobs to finish...
error: Could not compile `raygon-geometry`.

Caused by:
  process didn't exit successfully: `rustc --edition=2018 --crate-name raygon_geometry raygon-geometry\src\lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C codegen-units=1 -C metadata=39ac77e6fba3b64e -C extra-filename=-39ac77e6fba3b64e --out-dir F:\code\projects\active\raygon\private\target\release\deps -L dependency=F:\code\projects\active\raygon\private\target\release\deps --extern copyless=F:\code\projects\active\raygon\private\target\release\deps\libcopyless-dd1647a18b2d8872.rmeta --extern deepsize=F:\code\projects\active\raygon\private\target\release\deps\libdeepsize-feae989679bc200b.rmeta --extern expr=F:\code\projects\active\raygon\private\target\release\deps\libexpr-803b5b1429934623.rmeta --extern fast_math=F:\code\projects\active\raygon\private\target\release\deps\libfast_math-63589e7c60336843.rmeta --extern half=F:\code\projects\active\raygon\private\target\release\deps\libhalf-6ff4dff974c834ba.rmeta --extern ieee754=F:\code\projects\active\raygon\private\target\release\deps\libieee754-ce756a49f0860dff.rmeta --extern num_traits=F:\code\projects\active\raygon\private\target\release\deps\libnum_traits-7a18e0f1ba17eb3e.rmeta --extern packed_simd=F:\code\projects\active\raygon\private\target\release\deps\libpacked_simd-67697b4169867fe9.rmeta --extern raygon_core=F:\code\projects\active\raygon\private\target\release\deps\libraygon_core-89ce21278a0834d2.rmeta --extern serde=F:\code\projects\active\raygon\private\target\release\deps\libserde-8f60af8725232991.rmeta -C target-cpu=native` (exit code: 0xc0000374, STATUS_HEAP_CORRUPTION)
warning: build failed, waiting for other jobs to finish...
error: build failed

The errors are the same with a clean build, but I used a subsequent attempt to cut down on the log size.
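As a small aid for reading the log above, here is a sketch that maps the two exit codes cargo reported to their NTSTATUS names. The function name is hypothetical and the table deliberately covers only the codes that appear in this report.

```rust
/// Map a Windows process exit code to the NTSTATUS name shown in the
/// cargo output above. Only the two codes from this report are covered;
/// everything else yields `None`.
fn ntstatus_name(code: u32) -> Option<&'static str> {
    match code {
        0xC000_0005 => Some("STATUS_ACCESS_VIOLATION"),
        0xC000_0374 => Some("STATUS_HEAP_CORRUPTION"),
        _ => None,
    }
}

fn main() {
    for &code in &[0xC000_0005u32, 0xC000_0374, 0] {
        match ntstatus_name(code) {
            Some(name) => println!("{:#010x} -> {}", code, name),
            None => println!("{:#010x} -> (not one of the codes in this report)", code),
        }
    }
}
```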

@novacrazy
Author

novacrazy commented Aug 27, 2019

A simplified test case: just add cargo_metadata to an empty crate.

[package]
name = "cpu-bug"
version = "0.1.0"
authors = ["novacrazy <novacrazy@gmail.com>"]
edition = "2018"

[dependencies]
cargo_metadata = "0.8.2"

[profile.release] # My release profile
opt-level = 3
lto = 'fat'
incremental = false
debug-assertions = false
codegen-units = 1

extern crate cargo_metadata;

fn main() {
    println!("Hello, world!");
}

$env:RUSTFLAGS = "-C target-cpu=znver1"
cargo run --release

@jonas-schievink jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. I-nominated T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 27, 2019
@novacrazy
Author

codegen-units=1 seems to be partially responsible. Removing that fixes it. So it's not LTO at least.

@novacrazy
Author

novacrazy commented Aug 27, 2019

I can also trigger this on the dev profile by changing:

[profile.dev]
opt-level = 3
codegen-units = 1

with RUSTFLAGS = "-C target-cpu=znver1"

opt-level=2 does not trigger it.

@nikomatsakis
Contributor

Checking in from @rust-lang/compiler triage:

This seems to be related to our LLVM upgrade. The linked issue (#63361) was blamed on LLVM bug 42935 and fixed by @nikic via an LLVM submodule update (#63415).

cc @nikic and @nagisa -- Any thoughts on what's going on here?

@nikomatsakis
Contributor

Tagging as P-high for now. Not sure who to assign to.

@nikomatsakis nikomatsakis added P-high High priority and removed I-nominated labels Aug 29, 2019
@Centril
Contributor

Centril commented Aug 29, 2019

(Should this be labeled as I-unsound?)

@nikic
Contributor

nikic commented Aug 29, 2019

Is it possible to get a backtrace for the segfault? I don't have a windows system (or a zen system for that matter) to reproduce this on.

I don't know whether Windows has assertion-enabled builds, but if it does, it might be worth calling https://github.com/kennytm/rustup-toolchain-install-master with the -a argument and checking whether the toolchain this downloads triggers an assertion failure.

@hanna-kruppe
Contributor

(Should this be labeled as I-unsound?)

Probably not. The crash is in the compiler process, not in code it output, and presumably in C++ code at that, so there's no reason to expect there's anything going wrong with any safe Rust code. While it's the scary sort of crash that sounds like it could hypothetically also result in completely bogus machine code being generated, there's no evidence of this actually happening / being possible. I mean I guess we could decide to tag stuff I-unsound merely because "something's gone really wrong in the C++ and that could have arbitrarily bad consequences", but if we do that we should also blanket tag all LLVM assertion failures as I-unsound, but we don't currently do that nor do I think it would be useful.

@mati865
Contributor

mati865 commented Aug 29, 2019

Could not reproduce with #63959 (comment) or #63959 (comment) on a Zen 2000-based system using Linux GNU and by cross-compiling to Windows GNU. I'll check the native Windows GNU toolchain later.

@novacrazy
Author

novacrazy commented Aug 29, 2019

Happens with both MSVC and GNU builds on Windows for me.

How would I go about enabling backtraces on a Windows build of rustc? EDIT: Would rustc even produce a backtrace on segfault? I know I've seen proper backtraces with ICEs, but this is different.

9b91b9c10e3c87ed333a1e34c4f46ed68f1eee06-alt (just the alt version of the last nightly I had) does not appear to respond to RUST_BACKTRACE=1

@mati865
Contributor

mati865 commented Aug 29, 2019

It's not a rustc panic but an LLVM segfault, so you should use gdb with the Windows GNU toolchain; no idea about MSVC.

@mati865
Contributor

mati865 commented Aug 29, 2019

On Windows rustc exits with 0xc0000005 and GDB only prints: No stack. There are no alternative builds for the Windows GNU toolchain, so I won't be able to do anything until I do a debug build.

@nagisa
Member

nagisa commented Aug 29, 2019

On Windows rustc exits with 0xc0000005 and GDB only prints: No stack. There are no alternative builds for the Windows GNU toolchain, so I won't be able to do anything until I do a debug build.

Make sure you are running the real rustc and not the wrapper from rustup. What you’re seeing here is a typical symptom of failing to account for the wrapper.
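One way to find the real binary is `rustup which rustc`, which prints the path behind the rustup proxy. The sketch below is not from the thread; it simply wraps that command with `std::process::Command`, and the helper name is hypothetical. It assumes `rustup` is on `PATH`.

```rust
use std::process::Command;

/// Build the `rustup which rustc` invocation. Running it prints the path of
/// the actual rustc executable for the active toolchain, which is the binary
/// a debugger should be pointed at instead of the rustup proxy.
fn rustup_which_rustc() -> Command {
    let mut cmd = Command::new("rustup");
    cmd.arg("which").arg("rustc");
    cmd
}

fn main() {
    match rustup_which_rustc().output() {
        Ok(out) if out.status.success() => {
            println!("{}", String::from_utf8_lossy(&out.stdout).trim());
        }
        Ok(out) => eprintln!("rustup exited with {}", out.status),
        Err(e) => eprintln!("could not run rustup: {}", e),
    }
}
```

The printed path can then be passed to gdb directly, together with the rustc arguments from the failing cargo invocation.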

@mati865
Contributor

mati865 commented Aug 29, 2019

Hmm, I could swear I could debug a rustc crash on Linux without caring about the wrapper.
Anyway the stack is corrupt:

#0  0x000000006399edb4 in syn::path::parsing::<impl syn::path::Path>::get_ident () from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#1  0x00000000638ceef1 in core::iter::traits::iterator::Iterator::try_for_each::call::{{closure}} ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#2  0x00000000638c1582 in <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::next ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#3  0x00000000638eddf8 in serde_derive::internals::attr::Variant::from_ast ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#4  0x00000000638dbc20 in <core::iter::adapters::Map<I,F> as core::iter::traits::iterator::Iterator>::next ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#5  0x00000000638e0c38 in serde_derive::internals::ast::Container::from_ast ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#6  0x0000000063907224 in serde_derive::de::expand_derive_deserialize ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#7  0x0000000063964e72 in serde_derive::derive_deserialize ()
   from \\?\F:\zen_test\target\release\deps\serde_derive-e49d39054da7e2ae.dll
#8  0x00000000639caa44 in proc_macro::bridge::client::__run_expand1::{{closure}}::{{closure}} () at src\libproc_macro\bridge/client.rs:358
#9  proc_macro::bridge::scoped_cell::ScopedCell<T>::set::{{closure}} ()
    at src\libproc_macro\bridge/scoped_cell.rs:79
#10 proc_macro::bridge::scoped_cell::ScopedCell<T>::replace ()
    at src\libproc_macro\bridge/scoped_cell.rs:74
#11 proc_macro::bridge::scoped_cell::ScopedCell<T>::set ()
    at src\libproc_macro\bridge/scoped_cell.rs:79
#12 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter::{{closure}} () at src\libproc_macro\bridge/client.rs:309
#13 std::thread::local::LocalKey<T>::try_with ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd\thread/local.rs:262
#14 std::thread::local::LocalKey<T>::with ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd\thread/local.rs:239
#15 proc_macro::bridge::client::<impl proc_macro::bridge::Bridge>::enter ()
    at src\libproc_macro\bridge/client.rs:309
#16 proc_macro::bridge::client::__run_expand1::{{closure}} ()
    at src\libproc_macro\bridge/client.rs:351
#17 <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panic.rs:315
#18 std::panicking::try::do_call ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panicking.rs:296
#19 0x0000000063a28019 in __rust_maybe_catch_panic ()
    at src\libpanic_unwind\lib.rs:80
#20 0x00000000639d100e in std::panicking::try ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panicking.rs:275
#21 std::panic::catch_unwind ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\libstd/panic.rs:394
#22 proc_macro::bridge::client::__run_expand1 ()
    at src\libproc_macro\bridge/client.rs:350
#23 0x0000000002ad0c6d in proc_macro::bridge::server::run_server ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#24 0x0000000002bba233 in <syntax::ext::proc_macro::ProcMacroDerive as syntax::ext::base::MultiItemModifier>::expand ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#25 0x0000000002bafa58 in syntax::ext::expand::MacroExpander::fully_expand_fragment ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#26 0x0000000002baeabd in syntax::ext::expand::MacroExpander::expand_crate ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#27 0x0000000000ff6312 in rustc_interface::passes::configure_and_expand_inner::{{closure}} ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#28 0x0000000000fe89a7 in rustc::util::common::time ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#29 0x0000000000f6859d in rustc_interface::passes::configure_and_expand_inner
    ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#30 0x0000000000fcb48d in rustc_interface::passes::configure_and_expand::{{closure}} ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#31 0x0000000000f9efff in rustc_data_structures::box_region::PinnedGenerator<I,A,R>::new ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#32 0x0000000000f6ee01 in rustc_interface::queries::Query<T>::compute ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#33 0x0000000000ff73da in rustc_interface::queries::<impl rustc_interface::interface::Compiler>::expansion ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#34 0x0000000000e75a68 in rustc_interface::interface::run_compiler_in_existing_thread_pool ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#35 0x0000000000e9c8af in std::thread::local::LocalKey<T>::with ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#36 0x0000000000eb0eda in scoped_tls::ScopedKey<T>::set ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#37 0x0000000000ecf781 in syntax::with_globals ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#38 0x0000000000e4fb2d in std::sys_common::backtrace::__rust_begin_short_backtrace ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#39 0x0000000066c75859 in __rust_maybe_catch_panic ()
    at src\libpanic_unwind\lib.rs:80
#40 0x0000000000e781b3 in core::ops::function::FnOnce::call_once{{vtable-shim}} ()
   from C:\Users\mateusz\.rustup\toolchains\nightly-x86_64-pc-windows-gnu\bin\rustc_driver-dc847589f2996fc8.dll
#41 0x0000000066c46836 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\liballoc/boxed.rs:922
#42 0x0000000066c72dd7 in <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once ()
    at /rustc/17e73e801a75559eac5c932ff07bd9c8499a1364\src\liballoc/boxed.rs:922
#43 std::sys_common::thread::start_thread ()
    at src\libstd\sys_common/thread.rs:13
#44 std::sys::windows::thread::Thread::new::thread_start ()
    at src\libstd\sys\windows/thread.rs:47
#45 0x00007ffcaec77bd4 in KERNEL32!BaseThreadInitThunk ()
   from C:\WINDOWS\System32\kernel32.dll
#46 0x00007ffcaedcce71 in ntdll!RtlUserThreadStart ()
   from C:\WINDOWS\SYSTEM32\ntdll.dll
#47 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Other threads are just waiting.

I'll return with debug Rust build if I somehow manage to build LLVM in debug mode on PC with 16 GiB of RAM...

@nagisa
Member

nagisa commented Aug 29, 2019

What led you to the conclusion of a corrupt stack? It looks fairly reasonable to me.

@eddyb
Member

eddyb commented Aug 30, 2019

Hmm, I could swear I was able to debug a rustc crash on Linux without caring about the wrapper.

I think the wrapper switched to exec at some point (i.e. the process gets reused, without forking).
AFAIK, that would allow debugging to continue to the real rustc.

I don't think anything like this is possible on Windows (without manually loading the executable you want to run in your address space, of course).

@eddyb
Member

eddyb commented Aug 30, 2019

I'll return with debug Rust build if I somehow manage to build LLVM in debug mode on PC with 16 GiB of RAM...

You don't need to, this is not in LLVM, it's in syn. You can probably reproduce with just cargo check (or rustc --emit=meta / rustc --pretty=expanded).

@mati865
Contributor

mati865 commented Aug 30, 2019

What led you to the conclusion of a corrupt stack? It looks fairly reasonable to me.

The process gave me the exit code for stack corruption earlier; I took a quick look at the trace and it didn't make any sense to me.
On second look I noticed the crash happened inside proc macro...

You don't need to, this is not in LLVM, it's in syn. You can probably reproduce with just cargo check (or rustc --emit=meta / rustc --pretty=expanded).

Yeah, it hit me later. Running cargo check on cargo-metadata crate with changes from #63959 (comment) reliably reproduces it.

In case you find it useful, here is the trace from a debug build and the disassembly: https://gist.github.com/mati865/e93d3bf12408df00ecf47327fa196af7

The assembly of get_ident is the same for znver1 and generic, so the problem is somewhere earlier. What is the best way to proceed here: building a compiler with assertions, or tearing cargo-metadata down to something more manageable?

@eddyb
Member

eddyb commented Aug 30, 2019

@mati865 Since AFAICT the bug happens during the macro expansion in cargo-metadata, you should be able to get rid of most of it.

Not even names need to be resolvable, other than invoking serde_derive's macros.

So, for example, you can remove all dependencies of cargo-metadata, other than serde_derive, because they're not needed in the reproduction.

I'm worried this is a miscompilation of rustc/std itself, at this point.

EDIT: wait, no, it must be code compiled with -C target-cpu that's getting miscompiled, so it's all within serde_derive/syn.

Could you try to run ./x.py test --stage 1 src/test/ui with -C target-cpu=znver1 hardcoded somewhere? (presumably in src/tools/compiletest)

@mati865
Contributor

mati865 commented Sep 1, 2019

I've been struggling for the past 2 days to build Rust because of #61561, so I'm unable to make progress on this issue.

@novacrazy
Author

Any luck with this? I'm still stuck on 07e0c36 because of it.

@eddyb
Member

eddyb commented Nov 11, 2019

I think I'll stop editing my previous comment (#63959 (comment)).

In the final version I've numbered the evil{1,2,3,4} variables, because they're all copies with the same value, except I can't seem to be able to simplify their weird scoping without the bug going away.

Initially I attributed the sensitivity to scoping to MIR drop order, but with Evil not needing drop, I think it's the MIR Storage{Live,Dead}, which end up in LLVM IR as llvm.lifetime.{start,end}.

So this could be a stack layout overlap issue? Maybe there's something different about stack frames on Windows which can cause this? Not sure where to go from here.
EDIT: looking at the assembly, the frame sizes are definitely different between Windows and Linux, but so are the calling conventions and therefore the register usage.

I likely won't spend more time on this myself, but if someone wants me to run some testcases or dump some data using the Ryzen laptop I have access to, I can help with that.

cc @nagisa @nikic @rkruppe

@pnkfelix
Member

@rustbot ping icebreakers-llvm

@rustbot
Collaborator

rustbot commented Nov 14, 2019

Hey LLVM ICE-breakers! This bug has been identified as a good
"LLVM ICE-breaking candidate". In case it's useful, here are some
instructions for tackling these sorts of bugs. Maybe take a look?
Thanks! <3

cc @hdhoang @heyrutvik @jryans @mmilenko @nagisa @nikic @rkruppe @SiavoshZarrasvand @spastorino @vertexclique @vgxbj

@rustbot rustbot added the ICEBreaker-LLVM Bugs identified for the LLVM ICE-breaker group label Nov 14, 2019
@SiavoshZarrasvand

SiavoshZarrasvand commented Nov 15, 2019

I'm happy to take this one on, although I lack the target CPU myself. Can I still make progress using cross-compilation, or is the actual hardware a requirement?

@eddyb
Member

eddyb commented Nov 15, 2019

@SiavoshZarrasvand I'd suggest trying my reduced testcase with -C target-cpu=znver1 either on Windows, or on Linux via cross-compilation + Wine (IIRC, @mati865 got that to work).

If you can get the assertion failure and not a SIGILL or some other crash, then I assume that's enough to work with that testcase and dive into LLVM internals responsible for it etc.

@ethanhs

ethanhs commented Nov 16, 2019

If that doesn't work, I have a server with the needed hardware, and I could probably spin up a VM if that helps.

@SiavoshZarrasvand

@ethanhs That would definitely do the trick. Pretty sure I should be able to get it to repeat on my hardware though. Let me try during the weekend and confirm on Monday.

@SiavoshZarrasvand

I built the initial example on an ASUS ROG with Ubuntu 18.04. The commands I used to build and run on wine were:

cargo rustc --release --target=x86_64-pc-windows-gnu -- -C target-cpu=znver1 -C linker=x86_64-w64-mingw32-gcc

wine /path/to/target/release/cpu_bug.exe

What should I change to force the error? Somehow it compiles and runs fine for me.

# rustc --version
rustc 1.39.0 (4560ea788 2019-11-04)

@eddyb
Member

eddyb commented Nov 17, 2019

@SiavoshZarrasvand Those testcases aren't useful for reproduction (especially outside of Windows on AMD Ryzen CPUs) as they crash rustc from a proc macro, which is messy.

You should only use #63959 (comment) which relies on an assertion failure instead of a crash (so if you get a crash that likely means your CPU doesn't support certain znver1 instructions).

@SiavoshZarrasvand

SiavoshZarrasvand commented Nov 17, 2019

That worked. I needed to switch to nightly and used the following to compile:
rustc main.rs -C opt-level=3 -C codegen-units=1 -C target-cpu=znver1 --edition=2018 --target=x86_64-pc-windows-gnu -C linker=x86_64-w64-mingw32-gcc

Running it in wine produces the following error (which I believe is what would be expected):
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `([1, 1, 1, 1, 1, 1, 1, 1], [2, 2, 2, 2, 2, 2, 2, 2], [3, 3, 3, 3, 3, 3, 3, 3])`,
 right: `([16, 250, 50, 0, 0, 0, 0, 0], [32, 32, 32, 32, 32, 32, 32, 32], [3, 3, 3, 3, 3, 3, 3, 3])`', ./src/main.rs:37:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

@eddyb
Member

eddyb commented Nov 19, 2019

I dropped the ball while initially looking at the ASM diff, but then @Speedy37 pointed out that xmm6 was getting trashed (after looking at it in a debugger) so I took a closer look, ignoring (irrelevant) stack offsets.

Both use vmovaps xmmN, xmmword ptr [rip + .LCPI5_0] followed by a later use of xmmN to refer to that value, but both the N and the initialization point of xmmN differ:

  • non-windows initializes xmm0 right before using it
  • windows initializes xmm6 much earlier
    • it also saves/restores xmm6 before/after the body of the function (i.e. it's a "callee-saved" register?)
    • there is a call to opaque_iter_nextopaque_id in between the initialization of xmm6 and its use, suggesting LLVM has hoisted the initialization past a call because it's a callee-saved register
    • however, there is also a write to ymm6 in between, which trashes xmm6

@nagisa confirmed that xmm6 is the first callee-saved ("non-volatile registers") xmm register, so that fits, but why is LLVM taking advantage of that when ymm6 is also in use?
(EDIT: found the "xmm6-15 are callee-saved" part in the LLVM source)

Also, this finally explains the relevancy of the size!
One ymm register is 4 u64s, so you need at least 4*6+1 (25) u64s for ymm6 to be used.
Out of those 25 u64s, 3 of them are in data, and 22 are in padding.
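
The arithmetic above can be checked with a short sketch. The constant and function names below are made up for illustration and do not come from the testcase; the only inputs are the 256-bit width of a ymm register and the 64-bit width of a u64. It also assumes, as a simplification, that ymm0 through ymm5 are each filled before ymm6 is touched.

```rust
// Illustrative names only; none of these appear in the original testcase.
const YMM_BITS: usize = 256;
const U64_BITS: usize = 64;

/// Number of u64 lanes that fit in one ymm register.
const fn u64s_per_ymm() -> usize {
    YMM_BITS / U64_BITS
}

/// Smallest number of u64s whose copy needs ymm`n`, assuming
/// ymm0..ymm(n-1) are each filled first (a simplification of what
/// the register allocator actually does).
const fn min_u64s_to_reach_ymm(n: usize) -> usize {
    u64s_per_ymm() * n + 1
}

fn main() {
    let total = min_u64s_to_reach_ymm(6);
    let data = 3; // u64s of real data, per the comment above
    let padding = total - data;

    println!("u64s per ymm register: {}", u64s_per_ymm());
    println!("u64s needed to reach ymm6: {}", total);
    println!("data: {}, padding: {}", data, padding);
}
```

With 3 u64s of data, the remaining 22 have to come from padding, which matches the split described above.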

@eddyb
Member

eddyb commented Nov 19, 2019

I've just updated #63959 (comment) (I know, I said I wouldn't) with a small change to make this reproduce on a Linux Intel IvyBridge i7 laptop (i.e. it has AVX).

The only reason windows was relevant was the fact that it has callee-saved xmm registers at all, which we can also get by making opaque_iter_nextopaque_id an extern "win64" fn.
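
As a rough illustration of that trick, the sketch below declares a function with the win64 calling convention on a non-Windows x86_64 target. The name `opaque` and its signature are hypothetical stand-ins, not the function from the reduced testcase, and the fallback for other architectures exists only so the sketch compiles everywhere.

```rust
// Hypothetical stand-in for the opaque helper; not the testcase's function.
// Under the win64 calling convention xmm6-xmm15 are callee-saved, so a
// caller may keep values in them across this call.
#[cfg(target_arch = "x86_64")]
#[inline(never)]
extern "win64" fn opaque(x: u64) -> u64 {
    x
}

// The "win64" ABI only exists on x86_64; use the default ABI elsewhere.
#[cfg(not(target_arch = "x86_64"))]
#[inline(never)]
fn opaque(x: u64) -> u64 {
    x
}

fn main() {
    // The call behaves like any other Rust call; only the ABI differs.
    let y = opaque(7);
    assert_eq!(y, 7);
    println!("opaque(7) = {}", y);
}
```

The `#[inline(never)]` attribute is there so that an actual call, with its calling convention, survives optimization.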

@SiavoshZarrasvand

@eddyb It is good that you posted here as it reminds me to update my code on my next debugging session with this issue.

@eddyb
Member

eddyb commented Nov 25, 2019

With this setup I was able to get bugpoint to reduce the LLVM IR, which I then cleaned up (mostly because bugpoint likes to replace values with undef) to get this:

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define win64cc void @opaque() #0 {
  ret void
}

define i32 @main() #0 {
start:
  %dummy0 = alloca [22 x i64], align 8
  %dummy1 = alloca [22 x i64], align 8
  %dummy2 = alloca [22 x i64], align 8

  %data = alloca <2 x i64>, align 8

  br label %fake-loop

fake-loop:                                        ; preds = %fake-loop, %start
  %dummy0.cast = bitcast [22 x i64]* %dummy0 to i8*
  %dummy1.cast = bitcast [22 x i64]* %dummy1 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy1.cast, i8* nonnull align 8 %dummy0.cast, i64 176, i1 false)

  %dummy1.cast.copy = bitcast [22 x i64]* %dummy1 to i8*
  %dummy2.cast = bitcast [22 x i64]* %dummy2 to i8*
  call void @llvm.lifetime.start.p0i8(i64 176, i8* nonnull %dummy2.cast)
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 8 %dummy2.cast, i8* nonnull align 8 %dummy1.cast.copy, i64 176, i1 false)

  call win64cc void @opaque()

  store <2 x i64> <i64 1010101010101010101, i64 2020202020202020202>, <2 x i64>* %data, align 8

  %opaque-false = icmp eq i8 0, 1
  br i1 %opaque-false, label %fake-loop, label %exit

exit:                                             ; preds = %fake-loop
  %data.cast = bitcast <2 x i64>* %data to i64*
  %0 = load i64, i64* %data.cast, align 8
  %1 = icmp eq i64 %0, 1010101010101010101
  %2 = select i1 %1, i32 0, i32 -1
  ret i32 %2
}

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i1 immarg) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #1

attributes #0 = { "target-cpu"="znver1" }
attributes #1 = { argmemonly nounwind }
Assembly output:
	.text
	.file	"bad.ll"
	.globl	opaque                  # -- Begin function opaque
	.p2align	4, 0x90
	.type	opaque,@function
opaque:                                 # @opaque
	.cfi_startproc
# %bb.0:
	retq
.Lfunc_end0:
	.size	opaque, .Lfunc_end0-opaque
	.cfi_endproc
                                        # -- End function
	.section	.rodata.cst16,"aM",@progbits,16
	.p2align	4               # -- Begin function main
.LCPI1_0:
	.quad	1010101010101010101     # 0xe04998456557eb5
	.quad	2020202020202020202     # 0x1c093308acaafd6a
	.text
	.globl	main
	.p2align	4, 0x90
	.type	main,@function
main:                                   # @main
	.cfi_startproc
# %bb.0:                                # %start
	subq	$584, %rsp              # imm = 0x248
	.cfi_def_cfa_offset 592
	vmovaps	.LCPI1_0(%rip), %xmm6   # xmm6 = [1010101010101010101,2020202020202020202]
	xorl	%esi, %esi
	.p2align	4, 0x90
.LBB1_1:                                # %fake-loop
                                        # =>This Inner Loop Header: Depth=1
	vmovups	552(%rsp), %ymm0
	vmovups	536(%rsp), %ymm1
	vmovups	408(%rsp), %ymm6
	vmovups	472(%rsp), %ymm2
	vmovups	504(%rsp), %ymm3
	vmovups	%ymm0, 192(%rsp)
	vmovups	%ymm1, 176(%rsp)
	vmovups	440(%rsp), %ymm1
	vmovups	%ymm3, 144(%rsp)
	vmovups	%ymm2, 112(%rsp)
	vmovups	%ymm6, 48(%rsp)
	vmovups	%ymm3, 320(%rsp)
	vmovups	%ymm2, 288(%rsp)
	vmovups	%ymm6, 224(%rsp)
	vmovups	%ymm1, 80(%rsp)
	vmovups	%ymm1, 256(%rsp)
	vmovups	192(%rsp), %ymm5
	vmovups	176(%rsp), %ymm4
	vmovups	%ymm5, 368(%rsp)
	vmovups	%ymm4, 352(%rsp)
	vzeroupper
	callq	opaque
	vmovaps	%xmm6, 32(%rsp)
	testb	%sil, %sil
	jne	.LBB1_1
# %bb.2:                                # %exit
	movabsq	$1010101010101010101, %rcx # imm = 0xE04998456557EB5
	xorl	%eax, %eax
	cmpq	%rcx, 32(%rsp)
	sete	%al
	decl	%eax
	addq	$584, %rsp              # imm = 0x248
	.cfi_def_cfa_offset 8
	retq
.Lfunc_end1:
	.size	main, .Lfunc_end1-main
	.cfi_endproc
                                        # -- End function

	.section	".note.GNU-stack","",@progbits
  • xmm6-xmm15 are callee-saved on the win64 calling convention (used by opaque here)
  • LLVM takes advantage of xmm6 being callee-saved to hoist the <i64 1010101010101010101, i64 2020202020202020202> constant across the opaque call and out of the (fake) loop, keeping it around in xmm6
  • the two copies (%dummy0 -> %dummy1 and %dummy1 -> %dummy2) result in enough AVX registers being used (ymm0-ymm6) to overlap with xmm6, corrupting it
  • by the time of the actual store to %data, whatever happened to be in ymm6's lower half gets stored, instead of the hoisted constant
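
The sequence in the list above can be replayed with a toy register model. This is only a sketch of the hardware aliasing (xmmN is the low half of ymmN), not of LLVM's actual logic; `RegFile`, `buggy_schedule` and `fixed_schedule` are names invented for the illustration, and the "stack garbage" values are arbitrary.

```rust
/// Toy model: each ymm register is four u64 lanes, and xmmN is the
/// low two lanes of ymmN.
#[derive(Default)]
struct RegFile {
    ymm: [[u64; 4]; 16],
}

impl RegFile {
    /// A VEX-encoded xmm write (e.g. vmovaps) also zeroes the upper lanes.
    fn write_xmm(&mut self, n: usize, v: [u64; 2]) {
        self.ymm[n] = [v[0], v[1], 0, 0];
    }

    fn read_xmm(&self, n: usize) -> [u64; 2] {
        [self.ymm[n][0], self.ymm[n][1]]
    }

    fn write_ymm(&mut self, n: usize, v: [u64; 4]) {
        self.ymm[n] = v;
    }
}

const CONSTANT: [u64; 2] = [1010101010101010101, 2020202020202020202];

/// Order seen in the bad assembly: constant hoisted into xmm6 first,
/// then ymm6 used as memcpy scratch, then xmm6 stored to %data.
fn buggy_schedule(stack_garbage: [u64; 4]) -> [u64; 2] {
    let mut regs = RegFile::default();
    regs.write_xmm(6, CONSTANT); // hoisted out of the loop
    regs.write_ymm(6, stack_garbage); // memcpy clobbers ymm6, and thus xmm6
    // The win64 call preserves xmm6, so nothing changes here.
    regs.read_xmm(6) // value that ends up in %data
}

/// Order a correct compiler could use: load the constant after the copy.
fn fixed_schedule(stack_garbage: [u64; 4]) -> [u64; 2] {
    let mut regs = RegFile::default();
    regs.write_ymm(6, stack_garbage);
    regs.write_xmm(6, CONSTANT);
    regs.read_xmm(6)
}

fn main() {
    let garbage = [0xdead, 0xbeef, 0, 0];
    assert_eq!(buggy_schedule(garbage), [0xdead, 0xbeef]);
    assert_eq!(fixed_schedule(garbage), CONSTANT);
    println!("buggy: {:?}", buggy_schedule(garbage));
    println!("fixed: {:?}", fixed_schedule(garbage));
}
```

In the buggy order, whatever the copy left in the low half of ymm6 is what gets stored, which is the same shape as the failing assertion in the Rust testcase.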

EDIT: reported as https://bugs.llvm.org/show_bug.cgi?id=44140
EDIT2: and someone wrote a patch already! https://reviews.llvm.org/D70699

@novacrazy
Author

Did you ever learn why this only popped up on AMD Zen targets? The linked LLVM patch doesn't seem to touch target-specific code (that I see). I'm mostly asking this out of curiosity.

@eddyb
Member

eddyb commented Nov 26, 2019

@novacrazy IIRC someone (@mati865?) speculated a while back that the cost tables for znver1 were different enough to cause different instructions/registers to be used.
I'm guessing that would be the 256-bit AVX (ymm) registers.

AIUI, the bug relies on xmmN and ymmN being distinct registers in LLVM that nevertheless overlap in hardware (and the patch adds overlap handling to one part of LLVM which was missing it).

If LLVM uses only 128-bit (xmm) registers for memcpys (or calls the C memcpy function), then there's no chance for the bug to occur.

@mati865
Contributor

mati865 commented Nov 26, 2019

IIRC someone (@mati865?) speculated a while back that the cost tables for znver1 were different enough to cause different instructions/registers to be used.

Yeah, we talked about it on Discord.

Did you ever learn why this only popped up on AMD Zen targets? The linked LLVM patch doesn't seem to touch target-specific code (that I see). I'm mostly asking this out of curiosity.

https://reviews.llvm.org/D70699 has a single test, and it uses -mcpu=znver1. So the bug was there for a long time but was only exposed in LLVM 9.
I think one of the recent optimisations, when paired with the znver1 scheduler (znver2 uses the same scheduler right now), generated code that triggered the faulty optimisation.

@pnkfelix
Member

pnkfelix commented Nov 29, 2019

https://reviews.llvm.org/D70699 has landed in llvm-project: llvm/llvm-project@9283681, but not yet in rust-lang's fork of llvm-project.

@ethanhs

ethanhs commented Dec 7, 2019

🎉 thank you all for your hard work on fixing this!

@mati865
Contributor

mati865 commented Dec 7, 2019

The fix will be available in the next nightly.
