Build from ziglang.org fails on older processor (AMD A8-3500M) w/o SSE4.1 support #8562

Closed
frmdstryr opened this issue Apr 17, 2021 · 21 comments
Labels
arch-x86_64, bug (Observed behavior contradicts documented or intended behavior), os-linux, zig build system

@frmdstryr
Contributor

frmdstryr commented Apr 17, 2021

Starting today, running zig from snap just exits with "Illegal instruction". It had been working fine for several months.

$ snap install zig --classic --edge
zig (edge) 0.8.0-dev.1975+01a136585 from Jay Petacat (jayschwa) installed
(base) $ zig
Illegal instruction

With gdb...

$ sudo gdb --args zig
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from zig...
(No debugging symbols found in zig)
(gdb) run
Starting program: /snap/bin/zig 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff5b37700 (LWP 121866)]
[New Thread 0x7ffff5336700 (LWP 121867)]
[New Thread 0x7ffff4b35700 (LWP 121868)]
[New Thread 0x7fffeffff700 (LWP 121869)]
[New Thread 0x7fffef7fe700 (LWP 121870)]
[New Thread 0x7fffeeffd700 (LWP 121871)]
[Thread 0x7fffeeffd700 (LWP 121871) exited]
[Thread 0x7fffef7fe700 (LWP 121870) exited]
[Thread 0x7fffeffff700 (LWP 121869) exited]
[Thread 0x7ffff4b35700 (LWP 121868) exited]
[Thread 0x7ffff5336700 (LWP 121867) exited]
[Thread 0x7ffff5b37700 (LWP 121866) exited]
process 121860 is executing new program: /snap/snapd/11588/usr/bin/snap
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7d48700 (LWP 121872)]
[New Thread 0x7ffff7547700 (LWP 121873)]
[New Thread 0x7ffff6d46700 (LWP 121874)]
[New Thread 0x7ffff6545700 (LWP 121875)]
[New Thread 0x7ffff5d44700 (LWP 121876)]
[Detaching after vfork from child process 121877]
[New Thread 0x7ffff5363700 (LWP 121878)]
[Thread 0x7ffff5d44700 (LWP 121876) exited]
[Thread 0x7ffff6545700 (LWP 121875) exited]
[Thread 0x7ffff6d46700 (LWP 121874) exited]
[Thread 0x7ffff7d48700 (LWP 121872) exited]
[Thread 0x7ffff7d99740 (LWP 121860) exited]
[Thread 0x7ffff7547700 (LWP 121873) exited]
[New LWP 121860]
process 121860 is executing new program: /snap/snapd/11588/usr/lib/snapd/snap-confine
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 121860 is executing new program: /snap/snapd/11588/usr/lib/snapd/snap-exec
[New LWP 121890]
[New LWP 121891]
[New LWP 121892]
[New LWP 121893]
[New LWP 121894]
[LWP 121894 exited]
[LWP 121893 exited]
[LWP 121892 exited]
[LWP 121891 exited]
[LWP 121890 exited]
process 121860 is executing new program: /snap/zig/3274/zig

Thread 14 "zig" received signal SIGILL, Illegal instruction.
0x0000000006d8838d in p_bracket ()
(gdb) bt
#0  0x0000000006d8838d in p_bracket ()
#1  0x0000000006d85262 in p_ere ()
#2  0x0000000006d85651 in p_ere ()
#3  0x0000000006d849ff in llvm_regcomp ()
#4  0x0000000006d38c85 in llvm::Regex::Regex(llvm::StringRef, llvm::Regex::RegexFlags) ()
#5  0x0000000005921130 in _GLOBAL__sub_I_PassBuilder.cpp ()
#6  0x0000000006e2d4e0 in libc_start_init ()
#7  0x0000000006e2d526 in __libc_start_main ()
#8  0x0000000000000001 in ?? ()
#9  0x00000000029fd200 in ?? ()
#10 0x0000000006e2d544 in libc_start_main_stage2 ()
#11 0x0000000006e2d526 in __libc_start_main ()
#12 0x0000000000000000 in ?? ()
(gdb) 

Edit: This is on an old laptop.

Linux hp 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 18
model           : 1
model name      : AMD A8-3500M APU with Radeon(tm) HD Graphics
stepping        : 0
microcode       : 0x3000027
cpu MHz         : 886.550
cache size      : 1024 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt cpb hw_pstate vmmcall arat npt lbrv svm_lock nrip_save pausefilter
bugs            : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
bogomips        : 2994.61
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate cpb

@jedisct1
Contributor

/cc @jayschwa

@jayschwa
Sponsor Contributor

The cronjob had stopped working for about a month and I kicked it loose a day or two ago. I did not see a crash on my own system (Linux x86-64) at the time.

I'm not at the dev machine to poke at it at the moment, but does the same thing happen with the downloaded tarball from the website?

@g-w1
Contributor

g-w1 commented Apr 17, 2021

In gdb can you see the cause of the illegal instruction? Is it ud2 or actually an illegal instruction?

@jayschwa
Sponsor Contributor

jayschwa commented Apr 18, 2021

I'm unable to reproduce on x86-64 with version 0.8.0-dev.1975+01a136585.

The following is a hash to check if your installation is somehow corrupted:

$ sha256sum $(which zig)
4079e6764996bd4a66ae4ddd0f499a15b2a2b9b8f11bae42334005c4cd076015  /snap/bin/zig
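
Where sha256sum is not available, the same comparison can be done with a few lines of Python. This is only a sketch using the standard hashlib module; the helper names sha256_of and matches are made up for illustration and are not part of any Zig or snap tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches with the path of the installed binary and the hash quoted above would return True for an uncorrupted installation of that exact build.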

@frmdstryr
Contributor Author

The sha256 hash matches.

This is the instruction:

   0x6d88389 <p_bracket+3481>      pxor   %xmm4,%xmm6
-->0x6d8838d <p_bracket+3485>      pmovzxbd %xmm6,%xmm6
   0x6d88392 <p_bracket+3490>      pand   %xmm5,%xmm6

@jayschwa
Sponsor Contributor

It looks like that instruction was introduced in SSE4.1 and your older processor does not support it.

I guess the next thing to check is if Zig's default builds can have their target ratcheted down.
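
One way to confirm this on a given machine is to look for the sse4_1 flag in /proc/cpuinfo. The following is a minimal Python sketch, not part of Zig or its tooling. The function names are hypothetical, and it assumes the Linux flag name sse4_1. Note that the sse4a flag in the report above is a separate AMD extension and does not imply SSE4.1.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_sse41(cpuinfo_text):
    # Linux reports SSE4.1 as "sse4_1"; "sse4a" is a different extension.
    return "sse4_1" in cpu_flags(cpuinfo_text)


# Shortened excerpt of the flags line from the report above.
sample = "flags\t\t: fpu sse sse2 pni cx16 popcnt lahf_lm abm sse4a misalignsse\n"
print(supports_sse41(sample))  # prints False
```

On a real system the text would come from reading /proc/cpuinfo instead of the sample string.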

@jayschwa jayschwa changed the title from "Is the snap package currently broken?" to "Build from ziglang.org fails on older processor (AMD A8-3500M) w/o SSE4.1 support" Apr 18, 2021
@jayschwa
Sponsor Contributor

It looks like the CI scripts are targeting the "baseline" CPU, so I would not expect it to emit SSE4.1 instructions in that case. I'm going to tentatively mark this as a build system bug.

@jayschwa jayschwa added the bug (Observed behavior contradicts documented or intended behavior) and zig build system labels Apr 18, 2021
@jayschwa jayschwa added this to the 0.8.0 milestone Apr 18, 2021
@andrewrk
Member

andrewrk commented Apr 18, 2021

With the merge of the LLVM12 branch, I bumped up our CI tarballs to x86_64_v2. I did not expect anyone to have hardware old enough that this would be an issue. Since it is, I will be happy to revert that change and put it back to x86_64 baseline. Thanks for the report.

@jayschwa
Sponsor Contributor

> With the merge of the LLVM12 branch, I bumped up our CI tarballs to x86_64_v2.

I only see x86_64_v2 used in the macos_script, so it's not clear to me how that applies to the Linux builds. What am I missing?

@frmdstryr
Contributor Author

frmdstryr commented Apr 18, 2021

Thanks for the explanation. The laptop is 10 years old, so it's not a big deal. I can also just build it from source.

@andrewrk
Member

The Linux CI job downloads a tarball with prebuilt LLVM, clang, and LLD that I made on my computer with x86_64_v2 and uploaded. It's a mistake that I did not also set the CPU to that for the zig code. The stack trace we see above is in LLVM code, so that checks out.
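
For reference, the x86-64-v2 level adds CMPXCHG16B, LAHF/SAHF, POPCNT, SSE3, SSSE3, SSE4.1 and SSE4.2 on top of the baseline. The Python sketch below shows how one might compare /proc/cpuinfo flags against that level. The mapping to Linux flag names (cx16, lahf_lm, popcnt, pni, ssse3, sse4_1, sse4_2) and the helper name missing_for_v2 are assumptions made for illustration, not something taken from the Zig CI scripts.

```python
# Assumed Linux /proc/cpuinfo names for the features x86-64-v2 adds to the baseline.
X86_64_V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "ssse3", "sse4_1", "sse4_2"}


def missing_for_v2(flags):
    """Return the sorted v2 feature flags that are absent from `flags`."""
    return sorted(X86_64_V2_FLAGS - set(flags))


# Shortened excerpt of the flags reported for the AMD A8-3500M above.
reported = "fpu sse sse2 pni monitor cx16 popcnt lahf_lm abm sse4a misalignsse".split()
print(missing_for_v2(reported))  # prints ['sse4_1', 'sse4_2', 'ssse3']
```

An empty list would mean that a binary built for x86_64_v2 should not hit an illegal instruction from these particular extensions.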

@heidezomp
Contributor

I am using Zig on a similarly old netbook and was bitten by this as well. I just want to say thanks to Andrew for going out of your way to include folks who are still using old hardware 😄

@andrewrk andrewrk modified the milestones: 0.8.0, 0.8.1 Jun 4, 2021
@andrewrk andrewrk pinned this issue Jun 8, 2021
@mil
Contributor

mil commented Jul 4, 2021

Same issue here - wrote lots of zig code on an old x86_64 thinkpad, and zig started saying illegal instruction with 0.8, whereas 0.7.1 works a-ok. Will be happy to see binary support for pre-SSE4 restored.

@dullbananas

@andrewrk when will it be fixed?

@andrewrk
Member

andrewrk commented Jul 6, 2021

With the release of 0.8.1. It will at least be after the release of LLVM 12.0.1.

@justjosias
Contributor

Same issue on an old Thinkpad. Now that LLVM 12.0.1 is released, will this be fixed in master soon?

@wizzard0

Same issue with the i386 0.8.0 release and pre-0.9.0 master built from ziglang/zig-bootstrap@a52b6be, on qemu-UTM and iSH virtual CPUs.

@wizzard0

@justjosias @frmdstryr try ./build -j1 i386-linux-musl k6, or use another old CPU in place of k6.

If you don't have llc handy, then use the list from e.g. https://discuss.tvm.apache.org/t/how-can-i-replace-llvm-with-the-correct-target-of-my-cpu/8668?print

@dullbananas note this doesn't help for iSH, as it segfaults later; see ish-app/ish#1495

@ldearquer

Same issue here, using an AMD Phenom II X6 1100T; can't run 0.8.0 or master.

@andrewrk andrewrk removed this from the 0.8.1 milestone Aug 31, 2021
@andrewrk andrewrk added this to the 0.9.1 milestone Aug 31, 2021
@andrewrk
Member

Fixed with the release of 0.8.1

@andrewrk andrewrk modified the milestones: 0.9.1, 0.8.1 Sep 14, 2021
@andrewrk andrewrk unpinned this issue Sep 14, 2021
@frmdstryr
Contributor Author

Thank you!
