
Illegal instruction (core dumped) on Linux Virtual Machine (KVM) #63

Closed
Netherdrake opened this issue Mar 29, 2023 · 6 comments

@Netherdrake

user@gpt4:~/gpt4all/chat$ ./gpt4all-lora-quantized-linux-x86 
main: seed = 1680120667
llama_model_load: loading model from 'gpt4all-lora-quantized.bin' - please wait ...
Illegal instruction (core dumped)

dmesg shows:

[  104.211520] systemd[1]: systemd 249.11-0ubuntu3.7 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[  104.211578] systemd[1]: Detected virtualization kvm.
[  104.211582] systemd[1]: Detected architecture x86-64.
[ 5620.273116] show_signal: 22 callbacks suppressed
[ 5620.273119] traps: gpt4all-lora-qu[17654] trap invalid opcode ip:423d62 sp:7ffe451f4828 error:0 in gpt4all-lora-quantized-linux-x86[400000+55000]
[ 5647.501626] traps: gpt4all-lora-qu[17668] trap invalid opcode ip:423d62 sp:7fffdfc29678 error:0 in gpt4all-lora-quantized-linux-x86[400000+55000]

strace tail shows:

...
loading libs, reading gpt4all-lora-quantized.bin
...
brk(0x13d5000)                          = 0x13d5000
brk(0x13f6000)                          = 0x13f6000
read(3, "\0\0\340\245\244\2\0\0\0\321\220\3\0\0\0\341\276\266\3\0\0\0\342\236\226\3\0\0\0\345\272\247"..., 8191) = 8191
--- SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x423d62} ---
+++ killed by SIGILL (core dumped) +++
Illegal instruction (core dumped)

ILL_ILLOPN = Illegal operand. I suppose some CPU instruction is not available.

The CPU is AMD Epyc 7313, running Ubuntu 22.04 inside of a VM.

From the VM, the following cpu flags are enabled:

    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm rep_good nopl cpuid extd_apicid tsc_known_freq pni
                          cx16 x2apic hypervisor cmp_legacy 3dnowprefetch vmmcall
Virtualization features: 
  Hypervisor vendor:     KVM
  Virtualization type:   full
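
The absence of AVX in that list can be confirmed directly from inside the guest. A quick check (assuming a standard Linux /proc layout):

```shell
# List the SIMD extensions the guest CPU actually exposes; an empty
# result means any AVX instruction will raise SIGILL, as seen above.
grep -o -w -E 'avx2?|avx512[a-z]*|sse4_[12]' /proc/cpuinfo | sort -u
```

On this VM the command prints nothing AVX-related, which matches the trap in dmesg.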

Unfortunately I'm not very experienced with VMs, but I would like to run GPT chat on a server.

Is it possible to get the source of gpt4all-lora-quantized-linux-x86 to recompile?

@pepega007xd

Same problem here: execution stops in the ggml_type_sizef function on the vxorps %xmm0,%xmm0,%xmm0 instruction. I suppose this is because I'm running this on a very low-end CPU which doesn't support the AVX(?) instructions. Perhaps this is a bug in the ggml library, in the code for differentiating CPU features.

@qinidema

@Netherdrake you need AVX support for this particular instruction, and AVX2 as well (see issue 82). Since your CPU supports them, you need to pass those flags through your VM software.
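
How exactly the flags are passed through depends on the hypervisor. For KVM the usual approach is CPU passthrough; the commands below are a sketch, with the domain name as a placeholder:

```shell
# Plain QEMU/KVM: expose the host CPU model (including avx/avx2) to the guest.
qemu-system-x86_64 -enable-kvm -cpu host ...

# libvirt: set the guest's CPU mode to host-passthrough, i.e. put
#   <cpu mode='host-passthrough'/>
# in the domain XML, then power-cycle the guest.
virsh edit mydomain
```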

@qinidema

qinidema commented Mar 30, 2023

@pepega007xd

Perhaps this is a bug in the ggml library, in the code for differentiating CPU features.

Looks like you are right. See my other reply. This code basically amounts to "return true;": CPU features are decided at compile time, not at run time.
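
The compile-time nature of the check is easy to demonstrate with GCC: the AVX code paths in ggml are selected by preprocessor macros that are fixed when the binary is built (assuming gcc is available):

```shell
# With -mavx the compiler defines __AVX__, so any #ifdef __AVX__ branch
# is baked into the binary; without it, the branch is compiled out.
gcc -mavx -dM -E - </dev/null | grep -w __AVX__    # prints: #define __AVX__ 1
gcc        -dM -E - </dev/null | grep -w __AVX__   # prints nothing (exit 1)
```

So a runtime "feature check" compiled with -mavx always reports AVX as present, even on a CPU (or VM) that lacks it.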

@finaldie

I have a similar issue in another project (privateGPT), and here is my solution as a reference.

My environment is a CPU E5-2680v4 + PVE VM. The root cause is that the application crashes because it cannot execute avx/avx2 instructions. There are two directions to deal with this:

  1. [Not lucky] If CPU does not support avx/avx2, then refer to https://tech.amikelive.com/node-887/how-to-resolve-error-illegal-instruction-core-dumped-when-running-import-tensorflow-in-a-python-program/
  2. [Lucky] If CPU does support avx/avx2, but VM has no avx/avx2 flags, then simply pass avx/avx2 flags into VM.

My issue fits solution (2). From the problem statement above, the CPU "AMD Epyc 7313" supports avx/avx2; the only issue is that the VM does not expose the correct CPU flags, so it should also fit solution (2). I don't know which VM provider you are using; you may need to pass the flags accordingly.
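
For Proxmox VE specifically (the environment mentioned above), solution (2) means switching the VM's CPU type from the default emulated model to "host"; the VM ID 100 below is a placeholder:

```shell
# Proxmox VE: pass the host CPU model (and its avx/avx2 flags) to VM 100,
# then power-cycle so the new CPU model takes effect.
qm set 100 --cpu host
qm stop 100 && qm start 100
```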

Hope it helps a little.


@mibtim

mibtim commented Jun 11, 2023

@niansa
Collaborator

niansa commented Aug 10, 2023

Stale

@niansa closed this as not planned (stale) Aug 10, 2023