Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
After installing with the OpenBLAS example command, OpenBLAS should be recognized by llama.cpp and the startup log should report the BLAS = 1
flag.
Current Behavior
Instead, the startup log reports BLAS = 0
when running llama.cpp:
..................................................................................................
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 153.47 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
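For reference, a small helper (the function name is my own, for illustration) can parse llama.cpp's system-info line into a dict of feature flags, so the BLAS state can be checked programmatically instead of by eye:

```python
# Parse llama.cpp's "FEATURE = 0|1" system-info line into a dict.
def parse_system_info(line: str) -> dict:
    flags = {}
    for field in line.split("|"):
        name, sep, value = field.partition("=")
        if sep:  # skip the empty field after the trailing "|"
            flags[name.strip()] = int(value)
    return flags

# The line copied from the log above.
line = ("AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | "
        "FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | "
        "WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |")
info = parse_system_info(line)
assert info["BLAS"] == 0 and info["NEON"] == 1  # matches the log above
```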
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
- Physical (or virtual) hardware you are using, e.g. for Linux:
Virtual (docker), macOS host
root@c619ba76751b:/app# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
Vendor ID: 0x00
Model: 0
Stepping: 0x0
BogoMIPS: 48.00
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint
- Operating System, e.g. for Linux:
Linux c619ba76751b 5.10.104-linuxkit #1 SMP PREEMPT Thu Mar 17 17:05:54 UTC 2022 aarch64 GNU/Linux
host:
Darwin Kernel Version 22.6.0
xnu-8796.141.3~6/RELEASE_ARM64_T6000 arm64
Apple MacBook M1 Pro macOS Ventura 13.5
- SDK version, e.g. for Linux:
Python 3.7.14
GNU Make 4.3
g++ (Debian 10.2.1-6) 10.2.1 20210110
Failure Information (for bugs)
Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.
Steps to Reproduce
The install command was:
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes" pip install llama-cpp-python --verbose
The CMake command from the pip build logs was:
/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/cmake/data/bin/cmake /tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5/_skbuild/linux-aarch64-3.7/cmake-install -DPYTHON_VERSION_STRING:STRING=3.7.14 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/local/bin/python -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.7m.so -DPython_EXECUTABLE:PATH=/usr/local/bin/python -DPython_ROOT_DIR:PATH=/usr/local -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPython3_EXECUTABLE:PATH=/usr/local/bin/python -DPython3_ROOT_DIR:PATH=/usr/local -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes
I have also manually compiled llama.cpp in the vendor folder using
make LLAMA_OPENBLAS=1
and the compile flags seem to be OK (-DGGML_USE_OPENBLAS is present):
root@a7a114203a48:/app/llama-cpp-python/vendor/llama.cpp# make LLAMA_OPENBLAS=1
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: aarch64
make: pkg-config: No such file or directory
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS
I CXXFLAGS: -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS
make: pkg-config: No such file or directory
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/common.cpp -o common.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/console.cpp -o console.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/grammar-parser.cpp -o grammar-parser.o
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/main/main.cpp ggml.o llama.o common.o console.o grammar-parser.o k_quants.o ggml-alloc.o -o main
==== Run ./main -h for help. ====
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize/quantize.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize-stats/quantize-stats.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize-stats
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/perplexity/perplexity.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o perplexity
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embedding/embedding.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embedding
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS pocs/vdot/vdot.cpp ggml.o k_quants.o ggml-alloc.o -o vdot
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/train-text-from-scratch/train-text-from-scratch.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o train-text-from-scratch
examples/train-text-from-scratch/train-text-from-scratch.cpp:504:2: warning: extra ‘;’ [-Wpedantic]
504 | };
| ^
examples/train-text-from-scratch/train-text-from-scratch.cpp:601:2: warning: extra ‘;’ [-Wpedantic]
601 | };
| ^
examples/train-text-from-scratch/train-text-from-scratch.cpp: In function ‘ggml_tensor* llama_build_train_graphs(my_llama_model*, ggml_allocr*, ggml_context*, ggml_cgraph*, ggml_cgraph*, ggml_cgraph*, ggml_tensor**, ggml_tensor*, ggml_tensor*, int, int, bool, bool)’:
examples/train-text-from-scratch/train-text-from-scratch.cpp:739:68: warning: ‘kv_scale’ may be used uninitialized in this function [-Wmaybe-uninitialized]
739 | struct ggml_tensor * t16_1 = ggml_scale_inplace (ctx, t16_0, kv_scale); set_name(t16_1, "t16_1"); assert_shape_4d(t16_1, N, N, n_head, n_batch);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o convert-llama2c-to-ggml
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/simple/simple.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o simple
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/save-load-state/save-load-state.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o save-load-state
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -Iexamples/server examples/server/server.cpp ggml.o llama.o common.o grammar-parser.o k_quants.o ggml-alloc.o -o server
make: pkg-config: No such file or directory
g++ --shared -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-lib.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o libembdinput.so
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-test.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embd-input-test -L. -lembdinput
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/gguf/gguf.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o gguf
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/llama-bench/llama-bench.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o llama-bench
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/baby-llama/baby-llama.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o baby-llama
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/beam-search/beam-search.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o beam-search
make: pkg-config: No such file or directory
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS -c tests/test-c.c -o tests/test-c.o
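The repeated "make: pkg-config: No such file or directory" messages and the empty "I LDFLAGS:" line above look suspicious: llama.cpp's Makefile at this point appears to use pkg-config to discover the OpenBLAS compile and link flags, so without pkg-config the binary can end up compiled with -DGGML_USE_OPENBLAS but never linked against libopenblas. A sketch (the helper name is my own) that detects this pattern in a build log:

```python
# Heuristic check: the build defines GGML_USE_OPENBLAS but, because pkg-config
# is missing, LDFLAGS stays empty and the OpenBLAS library is never linked.
def missing_blas_link(log: str) -> bool:
    pkg_config_failed = "pkg-config: No such file or directory" in log
    ldflags_empty = any(
        line.strip() == "I LDFLAGS:" for line in log.splitlines()
    )
    return pkg_config_failed and ldflags_empty

# Excerpt copied from the build log above.
log = """make: pkg-config: No such file or directory
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110"""
assert missing_blas_link(log)  # matches the log above
```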
but at execution I am still getting
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
Failure Logs
There is no execution failure as such; BLAS is simply not recognized, and the log shows BLAS = 0
when running llama.cpp.