
Openblas / Docker / macOS #662

@loretoparisi

Description

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

After installing via the `openblas_simple` example, OpenBLAS should be recognized by llama.cpp and reported with the `BLAS = 1` flag.

Current Behavior

Instead, `BLAS = 0` is reported when running llama.cpp:

..................................................................................................
llama_new_context_with_model: kv self size  = 1024.00 MB
llama_new_context_with_model: compute buffer total size =  153.47 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | 
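For scripted checks, the system-info line above can be parsed into a dict so the `BLAS` flag can be tested programmatically; a minimal sketch (the helper is hypothetical, not part of llama.cpp or llama-cpp-python):

```python
# Hypothetical helper: turn llama.cpp's "NAME = VALUE | ..." system-info
# line into a dict of int flags.
def parse_system_info(line: str) -> dict:
    flags = {}
    for field in line.split("|"):
        name, sep, value = field.partition("=")
        if sep:  # skip empty trailing fields
            flags[name.strip()] = int(value)
    return flags

info = parse_system_info("AVX = 0 | NEON = 1 | ARM_FMA = 1 | BLAS = 0 |")
assert info["BLAS"] == 0 and info["NEON"] == 1
```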

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware you are using, e.g. for Linux:
    Virtual (docker), macOS host

$ lscpu

root@c619ba76751b:/app# lscpu
Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              1
Core(s) per socket:              8
Socket(s):                       1
Vendor ID:                       0x00
Model:                           0
Stepping:                        0x0
BogoMIPS:                        48.00
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint
  • Operating System, e.g. for Linux:
Linux c619ba76751b 5.10.104-linuxkit #1 SMP PREEMPT Thu Mar 17 17:05:54 UTC 2022 aarch64 GNU/Linux

host:

Darwin Kernel Version 22.6.0
xnu-8796.141.3~6/RELEASE_ARM64_T6000 arm64
Apple MacBook Pro (M1 Pro), macOS Ventura 13.5
  • SDK version, e.g. for Linux:
Python 3.7.14
GNU Make 4.3
g++ (Debian 10.2.1-6) 10.2.1 20210110

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

Install command was

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes" pip install llama-cpp-python --verbose
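Since the build log further down repeatedly shows `make: pkg-config: No such file or directory`, it may be worth confirming that the tools the llama.cpp Makefile shells out to actually exist in the container before running pip. A small sketch (the helper name and tool list are my own; `which` is injectable so the check can be exercised without the real PATH):

```python
import shutil

# Return the subset of `tools` that cannot be found on PATH.
def missing_build_tools(tools, which=shutil.which):
    return [t for t in tools if which(t) is None]

# Inside the container one might run, e.g.:
# missing_build_tools(["cc", "g++", "make", "pkg-config"])
```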

The execution command from the logs was

  Command:
      /tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/cmake/data/bin/cmake /tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5/_skbuild/linux-aarch64-3.7/cmake-install -DPYTHON_VERSION_STRING:STRING=3.7.14 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/local/bin/python -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.7m.so -DPython_EXECUTABLE:PATH=/usr/local/bin/python -DPython_ROOT_DIR:PATH=/usr/local -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPython3_EXECUTABLE:PATH=/usr/local/bin/python -DPython3_ROOT_DIR:PATH=/usr/local -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes

I have manually compiled llama.cpp in the vendor folder using

make LLAMA_OPENBLAS=1

and the flags seem OK:

root@a7a114203a48:/app/llama-cpp-python/vendor/llama.cpp# make LLAMA_OPENBLAS=1
I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  aarch64
make: pkg-config: No such file or directory
I CFLAGS:   -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS 
I CXXFLAGS: -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS
make: pkg-config: No such file or directory
I LDFLAGS:  
I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110

g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/common.cpp -o common.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/console.cpp -o console.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/grammar-parser.cpp -o grammar-parser.o
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/main/main.cpp ggml.o llama.o common.o console.o grammar-parser.o k_quants.o ggml-alloc.o -o main 

====  Run ./main -h for help.  ====

make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize/quantize.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize-stats/quantize-stats.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize-stats 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/perplexity/perplexity.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o perplexity 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embedding/embedding.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embedding 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS pocs/vdot/vdot.cpp ggml.o k_quants.o ggml-alloc.o -o vdot 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/train-text-from-scratch/train-text-from-scratch.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o train-text-from-scratch 
examples/train-text-from-scratch/train-text-from-scratch.cpp:504:2: warning: extra ‘;’ [-Wpedantic]
  504 | };
      |  ^
examples/train-text-from-scratch/train-text-from-scratch.cpp:601:2: warning: extra ‘;’ [-Wpedantic]
  601 | };
      |  ^
examples/train-text-from-scratch/train-text-from-scratch.cpp: In function ‘ggml_tensor* llama_build_train_graphs(my_llama_model*, ggml_allocr*, ggml_context*, ggml_cgraph*, ggml_cgraph*, ggml_cgraph*, ggml_tensor**, ggml_tensor*, ggml_tensor*, int, int, bool, bool)’:
examples/train-text-from-scratch/train-text-from-scratch.cpp:739:68: warning: ‘kv_scale’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  739 |             struct ggml_tensor * t16_1 = ggml_scale_inplace        (ctx, t16_0, kv_scale);          set_name(t16_1, "t16_1"); assert_shape_4d(t16_1, N, N, n_head, n_batch);
      |                                          ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o convert-llama2c-to-ggml 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/simple/simple.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o simple 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/save-load-state/save-load-state.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o save-load-state 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -Iexamples/server examples/server/server.cpp ggml.o llama.o common.o grammar-parser.o k_quants.o ggml-alloc.o -o server  
make: pkg-config: No such file or directory
g++ --shared -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-lib.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o libembdinput.so 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-test.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embd-input-test  -L. -lembdinput
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/gguf/gguf.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o gguf 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/llama-bench/llama-bench.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o llama-bench 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/baby-llama/baby-llama.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o baby-llama 
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/beam-search/beam-search.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o beam-search 
make: pkg-config: No such file or directory
cc -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS  -c tests/test-c.c -o tests/test-c.o

but at execution I still get:

AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | 

Failure Logs

No execution failure; BLAS is simply not recognized: `BLAS = 0` is reported when running llama.cpp.
