Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
After installing with the OpenBLAS example command, OpenBLAS should be recognized by llama.cpp and the startup log should report the BLAS = 1
flag.
Current Behavior
Instead, the startup log reports BLAS = 0
when running llama.cpp:
..................................................................................................
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 153.47 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
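For reference, a small helper (the function name is my own, for illustration) can parse llama.cpp's system-info line into a dict of feature flags, so the BLAS state can be checked programmatically instead of by eye:

```python
# Parse llama.cpp's "FEATURE = 0|1" system-info line into a dict.
def parse_system_info(line: str) -> dict:
    flags = {}
    for field in line.split("|"):
        name, sep, value = field.partition("=")
        if sep:  # skip the empty field after the trailing "|"
            flags[name.strip()] = int(value)
    return flags

# The line copied from the log above.
line = ("AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | "
        "FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | "
        "WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |")
info = parse_system_info(line)
assert info["BLAS"] == 0 and info["NEON"] == 1  # matches the log above
```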
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
- Physical (or virtual) hardware you are using, e.g. for Linux:
Virtual (docker), macOS host
root@c619ba76751b:/app# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
Vendor ID: 0x00
Model: 0
Stepping: 0x0
BogoMIPS: 48.00
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp flagm2 frint
- Operating System, e.g. for Linux:
Linux c619ba76751b 5.10.104-linuxkit #1 SMP PREEMPT Thu Mar 17 17:05:54 UTC 2022 aarch64 GNU/Linux
host:
Darwin Kernel Version 22.6.0
xnu-8796.141.3~6/RELEASE_ARM64_T6000 arm64
Apple MacBook M1 Pro macOS Ventura 13.5
- SDK version, e.g. for Linux:
Python 3.7.14
GNU Make 4.3
g++ (Debian 10.2.1-6) 10.2.1 20210110
Failure Information (for bugs)
Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.
Steps to Reproduce
The install command was:
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes" pip install llama-cpp-python --verbose
The CMake command from the pip build logs was:
/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/cmake/data/bin/cmake /tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wq1_6mxy/llama-cpp-python_b18d4faf00a94f709dd8e3edf4f449a5/_skbuild/linux-aarch64-3.7/cmake-install -DPYTHON_VERSION_STRING:STRING=3.7.14 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/local/bin/python -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.7m.so -DPython_EXECUTABLE:PATH=/usr/local/bin/python -DPython_ROOT_DIR:PATH=/usr/local -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DPython3_EXECUTABLE:PATH=/usr/local/bin/python -DPython3_ROOT_DIR:PATH=/usr/local -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.7m -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-b5uw3ahb/overlay/lib/python3.7/site-packages/ninja/data/bin/ninja -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS -DLLAMA_OPENBLAS=yes
I have also manually compiled llama.cpp in the vendor folder using
make LLAMA_OPENBLAS=1
and the compile flags seem to be OK (-DGGML_USE_OPENBLAS is present):
root@a7a114203a48:/app/llama-cpp-python/vendor/llama.cpp# make LLAMA_OPENBLAS=1
I llama.cpp build info:
I UNAME_S: Linux
I UNAME_P: unknown
I UNAME_M: aarch64
make: pkg-config: No such file or directory
I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS
I CXXFLAGS: -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS
make: pkg-config: No such file or directory
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110
I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/common.cpp -o common.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/console.cpp -o console.o
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -c common/grammar-parser.cpp -o grammar-parser.o
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/main/main.cpp ggml.o llama.o common.o console.o grammar-parser.o k_quants.o ggml-alloc.o -o main
==== Run ./main -h for help. ====
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize/quantize.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/quantize-stats/quantize-stats.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o quantize-stats
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/perplexity/perplexity.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o perplexity
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embedding/embedding.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embedding
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS pocs/vdot/vdot.cpp ggml.o k_quants.o ggml-alloc.o -o vdot
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/train-text-from-scratch/train-text-from-scratch.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o train-text-from-scratch
examples/train-text-from-scratch/train-text-from-scratch.cpp:504:2: warning: extra ‘;’ [-Wpedantic]
504 | };
| ^
examples/train-text-from-scratch/train-text-from-scratch.cpp:601:2: warning: extra ‘;’ [-Wpedantic]
601 | };
| ^
examples/train-text-from-scratch/train-text-from-scratch.cpp: In function ‘ggml_tensor* llama_build_train_graphs(my_llama_model*, ggml_allocr*, ggml_context*, ggml_cgraph*, ggml_cgraph*, ggml_cgraph*, ggml_tensor**, ggml_tensor*, ggml_tensor*, int, int, bool, bool)’:
examples/train-text-from-scratch/train-text-from-scratch.cpp:739:68: warning: ‘kv_scale’ may be used uninitialized in this function [-Wmaybe-uninitialized]
739 | struct ggml_tensor * t16_1 = ggml_scale_inplace (ctx, t16_0, kv_scale); set_name(t16_1, "t16_1"); assert_shape_4d(t16_1, N, N, n_head, n_batch);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o convert-llama2c-to-ggml
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/simple/simple.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o simple
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/save-load-state/save-load-state.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o save-load-state
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS -Iexamples/server examples/server/server.cpp ggml.o llama.o common.o grammar-parser.o k_quants.o ggml-alloc.o -o server
make: pkg-config: No such file or directory
g++ --shared -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-lib.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o libembdinput.so
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/embd-input/embd-input-test.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o embd-input-test -L. -lembdinput
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/gguf/gguf.cpp ggml.o llama.o k_quants.o ggml-alloc.o -o gguf
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/llama-bench/llama-bench.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o llama-bench
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/baby-llama/baby-llama.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o baby-llama
make: pkg-config: No such file or directory
g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -Wno-format-truncation -pthread -mcpu=native -DGGML_USE_K_QUANTS examples/beam-search/beam-search.cpp ggml.o llama.o common.o k_quants.o ggml-alloc.o -o beam-search
make: pkg-config: No such file or directory
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Wno-unused-function -pthread -mcpu=native -DGGML_USE_K_QUANTS -DGGML_USE_OPENBLAS -c tests/test-c.c -o tests/test-c.o
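The repeated "make: pkg-config: No such file or directory" messages and the empty "I LDFLAGS:" line above look suspicious: llama.cpp's Makefile at this point appears to use pkg-config to discover the OpenBLAS compile and link flags, so without pkg-config the binary can end up compiled with -DGGML_USE_OPENBLAS but never linked against libopenblas. A sketch (the helper name is my own) that detects this pattern in a build log:

```python
# Heuristic check: the build defines GGML_USE_OPENBLAS but, because pkg-config
# is missing, LDFLAGS stays empty and the OpenBLAS library is never linked.
def missing_blas_link(log: str) -> bool:
    pkg_config_failed = "pkg-config: No such file or directory" in log
    ldflags_empty = any(
        line.strip() == "I LDFLAGS:" for line in log.splitlines()
    )
    return pkg_config_failed and ldflags_empty

# Excerpt copied from the build log above.
log = """make: pkg-config: No such file or directory
I LDFLAGS:
I CC: cc (Debian 10.2.1-6) 10.2.1 20210110"""
assert missing_blas_link(log)  # matches the log above
```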
but at execution I am still getting
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
Failure Logs
There is no execution failure as such; BLAS is simply not recognized, and the log shows BLAS = 0
when running llama.cpp.