Build fails with ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’ #7446

Closed
dreirund opened this issue May 21, 2024 · 4 comments

@dreirund

dreirund commented May 21, 2024

I am on Artix GNU/Linux (rolling release) with GCC 14.1.1, and I am building ollama-vulkan, which pulls in and uses llama.cpp from this git repository.

When building, I get the error
ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’:

[...]
+ init_vars
+ case "${GOARCH}" in
+ ARCH=x86_64
+ LLAMACPP_DIR=../llama.cpp
+ CMAKE_DEFS=
+ CMAKE_TARGETS='--target ollama_llama_server'
+ echo ''
+ grep -- -g
+ CMAKE_DEFS='-DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off '
+ case $(uname -s) in
++ uname -s
+ LIB_EXT=so
+ WHOLE_ARCHIVE=-Wl,--whole-archive
+ NO_WHOLE_ARCHIVE=-Wl,--no-whole-archive
+ GCC_ARCH=
+ '[' -z '50;52;61;70;75;80' ']'
+ echo 'OLLAMA_CUSTOM_CPU_DEFS="
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF"'
OLLAMA_CUSTOM_CPU_DEFS="
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF"
+ CMAKE_DEFS='
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off '
+ BUILD_DIR=../build/linux/x86_64/cpu
+ echo 'Building custom CPU'
Building custom CPU
+ build
+ cmake -S ../llama.cpp -B ../build/linux/x86_64/cpu -DBUILD_TESTING=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr -DLLAMA_ACCELERATE=ON -DLLAMA_ALL_WARNINGS=OFF -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF -DLLAMA_FATAL_WARNINGS=OFF -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF -DLLAMA_LTO=OFF -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off
-- The C compiler identification is GNU 14.1.1
-- The CXX compiler identification is GNU 14.1.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.45.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found Vulkan: /lib/libvulkan.so (found version "1.3.285") found components: glslc glslangValidator
-- Vulkan found
-- ccache found, compilation results will be cached. Disable with LLAMA_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Found OpenSSL: /usr/lib/libcrypto.so (found version "3.3.0")
-- Configuring done (0.6s)
-- Generating done (0.1s)
-- Build files have been written to: /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/build/linux/x86_64/cpu
+ cmake --build ../build/linux/x86_64/cpu --target ollama_llama_server -j8
[  6%] Generating build details from Git
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-backend.c.o
[ 26%] Building C object CMakeFiles/ggml.dir/ggml-quants.c.o
[ 26%] Building CXX object CMakeFiles/ggml.dir/sgemm.cpp.o
[ 33%] Building CXX object CMakeFiles/ggml.dir/ggml-vulkan.cpp.o
-- Found Git: /usr/bin/git (found version "2.45.1")
[ 33%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 33%] Built target build_info
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp: In function ‘void ggml_vk_soft_max(ggml_backend_vk_context*, vk_context*, const ggml_tensor*, const ggml_tensor*, const ggml_tensor*, ggml_tensor*)’:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:4288:119: note: ‘#pragma message: TODO: src2 is no longer used in soft_max - should be removed and ALiBi calculation should be updated’
 4288 | #pragma message("TODO: src2 is no longer used in soft_max - should be removed and ALiBi calculation should be updated")
      |                                                                                                                       ^
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:4289:73: note: ‘#pragma message: ref:  https://github.com/ggerganov/llama.cpp/pull/7192’
 4289 | #pragma message("ref:  https://github.com/ggerganov/llama.cpp/pull/7192")
      |                                                                         ^
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp: In function ‘void ggml_vk_check_results_0(ggml_backend_vk_context*, ggml_compute_params*, ggml_tensor*)’:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’
 6880 |             tensor_clone = ggml_soft_max_ext(ggml_ctx, src0_clone, src1_clone, src2_clone, ((float *)tensor->op_params)[0], ((float *)tensor->op_params)[1]);
      |                                                                                ^~~~~~~~~~
      |                                                                                |
      |                                                                                ggml_tensor*
In file included from /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.h:3,
                 from /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:1:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml.h:1446:35: note:   initializing argument 4 of ‘ggml_tensor* ggml_soft_max_ext(ggml_context*, ggml_tensor*, ggml_tensor*, float, float)’
 1446 |             float                 scale,
      |             ~~~~~~~~~~~~~~~~~~~~~~^~~~~
make[3]: *** [CMakeFiles/ggml.dir/build.make:132: CMakeFiles/ggml.dir/ggml-vulkan.cpp.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:838: CMakeFiles/ggml.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:3322: ext_server/CMakeFiles/ollama_llama_server.dir/rule] Error 2
make: *** [Makefile:1336: ollama_llama_server] Error 2
llm/generate/generate_linux.go:3: running "bash": exit status 2
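
Looking at the compiler note, ggml_soft_max_ext in ggml.h now takes only (ctx, src0, src1, scale, max_bias), while the call at ggml-vulkan.cpp:6880 still passes a third tensor (src2_clone) ahead of the two floats. A minimal sketch of what the call would look like with that extra argument dropped (my own guess from the signature shown in the error, not a verified upstream patch; the #pragma message above already states that src2 is no longer used in soft_max):

    // ggml_vk_check_results_0(): ggml_soft_max_ext() no longer takes a src2 tensor,
    // so only the two cloned tensors plus the two float op_params are passed through.
    tensor_clone = ggml_soft_max_ext(ggml_ctx, src0_clone, src1_clone,
                                     ((float *) tensor->op_params)[0],   // presumably scale
                                     ((float *) tensor->op_params)[1]);  // presumably max_bias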

Regards!

archlinux-github pushed a commit to archlinux/aur that referenced this issue May 21, 2024
@0cc4m
Collaborator

0cc4m commented May 22, 2024

I already fixed that issue, so you just need to update the llama.cpp version ollama is using.

I also see that in the package you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.
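
For reference, a sketch of how the relevant part of the package's OLLAMA_CUSTOM_CPU_DEFS (as shown in the build trace above) could look with those debug options switched off; the remaining flags stay as they are:

    -DLLAMA_VULKAN=ON
    -DLLAMA_VULKAN_CHECK_RESULTS=OFF
    -DLLAMA_VULKAN_RUN_TESTS=OFF
    -DLLAMA_VULKAN_DEBUG=OFF
    -DLLAMA_VULKAN_VALIDATE=OFF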

@dreirund
Author

you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.

Disabling them makes it build. Thanks. (But it previously also built with these options set, so something seems to have broken anyway.)

Closing as "works for me".

@dreirund dreirund closed this as not planned May 22, 2024
@0cc4m
Collaborator

0cc4m commented May 22, 2024

you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.

Disabling them makes it build. Thanks. (But it previously also built with these options set, so something seems to have broken anyway.)

Closing as "works for me".

Yes, it will also build with those features using the latest code from this repo. You were using an older version, which contained a bug.

@dreirund
Author

You were using an older version, which contained a bug.

Yes, it is ollama, which manually uses an older version.
