Build fails with ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’ #7446

Closed
dreirund opened this issue May 21, 2024 · 4 comments

@dreirund

dreirund commented May 21, 2024

I am on Artix GNU/Linux (rolling release) with GCC 14.1.1, and I am building ollama-vulkan, which pulls in and uses llama.cpp from this git repository.

When building, I get the error
ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’:

[...]
+ init_vars
+ case "${GOARCH}" in
+ ARCH=x86_64
+ LLAMACPP_DIR=../llama.cpp
+ CMAKE_DEFS=
+ CMAKE_TARGETS='--target ollama_llama_server'
+ echo ''
+ grep -- -g
+ CMAKE_DEFS='-DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off '
+ case $(uname -s) in
++ uname -s
+ LIB_EXT=so
+ WHOLE_ARCHIVE=-Wl,--whole-archive
+ NO_WHOLE_ARCHIVE=-Wl,--no-whole-archive
+ GCC_ARCH=
+ '[' -z '50;52;61;70;75;80' ']'
+ echo 'OLLAMA_CUSTOM_CPU_DEFS="
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF"'
OLLAMA_CUSTOM_CPU_DEFS="
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF"
+ CMAKE_DEFS='
  -DBUILD_TESTING=ON
  -DCMAKE_BUILD_TYPE=Release
  -DCMAKE_INSTALL_PREFIX=/usr
  -DLLAMA_ACCELERATE=ON
  -DLLAMA_ALL_WARNINGS=OFF
  -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF
  -DLLAMA_FATAL_WARNINGS=OFF
  -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON
  -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON
  -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF
  -DLLAMA_LTO=OFF
  -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF 
  -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON
 -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off '
+ BUILD_DIR=../build/linux/x86_64/cpu
+ echo 'Building custom CPU'
Building custom CPU
+ build
+ cmake -S ../llama.cpp -B ../build/linux/x86_64/cpu -DBUILD_TESTING=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr -DLLAMA_ACCELERATE=ON -DLLAMA_ALL_WARNINGS=OFF -DLLAMA_ALL_WARNINGS_3RD_PARTY=OFF -DLLAMA_FATAL_WARNINGS=OFF -DLLAMA_AVX=ON -DLLAMA_AVX2=ON -DLLAMA_AVX512=ON -DLLAMA_AVX512_VBMI=ON -DLLAMA_AVX512_VNNI=ON -DLLAMA_F16C=ON -DLLAMA_FMA=ON -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DLLAMA_BUILD_TESTS=ON -DLLAMA_CPU_HBM=OFF -DLLAMA_CUBLAS=OFF -DLLAMA_CUDA=OFF -DLLAMA_HIPBLAS=OFF -DLLAMA_HIP_UMA=OFF -DLLAMA_METAL=OFF -DLLAMA_SYCL=OFF -DLLAMA_KOMPUTE=OFF -DLLAMA_LTO=OFF -DLLAMA_GPROF=OFF -DLLAMA_PERF=OFF -DLLAMA_SANITIZE_ADDRESS=OFF -DLLAMA_SANITIZE_THREAD=OFF -DLLAMA_SANITIZE_UNDEFINED=OFF -DLLAMA_SERVER_SSL=ON -DLLAMA_SERVER_VERBOSE=ON -DLLAMA_VULKAN=ON -DLLAMA_VULKAN_CHECK_RESULTS=ON -DLLAMA_VULKAN_DEBUG=OFF -DLLAMA_VULKAN_RUN_TESTS=ON -DLLAMA_VULKAN_VALIDATE=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=on -DCMAKE_BUILD_TYPE=Release -DLLAMA_SERVER_VERBOSE=off
-- The C compiler identification is GNU 14.1.1
-- The CXX compiler identification is GNU 14.1.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.45.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found Vulkan: /lib/libvulkan.so (found version "1.3.285") found components: glslc glslangValidator
-- Vulkan found
-- ccache found, compilation results will be cached. Disable with LLAMA_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Found OpenSSL: /usr/lib/libcrypto.so (found version "3.3.0")
-- Configuring done (0.6s)
-- Generating done (0.1s)
-- Build files have been written to: /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/build/linux/x86_64/cpu
+ cmake --build ../build/linux/x86_64/cpu --target ollama_llama_server -j8
[  6%] Generating build details from Git
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-backend.c.o
[ 26%] Building C object CMakeFiles/ggml.dir/ggml-quants.c.o
[ 26%] Building CXX object CMakeFiles/ggml.dir/sgemm.cpp.o
[ 33%] Building CXX object CMakeFiles/ggml.dir/ggml-vulkan.cpp.o
-- Found Git: /usr/bin/git (found version "2.45.1")
[ 33%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 33%] Built target build_info
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp: In function ‘void ggml_vk_soft_max(ggml_backend_vk_context*, vk_context*, const ggml_tensor*, const ggml_tensor*, const ggml_tensor*, ggml_tensor*)’:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:4288:119: note: ‘#pragma message: TODO: src2 is no longer used in soft_max - should be removed and ALiBi calculation should be updated’
 4288 | #pragma message("TODO: src2 is no longer used in soft_max - should be removed and ALiBi calculation should be updated")
      |                                                                                                                       ^
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:4289:73: note: ‘#pragma message: ref:  https://github.com/ggerganov/llama.cpp/pull/7192’
 4289 | #pragma message("ref:  https://github.com/ggerganov/llama.cpp/pull/7192")
      |                                                                         ^
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp: In function ‘void ggml_vk_check_results_0(ggml_backend_vk_context*, ggml_compute_params*, ggml_tensor*)’:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:6880:80: error: cannot convert ‘ggml_tensor*’ to ‘float’
 6880 |             tensor_clone = ggml_soft_max_ext(ggml_ctx, src0_clone, src1_clone, src2_clone, ((float *)tensor->op_params)[0], ((float *)tensor->op_params)[1]);
      |                                                                                ^~~~~~~~~~
      |                                                                                |
      |                                                                                ggml_tensor*
In file included from /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.h:3,
                 from /var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml-vulkan.cpp:1:
/var/cache/makepkg/build/ollama-nogpu-git/src/ollama-vulkan/llm/llama.cpp/ggml.h:1446:35: note:   initializing argument 4 of ‘ggml_tensor* ggml_soft_max_ext(ggml_context*, ggml_tensor*, ggml_tensor*, float, float)’
 1446 |             float                 scale,
      |             ~~~~~~~~~~~~~~~~~~~~~~^~~~~
make[3]: *** [CMakeFiles/ggml.dir/build.make:132: CMakeFiles/ggml.dir/ggml-vulkan.cpp.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:838: CMakeFiles/ggml.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:3322: ext_server/CMakeFiles/ollama_llama_server.dir/rule] Error 2
make: *** [Makefile:1336: ollama_llama_server] Error 2
llm/generate/generate_linux.go:3: running "bash": exit status 2
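
Looking at the compiler note, ggml_soft_max_ext in ggml.h now takes only (ctx, src0, src1, scale, max_bias), while the call at ggml-vulkan.cpp:6880 still passes a third tensor (src2_clone) ahead of the two floats. A minimal sketch of what the call would look like with that extra argument dropped (my own guess from the signature shown in the error, not a verified upstream patch; the #pragma message above already states that src2 is no longer used in soft_max):

    // ggml_vk_check_results_0(): ggml_soft_max_ext() no longer takes a src2 tensor,
    // so only the two cloned tensors plus the two float op_params are passed through.
    tensor_clone = ggml_soft_max_ext(ggml_ctx, src0_clone, src1_clone,
                                     ((float *) tensor->op_params)[0],   // presumably scale
                                     ((float *) tensor->op_params)[1]);  // presumably max_bias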

Regards!

archlinux-github pushed a commit to archlinux/aur that referenced this issue May 21, 2024
@0cc4m
Collaborator

0cc4m commented May 22, 2024

I already fixed that issue, so you just need to update the llama.cpp version ollama is using.

I also see that in the package you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.
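
For reference, a sketch of how the relevant part of the package's OLLAMA_CUSTOM_CPU_DEFS (as shown in the build trace above) could look with those debug options switched off; the remaining flags stay as they are:

    -DLLAMA_VULKAN=ON
    -DLLAMA_VULKAN_CHECK_RESULTS=OFF
    -DLLAMA_VULKAN_RUN_TESTS=OFF
    -DLLAMA_VULKAN_DEBUG=OFF
    -DLLAMA_VULKAN_VALIDATE=OFF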

@dreirund
Author

you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.

Disabling them makes it build. Thanks. (But it previously also built with these options set, so something seems to have broken anyway.)

Closing as "works for me".

@dreirund dreirund closed this as not planned May 22, 2024
@0cc4m
Collaborator

0cc4m commented May 22, 2024

you are enabling LLAMA_VULKAN_CHECK_RESULTS and LLAMA_VULKAN_RUN_TESTS. Both of those are debug flags for development and should not be active in a production release.

Disabling them makes it build. Thanks. (But it previously also built with these options set, so something seems to have broken anyway.)

Closing as "works for me".

Yes, it will also build with those features using the latest code from this repo. You were using an older version, which contained a bug.

@dreirund
Author

You were using an older version, which contained a bug.

Yes, it is ollama, which manually uses an older version.
