
aarch64 CUDA build: ggml.h(309): error: identifier "half" is undefined #4922

Closed

SomeoneSerge opened this issue Jan 13, 2024 · 17 comments

@SomeoneSerge
Collaborator

SomeoneSerge commented Jan 13, 2024

Steps To Reproduce

Steps to reproduce the behavior:

  1. Attempt a native aarch64 build with CUDA support (only tested as nix build .#packages.aarch64-linux.jetson-xavier)

Build log

CI: https://github.com/ggerganov/llama.cpp/actions/runs/7514510149/job/20457461738#step:8:1499
Cleaner logs: https://gist.github.com/SomeoneSerge/33008b08b7bd887e994b7e52cd432af0

[14/105] Building CUDA object CMakeFiles/ggml.dir/ggml-cuda.cu.o
FAILED: CMakeFiles/ggml.dir/ggml-cuda.cu.o 
/nix/store/69di7mgz1c5864ghppzzidwv3vy1r3p7-cuda_nvcc-11.8.89/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/nix/store/a4vw7jhihwkh7zp6vj3cn8375phb31ds-gcc-wrapper-11.4.0/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_>
nvcc warning : incompatible redefinition for option 'compiler-bindir', the last value of this option was used
/build/source/ggml.h(309): error: identifier "half" is undefined

1 error detected in the compilation of "/build/source/ggml-cuda.cu".

Additional context

❯ git rev-parse HEAD
f172de03f11465dc6c5a0fc3a22f8ec254c6832c
❯ nix path-info --derivation .#packages.aarch64-linux.jetson-xavier --recursive | rg -- '(gcc|nvcc)-\d.*\d\.drv'
/nix/store/9zr86pkcj6cbba7g3kkqzg2smx3q74fc-xgcc-12.3.0.drv
/nix/store/px2vi9df2z1zk5qi2ql7phnbp8i0v011-gcc-12.3.0.drv
/nix/store/w9w0pii96jp5fjxafzky7bybyrdcr7bx-gcc-11.4.0.drv
/nix/store/y17s03wj6lzbp7rfrk87gvmp5sslwcgy-cuda_nvcc-11.8.89.drv
❯ # ^^^ uses gcc11 and cuda 11.8 for the build, and gcc12's libstdc++ for the link

Previous work and related issues

A related issue was encountered in #1455, and #2670 introduced the typedef at line 309.

The failure is at most three weeks old; a successful Xavier build was confirmed e.g. in #4605 (comment).

I'll run a bisect if/when I get access to an aarch64 builder.

@slaren
Collaborator

slaren commented Jan 13, 2024

Does it work if you replace half in ggml.h with __half?

@SomeoneSerge
Collaborator Author

@slaren thanks but no:

[14/105] Building CUDA object CMakeFiles/ggml.dir/ggml-cuda.cu.o
FAILED: CMakeFiles/ggml.dir/ggml-cuda.cu.o 
/nix/store/69di7mgz1c5864ghppzzidwv3vy1r3p7-cuda_nvcc-11.8.89/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/nix/store/a4vw7jhihwkh7zp6vj3cn8375phb31ds-gcc-wrapper-11.4.0/bin/c++ -DGGML_CUDA_DMMV_X=32 -DGGML_>
/build/source/ggml.h(309): error: identifier "__half" is undefined

@KyL0N
Contributor

KyL0N commented Jan 15, 2024

Hello there,
have you tried just running

make LLAMA_CUBLAS=1

What's the output?

I will test nix on my Jetson Orin Nano later.
Also, please check the JetPack version.

@planform

planform commented Jan 15, 2024

I had the same error; adding cuda_fp16.h to ggml.h fixed it:

#if defined(__ARM_NEON) && defined(__CUDACC__)
    #include <cuda_fp16.h>
#endif

#ifdef  __cplusplus
extern "C" {
#endif

#if defined(__ARM_NEON) && defined(__CUDACC__)
    typedef half ggml_fp16_t;
#elif defined(__ARM_NEON) && !defined(_MSC_VER)
    typedef __fp16 ggml_fp16_t;
#else
    typedef uint16_t ggml_fp16_t;
#endif

@KyL0N
Contributor

KyL0N commented Jan 15, 2024

Same error using make LLAMA_CUBLAS=1:

nvcc -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include -std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -mcpu=native -use_fast_math --forward-unknown-to-host-compiler -arch=sm_87 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Wno-pedantic -Xcompiler "-Wno-array-bounds -Wno-format-truncation -Wextra-semi" -c ggml-cuda.cu -o ggml-cuda.o
ggml.h(309): error: identifier "half" is undefined
ggml-cuda.cu(609): warning: function "warp_reduce_sum(half2)" was declared but never referenced
ggml-cuda.cu(630): warning: function "warp_reduce_max(half2)" was declared but never referenced

@planform adding this helps, but we don't know what change caused this compile error:

#include <cuda_fp16.h>

@KyL0N
Contributor

KyL0N commented Jan 15, 2024

I think the problem was introduced in #4766:

(screenshot of the diff from #4766)

That patch adds #include "ggml-cuda.h" before #include <cuda_fp16.h>,
so the solution is to move #include "ggml-cuda.h" after #include <cuda_fp16.h>.

The patch to fix this problem is below:

diff --git a/ggml-cuda.cu b/ggml-cuda.cu
index c3e14bc..a6e6751 100644
--- a/ggml-cuda.cu
+++ b/ggml-cuda.cu
@@ -12,9 +12,6 @@
 #include <vector>
 #include <map>
 #include <array>
-#include "ggml-cuda.h"
-#include "ggml.h"
-#include "ggml-backend-impl.h"
 
 #if defined(GGML_USE_HIPBLAS)
 #include <hip/hip_runtime.h>
@@ -118,6 +115,10 @@
 
 #endif // defined(GGML_USE_HIPBLAS)
 
+#include "ggml-cuda.h"
+#include "ggml.h"
+#include "ggml-backend-impl.h"
+
 #define CUDART_HMAX     11070 // CUDA 11.7, min. ver. for which __hmax and __hmax2 are known to work (may be higher than needed)
 
 #define CC_PASCAL     600

@ark626

ark626 commented Jan 15, 2024

After applying the patch above, and also the other patch to the ggml.h file in multiple locations (since this file is used in different copies during the build), I get the following error on a Jetson Xavier AGX with CUDA 11:

In file included from /mnt/external/LocalAI/backend/cpp/llama/llama.cpp/build/examples/grpc-server/backend.grpc.pb.cc:5: /mnt/external/LocalAI/backend/cpp/llama/llama.cpp/build/examples/grpc-server/backend.pb.h:17:2: error: #error This file was generated by an older version of protoc which is

My protoc version is libprotoc 3.6.1 and the build command I used is:

make BUILD_TYPE=cublas CUDA_LIBPATH=/usr/local/cuda-11.4/targets/aarch64-linux/lib/ BUILD_GRPC_FOR_BACKEND_LLAMA=ON build

nohup.out.txt

So it seems the fixes are working, but the protoc version is now an issue for the build.

@ms1design

@KyL0N's patch also works on the Jetson AGX Orin 👍

@ark626

ark626 commented Jan 15, 2024

@KyL0N's patch also works on the Jetson AGX Orin 👍

Did you also build the grpc server, or were you using an external one?
Can you share some details on how to set this up on a Jetson AGX currently?

@ms1design

ms1design commented Jan 15, 2024

@KyL0N's patch also works on the Jetson AGX Orin 👍

Did you also build the grpc server, or were you using an external one? Can you share some details on how to set this up on a Jetson AGX currently?

Sorry for the off-topic. The best option for you is to use https://github.com/dusty-nv/jetson-containers

@ark626 did you manage to install LocalAI? Do you have a Dockerfile?

@ark626

ark626 commented Jan 15, 2024

@KyL0N's patch also works on the Jetson AGX Orin 👍

Did you also build the grpc server, or were you using an external one? Can you share some details on how to set this up on a Jetson AGX currently?

Sorry for the off-topic. The best option for you is to use https://github.com/dusty-nv/jetson-containers

@ark626 did you manage to install LocalAI? Do you have a Dockerfile?

No, sadly not. I have been trying to install this for three days. I almost had a v1.25.0 build finished, but due to some repository force-pushing, the older builds don't seem to work anymore. Currently I'm trying to build it directly on a fresh install of a Jetson AGX Xavier, but since my knowledge of C builds is very limited, it's hard to figure out what's wrong exactly.

I will have a look at the jetson-containers, but I'm trying to install the complete localai.io, since I want to play around with it in Home Assistant as my own ChatGPT variant. But first I need to get at least some build to run.

@ms1design

I will have a look at the jetson-containers, but I'm trying to install the complete localai.io, since I want to play around with it in Home Assistant as my own ChatGPT variant. But first I need to get at least some build to run.

@ark626 in https://github.com/dusty-nv/jetson-containers you have https://github.com/oobabooga/text-generation-webui working out of the box on Jetson; this webui has an OpenAI-compatible API which you can use in the HA extension. I'm also working on my own extension ;)

I strongly recommend using Docker for these experiments; it's easier to manage dependencies & configurations.

@ark626

ark626 commented Jan 15, 2024

I will have a look at the jetson-containers, but I'm trying to install the complete localai.io, since I want to play around with it in Home Assistant as my own ChatGPT variant. But first I need to get at least some build to run.

@ark626 in https://github.com/dusty-nv/jetson-containers you have https://github.com/oobabooga/text-generation-webui working out of the box on Jetson; this webui has an OpenAI-compatible API which you can use in the HA extension. I'm also working on my own extension ;)

I strongly recommend using Docker for these experiments; it's easier to manage dependencies & configurations.

Thank you, very nice. I will have a look at it.
Yes, I know, but since I haven't worked with Docker and the Jetson for a long time, I hesitated to use Docker. Most probably I have some dependency wrong; I just didn't understand why I get the error when the installed protoc is the same version as the one used in the build.
I will definitely check it; thanks for the advice and the help.

@SomeoneSerge
Collaborator Author

@planform's patch is sufficient and seems to be minimal (ggml.h already has a preprocessor branch for __CUDACC__).

@planform @KyL0N would either of you like to open a PR, or should I?

protoc, ..., jetson-containers, docker

The more general answer is "use the correct protobuf version from any source that ships it" (e.g. another distribution, conda, a prebuilt multi-gigabyte docker image, or 🙃 Nixpkgs). I'll stop here and abstain from being a shill.

@twmht

twmht commented Jan 16, 2024

@KyL0N's solution works for my Jetson Nano :)

@KyL0N
Contributor

KyL0N commented Jan 20, 2024

@SomeoneSerge hello, PR #4975 has been merged.

@SomeoneSerge
Collaborator Author

Thanks @KyL0N! One can see a passing pipeline e.g. in https://github.com/ggerganov/llama.cpp/actions/runs/7611010381/job/20725572309

7 participants