cuda: fix compile error in jetson platform #4975

KyL0N · 2024-01-16T13:11:16Z

This PR fixes a compile error on the Jetson platform.
Related issue: #4922

The error was introduced by #4766.

#4766 added #include "ggml-cuda.h"
before #include <cuda_fp16.h>,

so the solution is to move #include "ggml-cuda.h" after #include <cuda_fp16.h>.
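
The order dependence described above can be reproduced without a CUDA toolkit. The following is a minimal sketch, assuming a stand-in `half` struct in place of the real type from <cuda_fp16.h>; only the typedef line mirrors what ggml.h does when compiled by nvcc on aarch64.

```cpp
// Stand-in for <cuda_fp16.h>. In a real CUDA build the toolkit header declares
// `half`; this struct is a hypothetical simplification used only for the demo.
struct half { unsigned short bits; };

// Stand-in for the ggml.h branch taken with nvcc on aarch64.
// It compiles only because `half` is already declared above. Moving the struct
// below this line fails with an error like: identifier "half" is undefined.
typedef half ggml_fp16_t;

static_assert(sizeof(ggml_fp16_t) == 2, "fp16 storage is expected to be 16 bits");
```

In ggml-cuda.cu the same rule means the CUDA headers come first and the ggml headers come last.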

cebtenzzre

It sounds like ggml.h should #include <cuda_fp16.h>, no? #includes shouldn't be sensitive to the order they are in.

KyL0N · 2024-01-17T00:33:38Z

> It sounds like ggml.h should #include <cuda_fp16.h>, no? #includes shouldn't be sensitive to the order they are in.

I prefer to #include <cuda_fp16.h> only in the .cu file.

You can check the ggml-cuda.cu changes in #4766.

JohannesGaessler · 2024-01-17T00:47:54Z

Regardless of what we prefer, if the header were to #include <cuda_fp16.h>, wouldn't that potentially cause other compilation issues? Right now you should be able to compile a program against the interface specified in ggml-cuda.h even without a CUDA toolkit installed, but I don't think that would be possible if the header itself included CUDA headers.
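
To illustrate the concern about toolkit-free consumers, here is a minimal sketch of an interface that exposes only standard types. All names (`demo_gpu_buffer`, `demo_gpu_buffer_new`, and so on) are hypothetical and are not part of ggml-cuda.h; the point is only that the declarations compile without any vendor header.

```cpp
#include <cstddef>

// ---- public interface: what a toolkit-free header could expose ----
// Only standard types appear here, so a consumer needs no vendor headers.
extern "C" {
    struct demo_gpu_buffer;  // opaque: layout is hidden from consumers
    demo_gpu_buffer * demo_gpu_buffer_new(size_t n_bytes);
    size_t            demo_gpu_buffer_size(const demo_gpu_buffer * buf);
    void              demo_gpu_buffer_free(demo_gpu_buffer * buf);
}

// ---- implementation: the only place that would need vendor headers ----
// In a real backend this part would live in a .cu file; here it is a plain
// host-side stand-in so the sketch runs anywhere.
struct demo_gpu_buffer { size_t n_bytes; };

extern "C" demo_gpu_buffer * demo_gpu_buffer_new(size_t n_bytes) {
    return new demo_gpu_buffer{n_bytes};
}

extern "C" size_t demo_gpu_buffer_size(const demo_gpu_buffer * buf) {
    return buf->n_bytes;
}

extern "C" void demo_gpu_buffer_free(demo_gpu_buffer * buf) {
    delete buf;
}
```

If the interface section included a CUDA header, every consumer would need the toolkit on its include path just to parse the declarations.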

KyL0N · 2024-01-17T01:12:47Z

That's OK,
because the CUDA headers should be included before ggml-cuda.h.

KyL0N · 2024-01-17T01:16:09Z

@slaren Hello, could you please check if these changes are okay, or if anything else is needed?

ggerganov · 2024-01-17T07:33:23Z

> It sounds like ggml.h should #include <cuda_fp16.h>, no?

I want to narrow down the GPU-related stuff in the core ggml.h and ggml.c. Adding a CUDA header there would be a step backwards in this regard.

Does Jetson platform not have __fp16?
Does this patch work:

diff --git a/ggml.h b/ggml.h
index 4c2ff6c6..9d165a86 100644
--- a/ggml.h
+++ b/ggml.h
@@ -305,9 +305,7 @@
 extern "C" {
 #endif
 
-#if defined(__ARM_NEON) && defined(__CUDACC__)
-    typedef half ggml_fp16_t;
-#elif defined(__ARM_NEON) && !defined(_MSC_VER)
+#if defined(__ARM_NEON) && !defined(_MSC_VER)
     typedef __fp16 ggml_fp16_t;
 #else
     typedef uint16_t ggml_fp16_t;

I'm not sure when the #if defined(__ARM_NEON) && defined(__CUDACC__) branch is actually true?

KyL0N · 2024-01-17T14:04:53Z

> Does Jetson platform not have __fp16?

This patch does not work: nvcc does not recognise __fp16.
The alternative is half, as I mentioned in #1455 (comment).

nvcc -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include -std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -mcpu=native -use_fast_math --forward-unknown-to-host-compiler -arch=sm_87 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Wno-pedantic -Xcompiler "-Wno-array-bounds -Wno-format-truncation -Wextra-semi" -c ggml-cuda.cu -o ggml-cuda.o
ggml.h(319): error: identifier "__fp16" is undefined

As for #if defined(__ARM_NEON) && defined(__CUDACC__): it is true when CUDA code is compiled on an aarch64 architecture, which is the case on the Jetson platform.

If we just fall back to typedef uint16_t ggml_fp16_t; it will output gibberish, as #1455 mentioned.
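
The thread does not spell out why the uint16_t fallback misbehaves. As a general illustration only (an assumption, not a diagnosis of #1455), the sketch below shows how the same 16 bits yield very different values depending on whether they are decoded as an IEEE half or converted as a plain integer. The helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Decode IEEE 754 binary16 bits into a float (portable, host-side).
static float fp16_bits_to_float(uint16_t h) {
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    const uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign;  // signed zero
        } else {
            // Subnormal half: shift until the implicit leading bit appears.
            int shift = 0;
            while ((mant & 0x400u) == 0) { mant <<= 1; ++shift; }
            mant &= 0x3FFu;
            bits = sign | (static_cast<uint32_t>(113 - shift) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);  // infinity or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);  // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// What an ordinary conversion does when the type is just an integer.
static float int_value_to_float(uint16_t h) {
    return static_cast<float>(h);
}
```

For the bit pattern 0x3C00, `fp16_bits_to_float` returns 1.0f while `int_value_to_float` returns 15360.0f, so code that relies on implicit conversions needs a genuine half type rather than an integer typedef.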

slaren

I am ok with this fix, but it would be good to add a comment explaining why these headers must be included last to prevent this from happening again.

cebtenzzre

Since there are two approvals I'll avoid blocking this, but my two cents: If we're going to have a __CUDACC__-specific typedef, why not have a __CUDACC__-specific #include in the same file? It just seems wrong to me that we would intentionally reference a symbol without including the header that declares it - ggml.h or otherwise.

It would be one thing if this was a #define, but the compiler error is on the typedef itself in this case.
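
The alternative suggested here, which was not adopted in this PR, could look roughly like the sketch below: the branch that uses `half` also pulls in the header that declares it. The `DEMO_FP16_BACKING` macro and `demo_fp16_backing` function are hypothetical additions that only report which branch was selected; the three conditions are the ones quoted from ggml.h in this thread.

```cpp
#include <cstdint>

#if defined(__ARM_NEON) && defined(__CUDACC__)
    // nvcc on aarch64 (e.g. Jetson): include the declaration next to its use.
    #include <cuda_fp16.h>
    typedef half ggml_fp16_t;
    #define DEMO_FP16_BACKING "half"
#elif defined(__ARM_NEON) && !defined(_MSC_VER)
    // Host compilers on ARM that provide the __fp16 extension.
    typedef __fp16 ggml_fp16_t;
    #define DEMO_FP16_BACKING "__fp16"
#else
    // Everything else stores the raw 16 bits.
    typedef uint16_t ggml_fp16_t;
    #define DEMO_FP16_BACKING "uint16_t"
#endif

// Reports which branch the current compiler and target selected.
inline const char * demo_fp16_backing() { return DEMO_FP16_BACKING; }
```

The trade-off raised elsewhere in the thread still applies: this makes the core header depend on a CUDA header whenever nvcc is the compiler.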

ggerganov · 2024-01-19T15:29:26Z

@KyL0N Please add a comment about why the headers are there and we can merge

KyL0N · 2024-01-20T06:47:34Z

> why not have a __CUDACC__-specific #include in the same file

In my view, if __CUDACC__ is defined, then nvcc must be the compiler. As a result, headers such as cuda_fp16.h or other CUDA-related files should be included before ggml.h.

ggerganov · 2024-01-20T07:01:40Z

Thanks for the discussion. IMO the fundamental issue is that ggml_fp16_t is exposed through the public ggml API in the first place. It's something to fix in the future, but for now I will merge this workaround.

* cuda: fix compile error in jetson platform
* cuda: update comment in ggml-cuda.cu
* cuda: update ggml-cuda.cu comment

cuda: fix compile error in jetson platform

26abc17

JohannesGaessler approved these changes Jan 16, 2024

cebtenzzre requested changes Jan 16, 2024

slaren approved these changes Jan 17, 2024

cebtenzzre approved these changes Jan 19, 2024

KyL0N added 2 commits January 20, 2024 14:10

cuda: update comment in ggml-cuda.cu

beb1c93

cuda: update ggml-cuda.cu comment

847019e

ggerganov merged commit cca894f into ggerganov:master Jan 20, 2024
35 of 43 checks passed

KyL0N mentioned this pull request Jan 20, 2024

aarch64 CUDA build: ggml.h(309): error: identifier "half" is undefined #4922

Closed

KyL0N deleted the compile-patch branch January 21, 2024 15:54

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024

cuda : fix compile error in jetson platform (ggerganov#4975)

dec22a7

* cuda: fix compile error in jetson platform
* cuda: update comment in ggml-cuda.cu
* cuda: update ggml-cuda.cu comment

remy415 mentioned this pull request Mar 6, 2024

Add support for libcudart.so for CUDA devices (Adds Jetson support) ollama/ollama#2279

Merged

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

cuda : fix compile error in jetson platform (ggerganov#4975)

470207f

* cuda: fix compile error in jetson platform
* cuda: update comment in ggml-cuda.cu
* cuda: update ggml-cuda.cu comment
