Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cuda: fix compile error in jetson platform #4975

Merged
merged 3 commits into from
Jan 20, 2024

Conversation

KyL0N
Copy link
Contributor

@KyL0N KyL0N commented Jan 16, 2024

this pr fix compile error in jetson platform
related issue: #4922

the error was introduced by #4766

#4766 add #include "ggml-cuda.h"
before include #include <cuda_fp16.h>

so the solution is move #include "ggml-cuda.h" after #include <cuda_fp16.h>

Copy link
Collaborator

@cebtenzzre cebtenzzre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like ggml.h should #include <cuda_fp16.h>, no? #includes shouldn't be sensitive to the order they are in.

@KyL0N
Copy link
Contributor Author

KyL0N commented Jan 17, 2024

It sounds like ggml.h should #include <cuda_fp16.h>, no? #includes shouldn't be sensitive to the order they are in.

I prefer #include <cuda_fp16.h> only in .cu file.

u can check ggml-cuda.cu changes in #4766

@JohannesGaessler
Copy link
Collaborator

Unrelated to what we prefer, if the header were to #include <cuda_fp16.h>, wouldn't that potentially cause other compilation issues? Because right now you should be able to compile a program against the interface specified in ggml-cuda.h even without a CUDA toolkit installed but I don't think that would be possible the header were to already include CUDA headers.

@KyL0N
Copy link
Contributor Author

KyL0N commented Jan 17, 2024

that's ok
because cuda headers should be included before include ggml-cuda.h

@KyL0N
Copy link
Contributor Author

KyL0N commented Jan 17, 2024

@slaren Hello, could you please check if these changes are okay, or if anything else is needed?

@ggerganov
Copy link
Owner

It sounds like ggml.h should #include <cuda_fp16.h>, no?

I want to narrow down the GPU-releated stuff in the core ggml.h and ggml.c. Adding a CUDA header there would be a step backwards in this regard.

Does Jetson platform not have __fp16?
Does this patch work:

diff --git a/ggml.h b/ggml.h
index 4c2ff6c6..9d165a86 100644
--- a/ggml.h
+++ b/ggml.h
@@ -305,9 +305,7 @@
 extern "C" {
 #endif
 
-#if defined(__ARM_NEON) && defined(__CUDACC__)
-    typedef half ggml_fp16_t;
-#elif defined(__ARM_NEON) && !defined(_MSC_VER)
+#if defined(__ARM_NEON) && !defined(_MSC_VER)
     typedef __fp16 ggml_fp16_t;
 #else
     typedef uint16_t ggml_fp16_t;

I'm not sure when the #if defined(__ARM_NEON) && defined(__CUDACC__) branch is actually true?

@KyL0N
Copy link
Contributor Author

KyL0N commented Jan 17, 2024

Does Jetson platform not have __fp16?

this patch not work, nvcc dose not recognise __fp16
the alternative choice is half as I mentioned in #1455 (comment)

nvcc -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -I/usr/local/cuda/targets/aarch64-linux/include -std=c++11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wmissing-declarations -Wmissing-noreturn -pthread -mcpu=native -use_fast_math --forward-unknown-to-host-compiler -arch=sm_87 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -Wno-pedantic -Xcompiler "-Wno-array-bounds -Wno-format-truncation -Wextra-semi" -c ggml-cuda.cu -o ggml-cuda.o
ggml.h(319): error: identifier "__fp16" is undefined

and #if defined(__ARM_NEON) && defined(__CUDACC__) which means cuda with aarch architecture which jetson platform is this also

if just fallback to typedef uint16_t ggml_fp16_t; it will output gibberish as #1455 mentioned.

Copy link
Collaborator

@slaren slaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with this fix, but it would be good to add a comment explaining why these headers must be included last to prevent this from happening again.

Copy link
Collaborator

@cebtenzzre cebtenzzre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are two approvals I'll avoid blocking this, but my two cents: If we're going to have a __CUDACC__-specific typedef, why not have a __CUDACC__-specific #include in the same file? It just seems wrong to me that we would intentionally reference a symbol without including the header that declares it - ggml.h or otherwise.

It would be one thing if this was a #define, but the compiler error is on the typedef itself in this case.

@ggerganov
Copy link
Owner

@KyL0N Please add a comment about why the headers are there and we can merge

@KyL0N
Copy link
Contributor Author

KyL0N commented Jan 20, 2024

why not have a __CUDACC__-specific #include in the same file

In my view, if __CUDACC__ is defined, then nvcc must be the compiler. As a result, headers such as cuda_fp16.h or other CUDA-related files shoule be included in before ggml.h

@ggerganov
Copy link
Owner

Thanks for the discussion - IMO the fundamental issue is that ggml_fp16_t is exposed through the public ggml API in the first place. It's something to fix in the future, but for now will merge this workaround

@ggerganov ggerganov merged commit cca894f into ggerganov:master Jan 20, 2024
35 of 43 checks passed
@KyL0N KyL0N deleted the compile-patch branch January 21, 2024 15:54
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Feb 3, 2024
* cuda: fix compile error in jetson platform

* cuda: update comment in ggml-cuda.cu

* cuda: update ggml-cuda.cu comment
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* cuda: fix compile error in jetson platform

* cuda: update comment in ggml-cuda.cu

* cuda: update ggml-cuda.cu comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants