Failed to build CUDA #19
You can ignore the warnings; I plan to remove those soon. As for the CUDA header error, I've tested the CUDA 12.3 SDK on Windows 10 and didn't encounter this issue. I don't currently have access to a Windows 11 system to try reproducing it there. Contributions are welcome. You may also want to try building llama.cpp on your machine, and if the error still happens, file an issue with the upstream project.
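In case it helps, the upstream CUDA build on Windows is roughly the standard cmake flow. The CUDA option name has changed between llama.cpp versions, so treat this as a sketch rather than the exact command:

rem Sketch of an upstream llama.cpp CUDA build on Windows (flag name may vary by llama.cpp version)
cmake -B build -DLLAMA_CUBLAS=ON
cmake --build build --config Release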
I got the same error with both CUDA 12.3 and 12.1, but when I build the upstream llama.cpp with cmake it works fine. My specs:
I ran [Guru3D.com]-Display_Driver_uninstaller, erased everything on my system related to NVIDIA, then did a clean install of CUDA 12.3; no difference. I ran nvcc with --verbose to get a better idea of what was going on. Here is the log: nvcc --verbose ggml-cuda.cu
The upstream build log:
You can see the full nvcc command line excerpted below from the upstream build's log, and I'm sure it is the lack of some of those parameters that causes the issues with llamafile's invocation:

D:\llama-model-data\llama.cpp\llama.cpp-master\build>"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"D:\llama-model-data\llama.cpp\llama.cpp-master." -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir x64\Release -use_fast_math -maxrregcount=0 --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler="/EHsc -Ob2" -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdggml.dir\Release\ggml.pdb" -o ggml.dir\Release\ggml-cuda.obj "D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-cuda.cu"
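Pulling out just the pieces of that command that control which host compiler gets used (this is only my reading of the log, so the exact flags that matter may differ), a stripped-down version would look roughly like this; it omits the -I include paths and -D defines from the full command:

rem Sketch: pin the 64-bit MSVC host compiler explicitly instead of relying on PATH
"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env ^
  -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" ^
  --machine 64 -x cu --compile -o ggml-cuda.obj ggml-cuda.cu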
@CeruleanSky I'm using the same version of CUDA as you, on Windows, with a slightly different MSVC revision, and things work fine for me. The nvcc command run by llama.cpp's cmake build config is very different from what the llama.cpp makefile config uses, and llamafile is based on the makefile config. The nvcc command that llamafile runs on my machine is:
I won't make changes to the build config unless I understand why they need to be made. Could you please troubleshoot further and tell me specifically how the above command needs to be changed so that it'll work on your machine? Thanks!
I think the README should be changed. The clue came from the error below, which I hit when I started adding parameters:
This led me to finding out that the problem was where nvcc was being run from. I believe the reason the cmake build works is that it modifies the PATH variable and then manually picks out the correct compiler.
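To see whether PATH ordering really is the culprit, something along these lines should show which cl.exe a bare nvcc run would pick up, and whether forcing the 64-bit one to the front changes the outcome. The MSVC path is taken from the upstream log above and will differ per machine, so this is only a sketch:

rem Show every cl.exe on PATH, in the order nvcc's host-compiler lookup would find them
where cl
rem Put the 64-bit MSVC host compiler first (path from the upstream log above), then retry
set "PATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;%PATH%"
nvcc --verbose ggml-cuda.cu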
Would it have made your life easier if I had something like:

#ifdef __i386__
#error "you need to use a 64-bit compiler for llamafile"
#endif
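One small caveat if a check like that gets added: MSVC doesn't define __i386__ (it uses _M_IX86 for 32-bit x86), so a guard meant to also catch a 32-bit cl.exe host compile would probably need something more like this sketch:

// Guard against 32-bit host compilers: __i386__ covers GCC/Clang, _M_IX86 covers MSVC
#if defined(__i386__) || defined(_M_IX86)
#error "you need to use a 64-bit compiler for llamafile"
#endif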
@savant117 I have the same question. Have you found a solution?
@savant117 I know the solution: run
I'm getting a bunch of the following error when it's compiling the CUDA kernel on Windows 11:
Any ideas?
Edit: Not sure if it matters, but I'm also seeing some warnings before that: