ERROR: Failed building wheel for llama-cpp-python

Can someone please tell me what's going on here?

- I installed Visual Studio with C++.
- I installed the CUDA Toolkit from NVIDIA.
- I installed CMake from Visual Studio.

I get the following error:
```
(base) C:\Users\J>set CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"

(base) C:\Users\J>set CMAKE_ARGS="-DGGML_CUDA=on"

(base) C:\Users\J>pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.83.tar.gz (49.4 MB)
     ---------------------------------------- 49.4/49.4 MB 54.7 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
  Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
  Downloading numpy-2.0.1-cp312-cp312-win_amd64.whl.metadata (60 kB)
     ---------------------------------------- 60.9/60.9 kB ? eta 0:00:00
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
  Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl.metadata (3.1 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
   ---------------------------------------- 45.5/45.5 kB ? eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
   ---------------------------------------- 133.3/133.3 kB ? eta 0:00:00
Downloading numpy-2.0.1-cp312-cp312-win_amd64.whl (16.3 MB)
   ---------------------------------------- 16.3/16.3 MB 46.9 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl (17 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [109 lines of output]
      *** scikit-build-core 0.9.8 using CMake 3.30.1 (wheel)
      *** Configuring CMake...
      2024-07-28 19:09:24,577 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
      loading initial cache file C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeInit.txt
      -- Building for: Visual Studio 17 2022
      -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19044.
      -- The C compiler identification is MSVC 19.40.33813.0
      -- The CXX compiler identification is MSVC 19.40.33813.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.43.0.windows.1")
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - not found
      -- Found Threads: TRUE
      -- Found OpenMP_C: -openmp (found version "2.0")
      -- Found OpenMP_CXX: -openmp (found version "2.0")
      -- Found OpenMP: TRUE (found version "2.0")
      -- OpenMP found
      -- Using llamafile
      -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/include (found version "12.2.91")
      -- CUDA found
      -- Using CUDA architectures: 52;61;70;75
      -- The CUDA compiler identification is NVIDIA 12.2.91
      -- Detecting CUDA compiler ABI info
      -- Detecting CUDA compiler ABI info - failed
      -- Check for working CUDA compiler: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe
      -- Check for working CUDA compiler: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe - broken
      CMake Error at C:/Users/J/AppData/Local/Temp/pip-build-env-pgvyur3w/normal/Lib/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeTestCUDACompiler.cmake:59 (message):
        The CUDA compiler

          "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe"

        is not able to compile a simple test program.

        It fails with the following output:

          Change Dir: 'C:/Users/J/AppData/Local/Temp/tmpduqojb02/build/CMakeFiles/CMakeScratch/TryCompile-1vdtef'

          Run Build Command(s): "C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe" cmTC_001f7.vcxproj /p:Configuration=Debug /p:Platform=x64 /p:VisualStudioVersion=17.0 /v:n
          MSBuild version 17.10.4+10fbfbf2e for .NET Framework
          Build started 7/28/2024 19:09:35.

          Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" on node 1 (default targets).
          PrepareForBuild:
            Creating directory "cmTC_001f7.dir\Debug\".
          C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppBuild.targets(541,5): warning MSB8029: The Intermediate directory or Output directory cannot reside under the Temporary directory as it could lead to issues with incremental build. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            Structured output is enabled. The formatting of compiler diagnostics will reflect the error hierarchy. See https://aka.ms/cpp/structured-output for more details.
            Creating directory "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\Debug\".
            Creating directory "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\".
          InitializeBuildStatus:
            Creating "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
            Touching "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\unsuccessfulbuild".
          AddCudaCompileDeps:
            C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64\cl.exe /E /nologo /showIncludes /TP /D__CUDACC__ /D__CUDACC_VER_MAJOR__=12 /D__CUDACC_VER_MINOR__=2 /D_WINDOWS /DCMAKE_INTDIR="Debug" /D_MBCS /DCMAKE_INTDIR="Debug" /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin" /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include" /I. /FIcuda_runtime.h /c C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu
          Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (1) is building "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (1:2) on node 1 (CudaBuildCore target(s)).
          CudaBuildCore:
            Compiling CUDA source file main.cu...
            cmd.exe /C "C:\Users\J\AppData\Local\Temp\tmp0225af6b1a3d4dd69ed296c0b0e89efa.cmd"
            "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"

            (base) C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"
          C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include\crt/host_config.h(157): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            main.cu
          C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.2.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"" exited with code 2. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
          Done Building Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (CudaBuildCore target(s)) -- FAILED.
          Done Building Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default targets) -- FAILED.

          Build FAILED.

          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default target) (1) ->
          (PrepareForBuild target) ->
            C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppBuild.targets(541,5): warning MSB8029: The Intermediate directory or Output directory cannot reside under the Temporary directory as it could lead to issues with incremental build. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]


          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default target) (1) ->
          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (CudaBuildCore target) (1:2) ->
          (CudaBuildCore target) ->
            C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include\crt/host_config.h(157): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.2.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"" exited with code 2. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]

              1 Warning(s)
              2 Error(s)

          Time Elapsed 00:00:00.58





        CMake will not be able to correctly generate this project.
      Call Stack (most recent call first):
        vendor/llama.cpp/ggml/src/CMakeLists.txt:271 (enable_language)


      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
```
I think this part could be the problem:
```
The CUDA compiler

          "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe"

        is not able to compile a simple test program.

        It fails with the following output:
```

Please be gentle with me, as I'm a newcomer trying to figure this out.
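To narrow down which part of a long build log actually matters, one option is to pull out the compiler and MSBuild diagnostics instead of reading the CMake summary. The sketch below is only an illustration: the helper name `first_diagnostics`, the regular expression, and the shortened `sample` text are assumptions made for this example and are not part of pip, CMake, or llama-cpp-python. It assumes the log has been saved as plain text and that diagnostics follow the `error CODE: message` shape seen in the output above.

```python
import re

# Matches MSVC/MSBuild-style diagnostics such as
# "fatal error C1189: ..." or "error MSB3721: ...".
DIAGNOSTIC = re.compile(r"\b(?:fatal )?error\s+([A-Z]+\d+):\s*(.*)")


def first_diagnostics(log_text, limit=3):
    """Return up to `limit` unique (code, message) pairs in order of appearance."""
    seen = set()
    found = []
    for line in log_text.splitlines():
        match = DIAGNOSTIC.search(line)
        if match is None:
            continue
        code, message = match.group(1), match.group(2).strip()
        if code in seen:
            continue  # the same diagnostic is repeated in the build summary
        seen.add(code)
        found.append((code, message))
        if len(found) == limit:
            break
    return found


# Shortened, hypothetical excerpt modelled on the log above.
sample = """\
-- Check for working CUDA compiler: nvcc.exe - broken
host_config.h(157): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version!
CUDA 12.2.targets(799,9): error MSB3721: The command "nvcc.exe" exited with code 2.
"""

for code, message in first_diagnostics(sample):
    print(code, "->", message)
```

Run against the full log, a filter like this would surface the `C1189` line from `host_config.h` ahead of the generic "is not able to compile a simple test program" message, which suggests the earliest diagnostic is the one worth reading first.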

ERROR: Failed building wheel for llama-cpp-python #1629
