ERROR: Failed building wheel for llama-cpp-python #1629

@kot197

Can someone please tell me what's going on here?

I installed Visual Studio with C++
I installed CUDA toolkit from NVIDIA
I installed cmake from Visual Studio

I get the following error:

(base) C:\Users\J>set CMAKE_ARGS="-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"

(base) C:\Users\J>set CMAKE_ARGS="-DGGML_CUDA=on"

(base) C:\Users\J>pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.2.83.tar.gz (49.4 MB)
     ---------------------------------------- 49.4/49.4 MB 54.7 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
  Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
  Downloading numpy-2.0.1-cp312-cp312-win_amd64.whl.metadata (60 kB)
     ---------------------------------------- 60.9/60.9 kB ? eta 0:00:00
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
  Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl.metadata (3.1 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
   ---------------------------------------- 45.5/45.5 kB ? eta 0:00:00
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
   ---------------------------------------- 133.3/133.3 kB ? eta 0:00:00
Downloading numpy-2.0.1-cp312-cp312-win_amd64.whl (16.3 MB)
   ---------------------------------------- 16.3/16.3 MB 46.9 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-2.1.5-cp312-cp312-win_amd64.whl (17 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [109 lines of output]
      *** scikit-build-core 0.9.8 using CMake 3.30.1 (wheel)
      *** Configuring CMake...
      2024-07-28 19:09:24,577 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
      loading initial cache file C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeInit.txt
      -- Building for: Visual Studio 17 2022
      -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19044.
      -- The C compiler identification is MSVC 19.40.33813.0
      -- The CXX compiler identification is MSVC 19.40.33813.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.43.0.windows.1")
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - not found
      -- Found Threads: TRUE
      -- Found OpenMP_C: -openmp (found version "2.0")
      -- Found OpenMP_CXX: -openmp (found version "2.0")
      -- Found OpenMP: TRUE (found version "2.0")
      -- OpenMP found
      -- Using llamafile
      -- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/include (found version "12.2.91")
      -- CUDA found
      -- Using CUDA architectures: 52;61;70;75
      -- The CUDA compiler identification is NVIDIA 12.2.91
      -- Detecting CUDA compiler ABI info
      -- Detecting CUDA compiler ABI info - failed
      -- Check for working CUDA compiler: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe
      -- Check for working CUDA compiler: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe - broken
      CMake Error at C:/Users/J/AppData/Local/Temp/pip-build-env-pgvyur3w/normal/Lib/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeTestCUDACompiler.cmake:59 (message):
        The CUDA compiler

          "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe"

        is not able to compile a simple test program.

        It fails with the following output:

          Change Dir: 'C:/Users/J/AppData/Local/Temp/tmpduqojb02/build/CMakeFiles/CMakeScratch/TryCompile-1vdtef'

          Run Build Command(s): "C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe" cmTC_001f7.vcxproj /p:Configuration=Debug /p:Platform=x64 /p:VisualStudioVersion=17.0 /v:n
          MSBuild version 17.10.4+10fbfbf2e for .NET Framework
          Build started 7/28/2024 19:09:35.

          Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" on node 1 (default targets).
          PrepareForBuild:
            Creating directory "cmTC_001f7.dir\Debug\".
          C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppBuild.targets(541,5): warning MSB8029: The Intermediate directory or Output directory cannot reside under the Temporary directory as it could lead to issues with incremental build. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            Structured output is enabled. The formatting of compiler diagnostics will reflect the error hierarchy. See https://aka.ms/cpp/structured-output for more details.
            Creating directory "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\Debug\".
            Creating directory "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\".
          InitializeBuildStatus:
            Creating "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\unsuccessfulbuild" because "AlwaysCreate" was specified.
            Touching "cmTC_001f7.dir\Debug\cmTC_001f7.tlog\unsuccessfulbuild".
          AddCudaCompileDeps:
            C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64\cl.exe /E /nologo /showIncludes /TP /D__CUDACC__ /D__CUDACC_VER_MAJOR__=12 /D__CUDACC_VER_MINOR__=2 /D_WINDOWS /DCMAKE_INTDIR="Debug" /D_MBCS /DCMAKE_INTDIR="Debug" /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin" /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include" /I. /FIcuda_runtime.h /c C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu
          Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (1) is building "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (1:2) on node 1 (CudaBuildCore target(s)).
          CudaBuildCore:
            Compiling CUDA source file main.cu...
            cmd.exe /C "C:\Users\J\AppData\Local\Temp\tmp0225af6b1a3d4dd69ed296c0b0e89efa.cmd"
            "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"

            (base) C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"
          C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include\crt/host_config.h(157): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            main.cu
          C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.2.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"" exited with code 2. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
          Done Building Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (CudaBuildCore target(s)) -- FAILED.
          Done Building Project "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default targets) -- FAILED.

          Build FAILED.

          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default target) (1) ->
          (PrepareForBuild target) ->
            C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\Microsoft.CppBuild.targets(541,5): warning MSB8029: The Intermediate directory or Output directory cannot reside under the Temporary directory as it could lead to issues with incremental build. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]


          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (default target) (1) ->
          "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj" (CudaBuildCore target) (1:2) ->
          (CudaBuildCore target) ->
            C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include\crt/host_config.h(157): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio version! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]
            C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.2.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.40.33807\bin\HostX64\x64" -x cu    -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\include"     --keep-dir cmTC_001f7\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdcmTC_001f7.dir\Debug\vc143.pdb" -o cmTC_001f7.dir\Debug\main.obj "C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\main.cu"" exited with code 2. [C:\Users\J\AppData\Local\Temp\tmpduqojb02\build\CMakeFiles\CMakeScratch\TryCompile-1vdtef\cmTC_001f7.vcxproj]

              1 Warning(s)
              2 Error(s)

          Time Elapsed 00:00:00.58





        CMake will not be able to correctly generate this project.
      Call Stack (most recent call first):
        vendor/llama.cpp/ggml/src/CMakeLists.txt:271 (enable_language)


      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

I think this part could be the problem:

The CUDA compiler

          "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin/nvcc.exe"

        is not able to compile a simple test program.

        It fails with the following output:

Please be gentle with me, as I'm a newcomer trying to figure this out.
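
For anyone reading a log like this one: the CMake summary quoted above only says the test program failed, while the log itself carries a more specific line, the `fatal error C1189` from `host_config.h` reporting an unsupported Microsoft Visual Studio version. Below is a minimal sketch for pulling such compiler diagnostics out of a saved log so they are easier to spot. The helper name `find_diagnostics` and the `sample` text are hypothetical; the sample lines are abbreviated from the output above, and the pattern only assumes MSVC/MSBuild-style messages of the form `error CODE: message`.

```python
import re

# Matches MSVC/MSBuild-style diagnostics such as
# "fatal error C1189: ..." or "error MSB3721: ...".
# Warnings (e.g. "warning MSB8029") are deliberately not matched.
DIAGNOSTIC = re.compile(r"\b(fatal error|error)\s+([A-Z]+\d+):\s*(.*)")


def find_diagnostics(log_text):
    """Return (severity, code, message) tuples, one per distinct code,
    in order of first appearance in the log."""
    seen = set()
    results = []
    for line in log_text.splitlines():
        match = DIAGNOSTIC.search(line)
        if match is None:
            continue
        severity, code, message = match.groups()
        if code in seen:
            continue  # MSBuild repeats errors in its final summary
        seen.add(code)
        results.append((severity, code, message.strip()))
    return results


# Abbreviated lines in the style of the log above (not the full text).
sample = (
    "host_config.h(157): fatal error C1189: #error:  -- unsupported "
    "Microsoft Visual Studio version!\n"
    "CUDA 12.2.targets(799,9): error MSB3721: The command exited with code 2.\n"
    "host_config.h(157): fatal error C1189: #error:  -- unsupported "
    "Microsoft Visual Studio version!\n"
)

for severity, code, message in find_diagnostics(sample):
    print(f"{severity} {code}: {message}")
```

In practice the full pip output could be saved to a file and its contents passed to `find_diagnostics` in place of `sample`. The first diagnostic listed is usually the one worth reading first, since later errors such as `MSB3721` only report that the failing command exited with a non-zero code.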
