# Pre-built wheels for llama-cpp-python
To install the package, copy the wheel file's URL from the Releases page and run:
```bash
pip install PASTE_THE_COPIED_URL_HERE
```

**Note**
If `import llama_cpp` fails with the error `Failed to load shared library: libgomp.so.1`, install the missing library:
```bash
sudo apt-get update && sudo apt-get install -y libgomp1
```
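After installing a wheel, a quick import check confirms that the shared libraries load (a minimal sketch; `__version__` is exported by the `llama_cpp` package):

```python
# If a shared library is missing, this import raises the error above
import llama_cpp

print(llama_cpp.__version__)  # e.g. 0.3.15
```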
## Linux x86_64 / Python 3.11 / CPU

```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cpu/llama_cpp_python-0.3.15-cp311-cp311-linux_x86_64.whl
```

## Linux x86_64 / Python 3.11 / CUDA 12.4
```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cu124/llama_cpp_python-0.3.15-cp311-cp311-linux_x86_64.whl
```

## Linux x86_64 / Python 3.12 / CPU (Google Colab)
```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cpu/llama_cpp_python-0.3.15-cp312-cp312-linux_x86_64.whl
```

## Linux x86_64 / Python 3.12 / CUDA 12.4 (Google Colab)
```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cu124/llama_cpp_python-0.3.15-cp312-cp312-linux_x86_64.whl
```

## Windows amd64 / Python 3.12 / CPU
```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cpu/llama_cpp_python-0.3.15-cp312-cp312-win_amd64.whl
```

## Windows amd64 / Python 3.12 / CUDA 12.8
```bash
pip install https://github.com/sergey21000/llama-cpp-python-wheels/releases/download/v0.3.15-cu128-win/llama_cpp_python-0.3.15-cp312-cp312-win_amd64.whl
```
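Once any of the wheels above is installed, a minimal generation example (a sketch; the GGUF model path is a placeholder you need to supply yourself):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="path/to/model.gguf",  # placeholder: any local GGUF model file
    n_gpu_layers=-1,  # offload all layers to the GPU; use 0 with CPU-only wheels
    verbose=False,
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```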
- Build the latest version of `llama-cpp-python` with CUDA support into the `wheel_dir` directory (Google Colab):
```bash
!CMAKE_ARGS="-DGGML_CUDA=on" pip wheel --no-deps --wheel-dir=wheel_dir llama-cpp-python
```

The build process takes about 30–40 minutes. Make sure a GPU is enabled in your Colab environment.
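Before starting the long build, it is worth checking that a GPU runtime is actually active (a sketch using `torch`, which comes preinstalled in Colab):

```python
import torch

# Fail fast rather than discover a CPU runtime after a 40-minute build
assert torch.cuda.is_available(), "Enable a GPU: Runtime -> Change runtime type"
print(torch.cuda.get_device_name(0))
```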
Once completed, the `.whl` file will be located in the `wheel_dir` directory. The wheel is compiled for the architecture of the current GPU; if you need support for other CUDA architectures, specify them explicitly:
```bash
!CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=75;80;86;89;90" pip wheel --no-deps --wheel-dir=wheel_dir llama-cpp-python
```
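The numbers in `CMAKE_CUDA_ARCHITECTURES` are CUDA compute capabilities (for example, 86 corresponds to capability 8.6, an RTX 30-series GPU). To look up the capability of the current GPU (a sketch using `torch`):

```python
import torch

# (8, 6) means compute capability 8.6 -> list it as 86 in CMAKE_CUDA_ARCHITECTURES
major, minor = torch.cuda.get_device_capability(0)
print(f"{major}{minor}")
```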
- (Optional) Save the `.whl` file to Google Drive for convenience (after mounting the drive):
```python
import shutil

src_wheel_file = 'wheel_dir/llama_cpp_python-0.3.14-cp311-cp311-linux_x86_64.whl'
trg_wheel_file = '/content/drive/MyDrive/llama_cpp_python-0.3.14-cp311-cp311-linux_x86_64.whl'
shutil.copyfile(src_wheel_file, trg_wheel_file)
```

- Installing from a saved wheel:
```bash
!pip install wheel_dir/llama_cpp_python-0.3.14-cp311-cp311-linux_x86_64.whl
```
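Since the exact wheel filename changes with every version, you can also locate the built wheel programmatically instead of hardcoding it (a minimal sketch using only the standard library):

```python
from pathlib import Path

# Grab whatever wheel the build produced, regardless of version and tags
wheel_file = next(Path("wheel_dir").glob("llama_cpp_python-*.whl"))
print(wheel_file)  # pass this path to pip install / shutil.copyfile
```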
## Build the latest version of llama-cpp-python with CUDA support into the wheel_dir directory (Windows PowerShell)

```powershell
$env:FORCE_CMAKE='1'; $env:CMAKE_ARGS='-DGGML_CUDA=on'
pip wheel --no-deps --no-cache-dir --wheel-dir=wheel_dir llama-cpp-python
```

If AVX or other instruction sets are not supported by your CPU, disable them explicitly:
```powershell
$env:FORCE_CMAKE='1'; $env:CMAKE_ARGS='-DGGML_CUDA=on -DLLAMA_AVX=off -DLLAMA_AVX2=off -DLLAMA_FMA=off'
pip wheel --no-deps --no-cache-dir --wheel-dir=wheel_dir llama-cpp-python
```

To build for other CUDA architectures:
```powershell
$env:FORCE_CMAKE='1'; $env:CMAKE_ARGS='-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES=75;80;86;89;90'
pip wheel --no-deps --no-cache-dir --wheel-dir=wheel_dir llama-cpp-python
```

Instead of `pip wheel`, you can use `pip install` to install the library right away.
**Note**

To install llama-cpp-python on Windows with CUDA support, you must first install Visual Studio 2022 Community and the CUDA Toolkit, as described in this or this instruction.
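After installing a CUDA build, you can check whether GPU offload was actually compiled in (a sketch; `llama_supports_gpu_offload` is exposed by the low-level `llama_cpp` bindings):

```python
import llama_cpp

# True if the wheel was built with a GPU backend such as GGML_CUDA=on
print(llama_cpp.llama_supports_gpu_offload())
```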
## Build the latest version of llama-cpp-python on Termux (Android, aarch64)

Taken from this comment:
```bash
pkg update && pkg upgrade
pkg install libexpat openssl python-pip python-cryptography cmake ninja autoconf automake libandroid-execinfo patchelf

# command to build the wheel
pip wheel --no-deps --no-cache-dir --wheel-dir=wheel_dir llama-cpp-python

# or command to install directly
pip install llama-cpp-python
```