llama-cpp-python compile script for windows (working cublas example for powershell). Updated script and wheel #182
I was able to install it under Windows, but got another type of error. Do you have any hints for this?
A small update of the script, because of the new breaking changes to old quantized models:

```powershell
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser
python -m venv venv
venv\Scripts\Activate.ps1
pip install scikit-build
python -m pip install -U pip wheel setuptools
git clone https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
cd vendor
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Remove the git checkout line below if you want the latest code with the new
# quantization format (not compatible with old GGML models).
git checkout b608b55a3ea8e4760c617418538465449175bdb8
cd ..\..
$Env:LLAMA_CUBLAS = "1"
$Env:FORCE_CMAKE = "1"
$Env:CMAKE_ARGS = "-DLLAMA_CUBLAS=on"
python setup.py bdist_wheel
Write-Host "Done! The llama_cpp folder with the cublas llama.dll is under ..\llama-cpp-python\_skbuild\win-amd64-3.10\cmake-install"
Write-Host "You can use this folder to replace your old folder. The wheel is under \llama-cpp-python\dist"
pause
# You need the following installed and working (PATH is the main problem):
# git, python (3.10.11), CUDA Toolkit (11.8),
# Visual Studio 2022 Community AND Build Tools 2019,
# cmake (tick the "add to PATH" option during installation and restart the computer)
```
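If the build succeeds, you can install the wheel straight from dist and sanity-check it before copying any folders around. A minimal sketch, run from the llama-cpp-python folder inside the same venv (the wheel filename depends on your Python and package versions, so it is picked up with a wildcard):

```powershell
# Install whatever wheel the build dropped into dist\ (filename varies by version).
$wheel = Get-ChildItem dist\*.whl | Select-Object -First 1
pip install "$($wheel.FullName)" --force-reinstall
# Quick import check; a failure here can mean a missing CUDA runtime DLL on PATH.
python -c "import llama_cpp; print('llama_cpp loaded from', llama_cpp.__file__)"
```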
If you cannot compile the CUDA build, I have made a wheel for the "old" GGMLs; for installation you only need the CUDA Toolkit 11.8 installed:

```powershell
git clone https://github.com/CapitalBeyond/win-cuda-llama-cpp-python.git
cd win-cuda-llama-cpp-python
pip install llama_cpp_python-0.1.49-cp310-cp310-win_amd64.whl --upgrade
```

Example CUDA 11.8 Oobabooga installation script without compiling:

```powershell
Set-ExecutionPolicy RemoteSigned -Scope CurrentUser
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui
python -m venv venv
venv\Scripts\Activate.ps1
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install https://github.com/jllllll/bitsandbytes-windows-webui/raw/main/bitsandbytes-0.38.1-py3-none-any.whl
pip install -r requirements.txt
mkdir repositories
cd repositories
git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
python -m pip install -r requirements.txt
cd ..
pip install https://github.com/jllllll/GPTQ-for-LLaMa-Wheels/raw/main/quant_cuda-0.0.0-cp310-cp310-win_amd64.whl
cd ..
pip install einops
pip install -r extensions\superbooga\requirements.txt --upgrade
pip install -r extensions\api\requirements.txt --upgrade
pip install -r extensions\elevenlabs_tts\requirements.txt --upgrade
pip install -r extensions\google_translate\requirements.txt --upgrade
pip install -r extensions\silero_tts\requirements.txt --upgrade
pip install -r extensions\whisper_stt\requirements.txt --upgrade
pip install -r extensions\openai\requirements.txt --upgrade
# Installing the llama-cpp-python cuda wheel via a direct link gave me an error,
# so clone the repo and install from the local file instead:
git clone https://github.com/CapitalBeyond/win-cuda-llama-cpp-python/
cd win-cuda-llama-cpp-python
pip install llama_cpp_python-0.1.49-cp310-cp310-win_amd64.whl --upgrade
cd ..
copy venv\Scripts\activate.bat .
copy activate.bat start.bat
Add-Content start.bat "python server.py --chat"
Write-Host "Done! Run start.bat in the text-generation-webui folder"
pause
```
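Before starting the web UI, it may be worth confirming that the cu118 torch wheels actually see the GPU. A quick check from inside the venv (a sketch, not part of the original script):

```powershell
# Should print "CUDA available: True" if the CUDA 11.8 torch wheels installed correctly.
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
```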
Can you tell me why, when replacing the DLL or folder, I get an error? Traceback (most recent call last):
I have the same issue.
I use it in a virtual environment (venv) under oobabooga with Python 3.10. It is always best to set up a virtual environment to work with Python: it isolates different installations, which causes fewer errors and makes debugging easier. I use python venv (see the example scripts), but there are also others like conda. I personally would first do a normal (without cublas) reinstall in a venv. If everything works, I would then rename the existing llama_cpp folder to something like llama_cpp.old and copy the complete new cublas folder in; this way you always have a backup (see the sketch below). You should always replace the complete folder, otherwise you can end up mixing versions. I hope these general hints help a bit.
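The rename-and-copy step described above might look like this in PowerShell (a sketch; the site-packages and build paths are assumptions and depend on your venv location and Python version):

```powershell
# Assumed paths -- adjust to your venv and Python version.
$site = "venv\Lib\site-packages"
# Keep the old folder as a backup instead of overwriting it.
Rename-Item "$site\llama_cpp" "llama_cpp.old"
# Copy the complete cublas build in -- never single files -- to avoid mixing versions.
Copy-Item "..\llama-cpp-python\_skbuild\win-amd64-3.10\cmake-install\llama_cpp" $site -Recurse
```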
For some reason, in a new environment with Python 3.10.11, I get the same error when trying to import. By the way, with plain llama.cpp I can build the cuBLAS version without any problems.
Can you compile a fresh build and try it again with the latest version?
Now, after replacing the DLL with the one built using your script, cuBLAS finally works for me (on the latest llama-cpp-python version). Cool!
When I try to install using this script, I get an error. I've also tried installing with pip directly, but that gives me an error which I think is caused by this issue: #208
I have created PR #225, which fixes this issue on Windows. Details are in the PR comments.
For the installation and the solution that produced the result, see user jllllll's post: Problem to install llama-cpp-python on Windows 10 with GPU NVidia Support CUBlast, BLAS = 0 #721
I did everything that was mentioned about this and still can't run with cuBLAS:
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3
Did you find any solution? I am facing the same issue.
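One way to check those flags without the full web UI is to load any model directly: llama.cpp prints its system info line (including BLAS = 0 or 1) when the model initializes. A minimal sketch, where model.bin is a placeholder for a local GGML file:

```powershell
# "model.bin" is a placeholder for any local GGML model file.
# Watch the startup output for "BLAS = 1"; if it still shows "BLAS = 0",
# the installed llama.dll is not the cuBLAS build.
python -c "from llama_cpp import Llama; Llama(model_path='model.bin')"
```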
It works for me...

You need the following installed and working (PATH is the main problem):
git
python (I use 3.10.11)
CUDA Toolkit (I use 11.8)
Visual Studio 2022 Community AND Build Tools 2019 (I have both installed)
cmake (tick the "add to PATH" option during installation and restart the computer)

Copy the script and save it as yourname.ps1 in an empty folder.
Right-click it and run it with PowerShell (see the launch example below).
I use the newly compiled folder to replace the installed llama_cpp folder in oobabooga.
Processing the first prompt is faster with cublas, especially for larger prompts (512 tokens?). Generation speed is the same.
At the moment the overall cublas speed increase is small for me; the main advantage is rather that you can compile it yourself on Windows.
If it works for more people, feel free to use the script for the readme.
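If right-clicking the .ps1 file is blocked by the execution policy, the script can also be launched from a PowerShell prompt, for example:

```powershell
# Bypass the execution policy for this single invocation only.
powershell -ExecutionPolicy Bypass -File .\yourname.ps1
```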