Not able to install locally #1788
Comments
You need to re-install vllm and flash-attention-v2: rm -rf flash-attention-v2, then rebuild. They forgot to add this to the release notes about local installs.
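A minimal sketch of the clean rebuild, assuming you run it from the repo's server/ directory and that the Makefile targets are named as below (verify against Makefile-vllm and the flash-attention Makefile in your checkout):

```shell
# Clean rebuild sketch: remove stale build trees, then reinstall.
cd server
rm -rf vllm flash-attention-v2
make install-vllm-cuda                 # target per Makefile-vllm
make install-flash-attention-v2-cuda   # assumed target name; verify locally
```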
I have been installing all of the extensions via those commands for 2 days now.
I feel you, I did exactly the same: installed and deleted about 4 times.
You can follow the steps in the Dockerfile. After compiling flash-attn with 'make install-flash..', the script moves the compiled files to Python's site-packages folder, like this:
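(an illustrative sketch, not the Dockerfile verbatim; the build path and .so name depend on your Python version and platform):

```shell
# Copy the compiled flash-attn extension into the active interpreter's
# site-packages, as the Dockerfile effectively does. Paths are examples.
SITE_PACKAGES=$(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
cp flash-attention-v2/build/lib.*/flash_attn_2_cuda*.so "$SITE_PACKAGES/"
```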
I have resolved the issues using the following set of install scripts. Usually, if you have the required versions of cmake, libkineto, protobuf, and rust installed, you can directly run install-tgi.sh; use the other scripts in the directory as required. For other system and driver details see https://github.com/nyunAI/Faster-LLM-Survey/blob/A100TGIv2.0.1/experiment_details.txt

P.S. The maintainer can close this; leaving it open for anyone facing a similar issue.
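For reference, a sketch of how to fetch and run them (repo and branch from the link above; the script's exact location inside the repo is an assumption):

```shell
# Assumes install-tgi.sh sits at the root of this branch; adjust if not.
git clone -b A100TGIv2.0.1 https://github.com/nyunAI/Faster-LLM-Survey
cd Faster-LLM-Survey
bash install-tgi.sh
```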
When installing vllm for TGI-2.0.1, I came across:
error: triton 2.3.0 is installed but triton==2.1.0 is required by {'torch'}
make: *** [Makefile-vllm:12: install-vllm-cuda] Error 1
Is this because I used the wrong vllm version? I didn't modify anything in the Makefile-* scripts.
Your PyTorch version might be different. I faced this issue for the same reason: my PyTorch version was higher than torch==2.1.0, and hence the default triton that was installed was 2.2.0 (afair). Install torch==2.1.0 or use install-tgi.sh.
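A minimal sketch of the pin (versions taken from this thread):

```shell
# Pin torch to what vllm's build expects; on Linux, torch==2.1.0 pulls in
# a matching triton. The explicit triton pin guards against leftovers.
pip install "torch==2.1.0"
pip install "triton==2.1.0"
```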
Build and install rotary and layer_norm from https://github.com/Dao-AILab/flash-attention/tree/main/csrc.
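A sketch of one way to do that, assuming each subdirectory in that csrc tree ships its own pip-installable setup:

```shell
# Build the rotary and layer_norm CUDA extensions from flash-attention's
# csrc tree; requires nvcc and a compatible torch already installed.
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention/csrc/rotary && pip install .
cd ../layer_norm && pip install .
```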
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
System Info
Information
Tasks
Reproduction
I have a local model quantised with AutoAWQ; I even tried TheBloke's AWQ quant of Llama 2 7B from HF directly.
use the command:
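(a hypothetical reconstruction, since the exact command isn't shown here; the model id is a placeholder and the flags follow text-generation-launcher's documented options):

```shell
# Hypothetical launch command; substitute your local path or HF model id.
text-generation-launcher \
  --model-id TheBloke/Llama-2-7B-AWQ \
  --quantize awq \
  --port 8080
```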
Expected behavior
The server should start.
I have all the packages installed using the make commands mentioned in the local-install instructions.