
[Bug]: a single LoRA request error makes all processing requests fail #4879

Open
jinzhen-lin opened this issue May 17, 2024 · 0 comments · May be fixed by #5173
Labels
bug Something isn't working

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

vLLM loads LoRA checkpoints during model execution:

https://github.com/vllm-project/vllm/blob/v0.4.2/vllm/worker/model_runner.py#L789-L790

https://github.com/vllm-project/vllm/blob/v0.4.2/vllm/lora/worker_manager.py#L138-L172

So when loading a LoRA checkpoint raises an error (e.g. the checkpoint's LoRA rank exceeds `max_lora_rank`), all in-flight requests fail, regardless of whether they use LoRA.
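For illustration, here is a minimal, self-contained sketch of the failure mode (not vLLM's actual code; `Request`, `load_lora`, and `MAX_LORA_RANK` are hypothetical stand-ins): because adapter loading happens lazily inside the per-step batch execution, one bad adapter raises out of the step and takes down every request in the batch, including requests that never asked for LoRA.

```python
# Simplified stand-in for the execute_model path described above.
MAX_LORA_RANK = 16  # plays the role of vLLM's max_lora_rank setting


class Request:
    def __init__(self, rid: str, lora_rank: int | None = None):
        self.rid = rid
        self.lora_rank = lora_rank  # None => request does not use LoRA


def load_lora(rank: int) -> None:
    # Mirrors the check that fails when a checkpoint's rank is too large.
    if rank > MAX_LORA_RANK:
        raise ValueError(f"LoRA rank {rank} > max_lora_rank {MAX_LORA_RANK}")


def execute_model(batch: list[Request]) -> dict[str, str]:
    # LoRA checkpoints are loaded inside the execution step for the whole
    # batch, so a single bad adapter aborts every request in the batch.
    for req in batch:
        if req.lora_rank is not None:
            load_lora(req.lora_rank)
    return {req.rid: "ok" for req in batch}


batch = [Request("plain"), Request("good-lora", 8), Request("bad-lora", 64)]
try:
    execute_model(batch)
except ValueError as e:
    # The exception propagates out of the step: all three requests fail,
    # including "plain", which does not use LoRA at all.
    print(f"whole batch failed: {e}")
```

A fix would presumably catch the adapter-load error before or during the step and fail only the offending request, which appears to be what the linked PR #5173 is intended to address.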
