Add multi-LoRA support for more architectures #2602
Currently, multi-LoRA supports only the Llama and Mistral architectures. We should extend this functionality to all architectures.
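For background, this is roughly what a multi-LoRA request looks like in vLLM at the time of this issue: adapters are enabled when the engine is built, and each request can name a different adapter. The model name, adapter name, and adapter path below are placeholders, and the exact API may differ between releases.

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Multi-LoRA is switched on when the engine is built; at the time of
# this issue only Llama- and Mistral-family models accepted it.
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True)

# Each request may carry a different adapter; the scheduler batches
# requests for different adapters into the same forward pass.
# "sql-lora" and the path are placeholders for a locally saved adapter.
outputs = llm.generate(
    ["Write a query listing all users created this week."],
    SamplingParams(temperature=0.0, max_tokens=64),
    lora_request=LoRARequest("sql-lora", 1, "/path/to/sql-lora"),
)
print(outputs[0].outputs[0].text)
```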
Yi, Qwen, Phi, and Mixtral seem to be the most in demand right now.

One challenge will be ensuring that all allowed weight shapes are supported by the Punica kernels; we may need to investigate some sort of padding there.

Originally posted by @Yard1 in #1804 (comment)
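As a minimal sketch of the padding idea, assuming the kernels are compiled for a fixed list of supported hidden sizes: a LoRA weight whose output dimension is not in that list could be zero-padded up to the nearest supported size, since zero rows contribute nothing to the matmul and the padded region can be sliced off afterwards. The size list and helper below are hypothetical illustrations, not vLLM code.

```python
import torch
import torch.nn.functional as F

# Placeholder list: the real set of shapes the Punica BGMV kernels are
# compiled for lives in the kernel sources, not here.
SUPPORTED_HIDDEN_SIZES = [2048, 2560, 4096, 5120, 8192]

def pad_lora_b(lora_b: torch.Tensor) -> torch.Tensor:
    """Zero-pad the output dim of a LoRA B matrix [out_features, rank]
    up to the nearest supported size. Zero rows add nothing to the
    matmul result, so the padded region can be sliced off afterwards."""
    out_features = lora_b.shape[0]
    target = next((s for s in SUPPORTED_HIDDEN_SIZES if s >= out_features), None)
    if target is None:
        raise ValueError(f"no supported size >= {out_features}")
    # F.pad pads trailing dims first: (0, 0) leaves the rank dim alone,
    # (0, target - out_features) appends zero rows.
    return F.pad(lora_b, (0, 0, 0, target - out_features))
```

For example, `pad_lora_b(torch.randn(2752, 16))` returns a `[4096, 16]` tensor, padding up to the nearest size in the placeholder list.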
Comments

Is it possible to add GPT-NeoX as well?

How about ChatGLM?

@Yard1 I'm interested in extending this to other architectures; do you want to meet and talk about the problems that need to be solved to get it working?

+1, bump for Phi-3 (any Phi right now)

+1 for Qwen