Add multi-LoRA support for more architectures #2602

Open
Yard1 opened this issue Jan 25, 2024 · 6 comments

@Yard1 (Collaborator) commented Jan 25, 2024

Currently, multi-LoRA supports only Llama and Mistral architectures. We should extend this functionality to all architectures.

Yi, Qwen, Phi, and Mixtral seem to be the architectures most in demand right now.

One challenge will be ensuring that all allowed weight shapes are supported by punica kernels. We may need to investigate some sort of padding there.

Originally posted by @Yard1 in #1804 (comment)
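
On the kernel-shape point: zero rows and columns contribute nothing to the LoRA matmuls, so one option is to zero-pad adapter weights up to the nearest shape the kernels were compiled for. A minimal sketch of that idea; the `SUPPORTED_PUNICA_DIMS` values below are placeholders for illustration, not the kernels' actual supported list:

```python
import torch

# Hypothetical set of hidden sizes the punica kernels are compiled for.
# Placeholder values for illustration; the real list is defined by the
# kernel templates.
SUPPORTED_PUNICA_DIMS = (2048, 2560, 4096, 5120, 6656, 8192)

def pad_to_supported_dim(w: torch.Tensor, dim: int) -> torch.Tensor:
    """Zero-pad a LoRA weight along `dim` up to the next supported size.

    Zero padding is mathematically safe for LoRA: the padded entries either
    multiply against zeros or produce extra outputs that are sliced off, so
    the adapter's contribution for the real hidden size is unchanged.
    """
    size = w.shape[dim]
    target = next((d for d in SUPPORTED_PUNICA_DIMS if d >= size), None)
    if target is None:
        raise ValueError(f"size {size} exceeds every supported kernel shape")
    if target == size:
        return w
    shape = list(w.shape)
    shape[dim] = target
    padded = w.new_zeros(shape)
    padded.narrow(dim, 0, size).copy_(w)
    return padded

# Example: a lora_A of shape (rank, hidden_size) for a model whose hidden
# size is not in the supported list gets padded up to the next one.
lora_a = torch.randn(16, 2720)
print(pad_to_supported_dim(lora_a, dim=1).shape)  # torch.Size([16, 4096])
```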

@TaeWoo21 commented

Is it possible to add "GPT-NeoX" as well?

@Cloopen-ReLiNK commented

How about "chatglm"?

@FurtherAI (Contributor) commented

@Yard1 I'm interested in extending this to other architectures. Do you want to meet and talk about the problems that need to be solved to get it working?
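
For anyone picking one of these up: judging by how LlamaForCausalLM wires up LoRA, the per-model changes are largely declarative. A rough sketch, with attribute names copied from the Llama implementation at the time of writing (they may have changed since):

```python
import torch.nn as nn

# Sketch of the class-level LoRA metadata a model declares, modeled on
# LlamaForCausalLM; attribute names may differ across vLLM versions.
class NewModelForCausalLM(nn.Module):
    # Fused layers and the checkpoint weights packed into them, so that
    # separately trained LoRA deltas can be stacked into the fused layer.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    # Submodules that are allowed to carry LoRA adapters.
    supported_lora_modules = [
        "qkv_proj", "o_proj", "gate_up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ]
    # Embedding-like modules that need dedicated handling.
    embedding_modules = {
        "embed_tokens": "input_embeddings",
        "lm_head": "output_embeddings",
    }
    # Modules whose vocab dimension must be padded for the LoRA kernels.
    embedding_padding_modules = ["lm_head"]
```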

@nightflight-dk commented

+1, bump for Phi-3.0 (any Phi right now)

@jjjjohnson commented

+1 for Qwen

@jiauy commented Jun 14, 2024

+1 for Qwen
