
Combine multi-LoRA and quantization #2601

Open
Yard1 opened this issue Jan 25, 2024 · 3 comments

@Yard1 (Collaborator) commented Jan 25, 2024

There is no fundamental reason why multi-LoRA cannot work with quantized models. We will most likely want to keep the LoRAs unquantized and dequantize the base model output before applying the LoRAs with Punica kernels. That seems to be the pattern in other projects too.

Originally posted by @Yard1 in #1804 (comment)
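A minimal sketch of the pattern described above, assuming the quantized base layer (e.g. AWQ/GPTQ) already returns a dequantized fp16/bf16 output and the LoRA weights stay unquantized. The function name, argument names, and the per-adapter loop are illustrative only; a real implementation would dispatch to batched Punica-style kernels instead of looping in Python:

```python
import torch

def lora_forward_on_quantized_base(
    base_layer,        # quantized linear; its forward returns a dequantized fp16/bf16 output
    x,                 # [num_tokens, in_features], fp16/bf16 activations
    lora_a_stacked,    # [num_loras, rank, in_features], unquantized LoRA A matrices
    lora_b_stacked,    # [num_loras, out_features, rank], unquantized LoRA B matrices
    lora_indices,      # [num_tokens], which LoRA each token uses (-1 = no LoRA)
    scaling: float,
):
    # 1) Run the base matmul through the quantized kernel; the output it returns
    #    is already in full precision, so the LoRA math below stays unquantized.
    y = base_layer(x)

    # 2) Apply the unquantized LoRA deltas on top of the dequantized base output:
    #    y += scaling * (x @ A^T) @ B^T, per adapter. A batched (Punica-style)
    #    kernel would do this for all adapters at once; the loop just shows the math.
    for i in range(lora_a_stacked.shape[0]):
        mask = lora_indices == i
        if mask.any():
            xi = x[mask]
            y[mask] += scaling * (xi @ lora_a_stacked[i].T) @ lora_b_stacked[i].T
    return y
```

The key point is that quantization only touches the base weight matmul; the LoRA A/B projections and the final addition happen in the activation dtype, so accuracy of the adapters is unaffected by the base model's quantization scheme.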

@jacob-hansen

Has there been any progress on this? Or has anyone tested multi-LoRA with different quantization methods to see what works?

@thincal commented Mar 23, 2024

@Yard1 Is there any plan to support this? We really depend on this wonderful feature and also need to know its real-world effect. Thanks.

@whyiug (Contributor) commented Apr 28, 2024

@thincal #4012
