fix: Assign dtype of lora to base model dtype #82

tgaddair · 2023-11-29T20:22:09Z

Fixes #74.

When the model is quantized, the dtype of the (compressed) weights is typically uint or similar. This means that if we cast the LoRA weights to the dtype of the parent layer's weights, it will corrupt them, rendering the output garbage. Instead, we should assign the dtype of the base model (float16 or bfloat16) to the LoRA weights, which is consistent with how QLoRA inference is done outside LoRAX (quantized weights + unquantized LoRA).
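To illustrate the failure mode, here is a minimal, standard-library-only sketch. It is not the LoRAX code: `cast` is a hypothetical toy stand-in for a tensor dtype cast, the dtype names are plain strings, and the sample values are made up. It only models the point that an unsigned-integer cast discards small fractional values, while a half-precision cast keeps them.

```python
import struct


def cast(values, dtype):
    """Hypothetical toy stand-in for a tensor cast (not a real library API).

    Only "uint8" and "float16" are modelled.
    """
    if dtype == "uint8":
        # Integer cast: truncate toward zero, then wrap into the 0..255 range.
        return [int(v) % 256 for v in values]
    if dtype == "float16":
        # Round-trip each value through IEEE 754 half precision.
        return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]
    raise ValueError(f"unsupported dtype: {dtype}")


# Made-up LoRA weights: small values, chosen to be exactly representable in float16.
lora_weights = [0.5, -0.25, 0.125]

layer_weight_dtype = "uint8"   # assumed storage dtype of the quantized parent layer
base_model_dtype = "float16"   # assumed dtype the base model computes in

# Before the fix: cast to the parent layer's (quantized) weight dtype.
corrupted = cast(lora_weights, layer_weight_dtype)

# After the fix: cast to the base model dtype instead.
preserved = cast(lora_weights, base_model_dtype)

print(corrupted)
print(preserved)
```

In this toy model the uint8 cast collapses every adapter weight to zero, while the float16 cast leaves the values unchanged.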

geoffreyangus

wow, nice catch!

Assign dtype of lora to base model dtype

7e42686

tgaddair requested review from geoffreyangus and arnavgarg1 November 29, 2023 20:22

geoffreyangus approved these changes Nov 29, 2023

tgaddair merged commit f1b9778 into main Nov 29, 2023
1 check passed

tgaddair deleted the fix-quant branch November 29, 2023 20:26
