Error while saving the model under 4bit lora #381

visionxyz · 2024-07-22T04:36:34Z

I used

--load_in_4bit
--lora_rank 8
--lora_alpha 16
--lora_dropout 0.05
--zero_stage 2

in the training script, and saving the model failed after training. The problem seems to come from load_in_4bit: when it is used,
this assertion (line 292) in deepspeed.py is triggered:

assert state_dict_keys.issubset(
    output_state_dict_keys
), f"mismatch keys {output_state_dict_keys.symmetric_difference(state_dict_keys)}"

I think the training part is okay, but somehow the keys change when LoRA and load_in_4bit are used together.

Thank you so much for providing this tool! It would be appreciated if you could give me some hints about this problem.

hijkzzz · 2024-07-22T07:35:10Z

That's strange. You could try commenting out this line, or debug it.
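For debugging, a minimal sketch of one possible starting point, assuming the two variables are sets of parameter names as the assertion suggests. The helper `diff_state_dict_keys` and the key names below are hypothetical and not part of the repository. The assertion message prints the symmetric difference, which mixes both directions of the mismatch; splitting it shows which keys actually make `issubset` fail.

```python
def diff_state_dict_keys(state_dict_keys, output_state_dict_keys):
    """Split a key mismatch into its two directions (hypothetical helper)."""
    state_dict_keys = set(state_dict_keys)
    output_state_dict_keys = set(output_state_dict_keys)
    # Keys in state_dict_keys that the output lacks: these make issubset() fail.
    missing = sorted(state_dict_keys - output_state_dict_keys)
    # Keys that appear only in the output: these do not affect issubset().
    extra = sorted(output_state_dict_keys - state_dict_keys)
    return missing, extra


# Hypothetical key names, for illustration only.
state_dict_keys = {"shared.weight", "only_in_state_dict.weight"}
output_state_dict_keys = {"shared.weight", "only_in_output.weight"}

missing, extra = diff_state_dict_keys(state_dict_keys, output_state_dict_keys)
print("missing from output:", missing)
print("only in output:", extra)
```

Printing these two lists just before the assertion would show whether the failing keys follow a pattern, which may be more informative than commenting the check out.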

visionxyz · 2024-07-22T07:36:59Z

Thank you! I have commented out this line for now. If I have time, I will debug it and report the cause.

visionxyz closed this as completed Jul 22, 2024
