
Export fine-tuned LoRA weights to the base model #71

Open
ziwang-com opened this issue Jun 5, 2023 · 0 comments

Comments

@ziwang-com
Owner

Lightning-AI/lit-llama#261
The reason we only save the LoRA weights is simply to save space, since the full checkpoints are very large. But yes, we could provide such a conversion for LoRA checkpoints; that is a good suggestion. The implementation would roughly be:

Load the pretrained weights and the LoRA weights from their checkpoints into the LoRA model (see generate_lora.py)
Call model.eval() to merge the LoRA weights back into the regular weights
state = model.state_dict()
Remove all entries in the dict that correspond to LoRA
torch.save(state, ...)
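
A minimal sketch of these steps, assuming the lit-llama helpers used in generate_lora.py (the lora context manager and LLaMA.from_name); the hyperparameters and checkpoint paths below are hypothetical:

    # Sketch of the merge/export steps above; module paths mirror
    # generate_lora.py but are assumptions, not the project's exact API.
    import torch
    from lit_llama import LLaMA
    from lit_llama.lora import lora

    pretrained_path = "checkpoints/lit-llama/7B/lit-llama.pth"   # hypothetical path
    lora_path = "out/lora/alpaca/lit-llama-lora-finetuned.pth"   # hypothetical path
    merged_path = "out/lora/alpaca/lit-llama-merged.pth"         # hypothetical path

    # 1) build the LoRA-wrapped model and load base + LoRA weights
    with lora(r=8, alpha=16, dropout=0.05, enabled=True):
        model = LLaMA.from_name("7B")
        model.load_state_dict(torch.load(pretrained_path), strict=False)
        model.load_state_dict(torch.load(lora_path), strict=False)

    # 2) eval() merges the LoRA update into the regular weights
    model.eval()

    # 3)-5) take the state dict, drop the LoRA factors, and save a
    # checkpoint that matches the plain base-model architecture
    state = model.state_dict()
    state = {k: v for k, v in state.items() if "lora_" not in k}
    torch.save(state, merged_path)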


@awaelchli Could you provide some direction? I would be happy to open a PR for this feature; I think it would be great to have!

I don't understand how running model.eval() merges the weights.

The only difference between the LoRA model and the base model is:

      (c_attn): Linear(in_features=4096, out_features=12288, bias=False)

versus

      (c_attn): MergedLinear(
        in_features=4096, out_features=12288, bias=False
        (lora_dropout): Dropout(p=0.05, inplace=False)
      )

When I print out the model's parameters, the snippet for one layer is:

transformer.h.4.attn.c_attn.weight
transformer.h.4.attn.c_attn.lora_A
transformer.h.4.attn.c_attn.lora_B
So does model.eval() somehow merge weight with the delta lora_A @ lora_B? I can't figure out where in the code that happens.
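
For reference, in loralib-style LoRA layers the merge lives in the layer's train() override: model.eval() calls train(False) on every submodule, and that is where weight += lora_B @ lora_A (scaled) is applied. A simplified, self-contained sketch of the mechanism, not the exact lit-llama MergedLinear code:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Linear):
        """Simplified illustration of how eval()/train() merges LoRA weights."""

        def __init__(self, in_features, out_features, r=8, lora_alpha=16, **kwargs):
            super().__init__(in_features, out_features, **kwargs)
            self.lora_A = nn.Parameter(torch.zeros(r, in_features))
            self.lora_B = nn.Parameter(torch.zeros(out_features, r))
            self.scaling = lora_alpha / r
            self.merged = False

        def train(self, mode: bool = True):
            super().train(mode)
            delta = (self.lora_B @ self.lora_A) * self.scaling
            if not mode and not self.merged:
                # eval(): fold the low-rank update into the dense weight
                self.weight.data += delta
                self.merged = True
            elif mode and self.merged:
                # train(): undo the merge so A and B are updated separately again
                self.weight.data -= delta
                self.merged = False
            return self

        def forward(self, x):
            out = super().forward(x)
            if not self.merged:
                # while unmerged, apply the low-rank update on the fly
                out = out + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
            return out

    layer = LoRALinear(4096, 12288, bias=False)
    layer.eval()   # folds lora_B @ lora_A into layer.weight

After model.eval(), the merged weight is all you need to export; the lora_A / lora_B entries can then be dropped from the state dict as described above.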
