
Deploying the model after LoRA fine-tuning #196

Closed
lzw12138 opened this issue Nov 21, 2023 · 7 comments · Fixed by #271

@lzw12138
Why is it that after I finish LoRA training, merge the adapter into a single model, and convert it to a bin file, the result shows no effect? It behaves the same as the model without fine-tuning.

@li-plus
Owner

li-plus commented Nov 22, 2023

After merging the LoRA weights, could you first try loading the merged model with PyTorch for inference and check whether the results match expectations?

@lzw12138
Author

lzw12138 commented Nov 22, 2023

> After merging the LoRA weights, could you first try loading the merged model with PyTorch for inference and check whether the results match expectations?

The results do match expectations. I'm using chatglm3.

@lzw12138
Author

> After merging the LoRA weights, could you first try loading the merged model with PyTorch for inference and check whether the results match expectations?

I trained the LoRA through the LLaMA-Factory project and exported the merged model; PyTorch inference matches expectations. But after converting it to a q4_0 model with chatglm.cpp, the fine-tuning effect disappears entirely. Is this a quantization problem or a problem with my training?
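For reference, here is a minimal sketch of what q4_0-style 4-bit block quantization does to the weights (assuming one floating-point scale per 32-value block; this is an illustration, not chatglm.cpp's exact storage layout). The round-trip error stays within half a quantization step per block, so quantization alone should degrade a fine-tune gradually, not erase it outright:

```python
import numpy as np

def quantize_4bit(x, block_size=32):
    # Simplified symmetric 4-bit block quantization, in the spirit of
    # ggml's q4_0: one scale per 32-weight block, integers in [-7, 7].
    x = x.reshape(-1, block_size)
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0  # per-block scale
    scale[scale == 0] = 1.0                             # all-zero blocks
    q = np.round(x / scale)
    return q, scale

def dequantize(q, scale):
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=1024)   # typical weight magnitudes
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()
# Rounding error is at most scale/2 per block, i.e. small relative to
# the weights themselves.
print(err)
```

If the fine-tuning effect vanishes completely rather than getting slightly noisier, the adapter weights were most likely never merged during conversion, rather than destroyed by quantization.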

@wanshichenguang
> I trained the LoRA through the LLaMA-Factory project and exported the merged model; PyTorch inference matches expectations. But after converting it to a q4_0 model with chatglm.cpp, the fine-tuning effect disappears entirely. Is this a quantization problem or a problem with my training?

I ran into this problem as well. Did you ever solve it?

@mayzhang16
I have this problem too; the -l option has no effect. Also, merging the fine-tuned model in PyTorch first and then converting gives very poor results. During conversion it prints: Some weights of the model checkpoint at /opt/model_all_lora_0824/ were not used when initializing ChatGLMForConditionalGeneration: ['lm_head.weight']

  • This IS expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

@logan0116
I ran into a similar issue. Looking at the code, line 537 reads:

    with open(args.save_path, "wb") as f:
        convert(f, args.model_name_or_path, dtype=args.type)

This call does not pass the lora_model_name_or_path argument. [facepalm]

It can be changed to:

    with open(args.save_path, "wb") as f:
        convert(f, args.model_name_or_path, lora_model_name_or_path=args.lora_model_name_or_path, dtype=args.type)

I also added at line 486:

    print('lora load success')

so you can see whether the LoRA weights were actually loaded.

My test task was named entity recognition, and the effect of fine-tuning is quite noticeable.
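For context, the merge the converter is expected to perform is the standard LoRA update W' = W + (alpha/r)·B·A. The sketch below uses made-up shapes and plain numpy to show why dropping the adapter argument reproduces the symptom in this thread; it is not the repo's actual code:

```python
import numpy as np

# Standard LoRA merge: W' = W + (alpha / r) * B @ A.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 48, 8, 16  # illustrative shapes only

W = rng.normal(0.0, 0.02, size=(d_out, d_in))  # base weight
A = rng.normal(0.0, 0.02, size=(r, d_in))      # LoRA down-projection
B = rng.normal(0.0, 0.02, size=(d_out, r))     # LoRA up-projection

delta = (alpha / r) * (B @ A)                  # low-rank update
W_merged = W + delta

# If convert() never receives lora_model_name_or_path, delta is silently
# dropped and the exported model is numerically identical to the base
# model -- exactly the "fine-tuning has no effect" symptom.
assert not np.allclose(W_merged, W)
```

This is why passing the adapter path through to convert, as in the fix above, restores the fine-tuned behavior.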

@li-plus
Owner

li-plus commented Mar 6, 2024

Thanks for pointing this out! Fixed in #271.
