Is the ChatGLM reward model currently missing? #28

ymyjl · 2023-06-20T08:34:11Z

I checked to make sure that this is not a duplicate issue
I'm submitting the request to the correct repository (for model requests, see here)

When I run ChatGLM, the RM (reward modeling) step fails with a KeyError. I'd like to ask the author: is it currently not possible to train a ChatGLM reward model?

shibing624 · 2023-06-21T02:40:02Z

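Right, a ChatGLM reward model is not supported. ChatGLM does not implement a ModelForSequenceClassification class, and its inference code has not been uploaded to the official transformers library. I have not written a separate ChatglmModelForSequenceClassification implementation for it either.

The reason is that I think ChatGLM used as a reward model might even perform worse than RoBERTa, and inference would be much slower.

Suggestion: for the reward model, it works better to call the ChatGPT API directly to do the scoring; GPT-4 is better still if you can use it.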
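
The API-scoring suggestion can be sketched roughly as follows. This is only a minimal illustration, not code from this repository: `PROMPT_TEMPLATE`, `score_with_llm`, the 1 to 10 scale, and the `ask_llm` callable are all hypothetical names and choices. `ask_llm` stands in for whatever client function sends a prompt to ChatGPT or GPT-4 and returns the reply text.

```python
import re
from typing import Callable

# Hypothetical judging prompt; the wording and the 1-10 scale are assumptions.
PROMPT_TEMPLATE = (
    "Rate the assistant's answer to the question on a scale from 1 to 10.\n"
    "Reply with the number only.\n\n"
    "Question: {question}\n"
    "Answer: {answer}\n"
    "Score:"
)


def score_with_llm(
    question: str,
    answer: str,
    ask_llm: Callable[[str], str],  # hypothetical: sends a prompt, returns reply text
    low: float = 1.0,
    high: float = 10.0,
) -> float:
    """Return a reward in [0, 1] parsed from the judge model's reply."""
    reply = ask_llm(PROMPT_TEMPLATE.format(question=question, answer=answer))
    # Take the first number that appears in the reply.
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no numeric score in reply: {reply!r}")
    # Clamp to the expected range, then rescale to [0, 1].
    raw = min(max(float(match.group()), low), high)
    return (raw - low) / (high - low)
```

In practice `ask_llm` would wrap a real API client; keeping it as a plain callable makes the parsing and rescaling easy to test offline with a stub.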

ymyjl · 2023-06-21T08:02:08Z

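Thank you very much for the reply. One more question: if I still use ChatGLM, the stage 4 RL step can't be run either, right? Even if stage 3 can use some other SequenceClassification model for scoring, ChatGLM still can't handle stage 4, because the model has no SequenceClassification class.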

ymyjl · 2023-06-21T08:14:51Z

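Oh, right: would it be possible to write a SequenceClassification class myself? The ChatGLMPreTrainedModel implementation does exist. Could I follow the pattern of LlamaForSequenceClassification(LlamaPreTrainedModel) and add a classification head on top? I'm fairly inexperienced, so I'd appreciate a high-level opinion from the author on whether this is feasible. Thank you very much!
That said, stage 4 also seems to involve AutoModelForCausalLMWithValueHead, which doesn't exist for ChatGLM either...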

shibing624 · 2023-06-21T08:34:23Z

You can write SequenceClassification yourself;
AutoModelForCausalLMWithValueHead can be written yourself too; just modify the trl trainer.
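
As a rough illustration of what "writing it yourself" could look like, here is a minimal PyTorch sketch of the two heads discussed above. It is not ChatGLM code and not the trl implementation: `TinyBackbone` is a hypothetical stand-in for a decoder that returns hidden states, and `RewardHeadModel` and `ValueHeadModel` are made-up names. The reward head pools the hidden state of the last non-padding token, in the style of LlamaForSequenceClassification, and assumes right padding; with left padding the last position would be used instead.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a decoder returning (batch, seq, hidden) states."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(input_ids)


class RewardHeadModel(nn.Module):
    """Backbone plus a linear head giving one scalar score per sequence."""

    def __init__(self, backbone: nn.Module, hidden_size: int, pad_token_id: int = 0):
        super().__init__()
        self.backbone = backbone
        self.score = nn.Linear(hidden_size, 1, bias=False)
        self.pad_token_id = pad_token_id

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(input_ids)  # (batch, seq, hidden)
        # Index of the last non-padding token; assumes right padding.
        lengths = (input_ids != self.pad_token_id).sum(dim=1)
        last = (lengths - 1).clamp(min=0)
        batch_idx = torch.arange(input_ids.size(0), device=input_ids.device)
        pooled = hidden[batch_idx, last]  # (batch, hidden)
        return self.score(pooled).squeeze(-1)  # (batch,)


class ValueHeadModel(nn.Module):
    """Backbone plus a per-token scalar value head, as needed for PPO-style RL."""

    def __init__(self, backbone: nn.Module, hidden_size: int):
        super().__init__()
        self.backbone = backbone
        self.v_head = nn.Linear(hidden_size, 1)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(input_ids)  # (batch, seq, hidden)
        return self.v_head(hidden).squeeze(-1)  # (batch, seq)
```

For a real model, the backbone would be the pretrained ChatGLM transformer, and a full value-head wrapper would also need to return the language-model logits that the trl trainer expects.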

ymyjl added the enhancement (New feature or request) label Jun 20, 2023

This was referenced Jul 6, 2023

If ChatGLM-6B is used as the base model in Stages 1 and 2, how should Stage 3 reward model training be configured? #36

Closed

ChatGLM raises an error at the reward model stage, any pointers? #74

Closed

shibing624 closed this as completed Sep 5, 2023
