(Probably a bug): chatglm3 does not prepend <|assistant|> during agent tuning data preprocessing #2421

Closed
1 task done
13269279918 opened this issue Feb 4, 2024 · 7 comments
Labels
solved This problem has been already solved.

Comments

@13269279918

Reminder

  • I have read the README and searched the existing issues.

Reproduction

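In the data, a turn looks like {"from": "gpt", "content": "xxx"}, where gpt should correspond to the assistant role.
However, the preprocessing code defines
format_assistant=StringFormatter(slots=["\n", "{{content}}"]),
which does not prepend <|assistant|> before {{content}}.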

Expected behavior

No response

System Info

No response

Others

No response

@hiyouga
Owner

hiyouga commented Feb 4, 2024

It is already appended:

format_user=StringFormatter(slots=[{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]),
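To see why the token at the end of format_user covers the following assistant turn, here is a minimal sketch. The `render` helper is a hypothetical stand-in for slot rendering, not LLaMA-Factory's actual StringFormatter; only the two slot lists are taken from this thread.

```python
# Hypothetical stand-in for slot rendering; NOT LLaMA-Factory's StringFormatter.
def render(slots, content):
    """Join slots into one string.

    Dict slots contribute their special token; string slots have the
    {{content}} placeholder filled in.
    """
    parts = []
    for slot in slots:
        if isinstance(slot, dict):
            parts.append(slot["token"])
        else:
            parts.append(slot.replace("{{content}}", content))
    return "".join(parts)


# Slot lists copied from the thread.
format_user = [{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]
format_assistant = ["\n", "{{content}}"]

# A user turn followed directly by an assistant turn.
prompt = render(format_user, "Hi") + render(format_assistant, "Hello!")
print(repr(prompt))  # '<|user|>\nHi<|assistant|>\nHello!'
```

The assistant text ends up behind <|assistant|> even though format_assistant never emits that token itself, because the preceding user turn already did.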

@hiyouga hiyouga added the invalid This doesn't seem right label Feb 4, 2024
@hiyouga hiyouga closed this as completed Feb 4, 2024
@13269279918
Author

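It is already appended:

format_user=StringFormatter(slots=[{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]),

It is appended after format_user, but not after format_observation. Run it with the glaive dataset and look at the printed data sample; the problem is visible there.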

@13269279918
Author

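The <|assistant|> after format_user is reserved for the function_call turn. A <|assistant|> should likewise be reserved after the observation for the following gpt turn; otherwise the final concatenated data is wrong.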
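The effect described above can be reproduced with a small sketch. Everything here is illustrative: `render` and `build_dialog` are hypothetical helpers, the two observation slot lists are assumptions about the template before and after a fix (the thread does not show them), the function_call turn is rendered with the assistant slots for simplicity, and the dialog content is made up.

```python
# Hypothetical stand-in for slot rendering; NOT LLaMA-Factory's StringFormatter.
def render(slots, content):
    """Concatenate special tokens (dict slots) and filled-in string slots."""
    return "".join(
        slot["token"] if isinstance(slot, dict) else slot.replace("{{content}}", content)
        for slot in slots
    )


# Slot lists quoted in the thread.
USER = [{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]
ASSISTANT = ["\n", "{{content}}"]

# ASSUMED observation slots: without and with a trailing <|assistant|>.
OBSERVATION_BEFORE = [{"token": "<|observation|>"}, "\n", "{{content}}"]
OBSERVATION_AFTER = OBSERVATION_BEFORE + [{"token": "<|assistant|>"}]


def build_dialog(observation_slots):
    """user -> function_call -> observation -> gpt, with made-up content."""
    return (
        render(USER, "Weather in Paris?")
        + render(ASSISTANT, 'get_weather\n{"city": "Paris"}')  # function_call turn
        + render(observation_slots, '{"temp": 18}')
        + render(ASSISTANT, "It is 18 degrees in Paris.")  # final gpt turn
    )


broken = build_dialog(OBSERVATION_BEFORE)
fixed = build_dialog(OBSERVATION_AFTER)

# Only the fixed variant places <|assistant|> in front of the final gpt turn.
print("<|assistant|>\nIt is 18" in broken)  # False
print("<|assistant|>\nIt is 18" in fixed)   # True
```

Without the trailing token, the final gpt answer is glued directly onto the observation text, so the model never sees an assistant marker before the turn it is supposed to learn.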

@hiyouga hiyouga added solved This problem has been already solved. and removed invalid This doesn't seem right labels Feb 4, 2024
@hiyouga hiyouga reopened this Feb 4, 2024
@hiyouga
Owner

hiyouga commented Feb 4, 2024

Thanks for pointing this out; it has been fixed.

@hiyouga hiyouga closed this as completed in 3dc86c4 Feb 4, 2024
@13269279918
Author

13269279918 commented Feb 5, 2024

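Thanks for pointing this out; it has been fixed.

I suggest carefully reviewing the agent-tool training code. There are probably many more hidden problems in this part, because the training results are very poor.
For example, I ran into this problem: in the output, a space is added after every character. (I have not found the cause yet.)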

@hiyouga
Owner

hiyouga commented Feb 5, 2024

Could you describe it in more detail?

@cgq0816

cgq0816 commented Mar 20, 2024

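The <|assistant|> after format_user is reserved for the function_call turn. A <|assistant|> should likewise be reserved after the observation for the following gpt turn; otherwise the final concatenated data is wrong.

Hello, can the model you trained actually make function calls? The one I trained cannot. If yours can, which training parameters did you use?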

sangttruong pushed a commit to painkillernhat/LLaMA-Factory that referenced this issue May 9, 2024

3 participants