
Could a joint-training feature be added? #4921

Closed
1 task done
zhengjie-zhou opened this issue Jul 22, 2024 · 2 comments
Labels
solved This problem has been already solved

Comments

@zhengjie-zhou

Reminder

  • I have read the README and searched the existing issues.

System Info

no

Reproduction

no

Expected behavior

The project currently integrates training methods such as DPO, PPO, KTO, and SFT. Could a feature be added to combine them, for example $L = \alpha \cdot L_{SFT} + \beta \cdot L_{DPO}$, where $\alpha$ and $\beta$ are hyperparameters?
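The proposed weighted combination can be sketched in plain Python. This is illustrative only: `sft_loss`, `dpo_loss`, and `combined_loss` are toy scalar functions invented for this sketch, not LLaMA-Factory APIs.

```python
import math

def sft_loss(logprob_chosen: float) -> float:
    # SFT term: negative log-likelihood of the preferred response.
    return -logprob_chosen

def dpo_loss(logratio_chosen: float, logratio_rejected: float,
             dpo_beta: float = 0.1) -> float:
    # Standard DPO objective on scalar policy/reference log-ratios:
    # -log sigmoid(dpo_beta * (chosen log-ratio - rejected log-ratio)).
    margin = dpo_beta * (logratio_chosen - logratio_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def combined_loss(logprob_chosen: float, logratio_chosen: float,
                  logratio_rejected: float,
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    # L = alpha * L_SFT + beta * L_DPO, with alpha and beta as hyperparameters.
    return (alpha * sft_loss(logprob_chosen)
            + beta * dpo_loss(logratio_chosen, logratio_rejected))
```

With `alpha = 0` this degenerates to pure DPO and with `beta = 0` to pure SFT, which is the usual sanity check for such a mixed objective.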

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jul 22, 2024
@hiyouga
Owner

hiyouga commented Jul 22, 2024

See the pref_ftx parameter.

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jul 22, 2024
@hiyouga hiyouga closed this as completed Jul 22, 2024
@zhengjie-zhou
Author

zhengjie-zhou commented Jul 22, 2024

> See the pref_ftx parameter.

pref_ftx: float = field(
    default=0.0,
    metadata={"help": "The supervised fine-tuning loss coefficient in DPO training."},
)
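The help string above suggests that pref_ftx is a single coefficient on an SFT term inside the DPO objective. A minimal sketch of that reading (hypothetical, assuming scalar inputs; this is not the library's actual implementation):

```python
import math

def dpo_with_ftx(dpo_margin: float, sft_nll: float,
                 pref_ftx: float = 0.0) -> float:
    # Reading of the help string: total = L_DPO + pref_ftx * L_SFT,
    # so the default pref_ftx = 0.0 recovers plain DPO.
    dpo_term = -math.log(1.0 / (1.0 + math.exp(-dpo_margin)))
    return dpo_term + pref_ftx * sft_nll
```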
Then if I want to combine DPO and KTO for joint training, how should I adjust things? @hiyouga

2 participants