GRPO Web-UI #4285
swift/ui/llm_grpo/llm_grpo.py
Outdated
        return gr.update(open=True), gr.update(visible=True)

    @classmethod
    def train(cls, *args):
Can part of this be reused from the training code?
It now reuses llm_train.
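A minimal sketch of the kind of reuse discussed in this thread, assuming the shared training UI exposes `train` as a classmethod that a GRPO-specific subclass can extend. `BaseTrainUI`, `GRPOTrainUI`, and `build_kwargs` are hypothetical names for illustration only; they are not the actual classes or methods in swift/ui.

```python
class BaseTrainUI:
    """Hypothetical stand-in for the shared training UI class."""

    @classmethod
    def build_kwargs(cls, **form_values):
        # Drop form fields the user left empty.
        return {k: v for k, v in form_values.items() if v is not None}

    @classmethod
    def train(cls, **form_values):
        # A real UI would launch training here; this sketch only
        # returns the collected arguments so the flow is visible.
        return cls.build_kwargs(**form_values)


class GRPOTrainUI(BaseTrainUI):
    """Hypothetical GRPO tab: inherits train() and only adjusts the arguments."""

    @classmethod
    def build_kwargs(cls, **form_values):
        kwargs = super().build_kwargs(**form_values)
        # The algorithm is fixed for this tab, so it is set in code
        # instead of being exposed as a separate form field.
        kwargs['rlhf_type'] = 'grpo'
        return kwargs


print(GRPOTrainUI.train(model='my-model', dataset=None))
```

With this layout the subclass never redefines `train`; it overrides only the step that differs.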
swift/ui/llm_grpo/llm_grpo.py
Outdated
},
'rlhf_type': {
    'label': {
        'zh': '人类对齐算法类型',
Is this still needed for GRPO?
Removed.
swift/ui/llm_grpo/llm_grpo.py
Outdated
    }
},
'train_stage': {
    'label': {
What is this for?
Same as rlhf_type above; removed.
swift/ui/llm_grpo/model.py
Outdated
},
'clear_cache': {
    'value': {
        'zh': '删除训练记录',
Can this inherit from llm_train?
It now inherits from llm_train.model.Model.
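One way such inheritance could look, sketched under assumptions: the parent component keeps its UI strings in a class-level dictionary, and the child copies it before adding its own entries. `TrainModel`, `GRPOModel`, `locale_dict`, and the `grpo_only_field` entry are hypothetical; only the `clear_cache` key and its `zh` string come from the diff above.

```python
import copy


class TrainModel:
    """Hypothetical stand-in for the shared training Model component."""

    locale_dict = {
        'clear_cache': {
            'value': {
                'zh': '删除训练记录',
            }
        },
    }


class GRPOModel(TrainModel):
    """Hypothetical GRPO component that inherits the parent's strings."""

    # Deep-copy so that adding GRPO-only entries does not mutate the parent.
    locale_dict = copy.deepcopy(TrainModel.locale_dict)
    locale_dict['grpo_only_field'] = {
        'label': {
            'en': 'Placeholder label for a GRPO-specific field',
        }
    }


print(sorted(GRPOModel.locale_dict))
```

The deep copy matters because class-level dictionaries are shared by reference; updating the inherited dictionary in place would also change the parent's entries.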
swift/utils/constants.py
Outdated
SWIFT_TYPE_KEY = 'swift_type'
DEFAULT_ADAPTER = 'default'


DEFAULT_SYSTEM = ('A conversation between User and Assistant. The user asks a question, and the Assistant solves it. '
Don't put DEFAULT_SYSTEM here; put it in the ui code instead.
Removed.
PR type
PR information
Support GRPO in the web UI.