Open
Labels
enhancement: New feature or request
Description
Describe the feature
Please describe the feature requested here.
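After training a reward model (RM), I would like to be able to deploy it with an inference framework. This would decouple the RM from the training process, so that GRPO/PPO training could point to the RM service through a `reward_url` setting.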
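To make the request concrete, here is a minimal sketch of what the trainer-side call could look like. It is an illustration only: the function `fetch_rewards`, the JSON fields `prompts`, `completions`, and `rewards`, and the example URL are all hypothetical and are not an existing API of this project or of any inference framework.

```python
import json
from typing import Callable, List, Sequence
from urllib import request


def _http_post_json(url: str, payload: dict, timeout: float = 30.0) -> dict:
    """POST a JSON payload and decode the JSON response (standard library only)."""
    data = json.dumps(payload).encode("utf-8")
    req = request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_rewards(
    reward_url: str,
    prompts: Sequence[str],
    completions: Sequence[str],
    post: Callable[[str, dict], dict] = _http_post_json,
) -> List[float]:
    """Ask a remote RM service for one scalar reward per completion.

    Assumed (hypothetical) schema:
      request:  {"prompts": [...], "completions": [...]}
      response: {"rewards": [float, ...]}
    """
    if len(prompts) != len(completions):
        raise ValueError("prompts and completions must have the same length")
    payload = {"prompts": list(prompts), "completions": list(completions)}
    body = post(reward_url, payload)
    rewards = [float(r) for r in body["rewards"]]
    if len(rewards) != len(completions):
        raise ValueError("RM service returned a wrong number of rewards")
    return rewards
```

With something like this, the GRPO/PPO loop would only need the value of `reward_url` (for example `http://localhost:8000/score`, again a made-up address), and the RM could be scaled or swapped independently of the trainer. The `post` parameter exists only so the transport can be replaced, for instance in tests.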
Paste any useful information
Paste any useful information, including papers, GitHub links, etc.
Additional context
Add any other context or information here.