@mi804 mi804 commented Feb 25, 2025

PR type

  • Bug Fix
  • Document Updates
  • More Models or Datasets Support

PR information

  1. Add GRPO countdown task training experiments.
  2. Fix a bug in the format reward.
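
For readers unfamiliar with the two reward signals mentioned above, the following is a minimal sketch of what a format reward and a countdown accuracy reward might look like. It is not the code from this PR: the `<think>`/`<answer>` template, the function names `format_reward` and `countdown_reward`, and the scoring values are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed template (hypothetical): reasoning in <think>, result in <answer>.
FORMAT_RE = re.compile(r"<think>.*?</think>\s*<answer>.*?</answer>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)
# Only digits, whitespace, + - * / and parentheses may appear in an equation.
ALLOWED_RE = re.compile(r"[\d\s+\-*/()]+")


def format_reward(completions):
    """Return 1.0 for each completion that matches the assumed template, else 0.0."""
    return [1.0 if FORMAT_RE.fullmatch(c.strip()) else 0.0 for c in completions]


def countdown_reward(completion, numbers, target):
    """Return 1.0 if the answer uses each number exactly once and equals target."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0
    # Keep only the left-hand side if the model wrote "expr = target".
    equation = match.group(1).split("=")[0].strip()
    if not equation or not ALLOWED_RE.fullmatch(equation):
        return 0.0
    # Reject power and floor division, which the character filter alone allows.
    if "**" in equation or "//" in equation:
        return 0.0
    used = [int(n) for n in re.findall(r"\d+", equation)]
    if Counter(used) != Counter(numbers):
        return 0.0
    try:
        # The input is restricted to arithmetic characters, and builtins are disabled.
        value = eval(equation, {"__builtins__": {}}, {})
        return 1.0 if abs(value - target) < 1e-6 else 0.0
    except Exception:
        return 0.0
```

Under these assumptions, `countdown_reward("<think>...</think><answer>(3 + 5) * 2</answer>", [2, 3, 5], 16)` scores the completion as correct, while an answer that reuses or omits a number scores zero.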

numbers = row['nums']
target = row.pop('response', None)
query = f"""
Using the numbers {numbers}, create an equation that equals {target}.
Collaborator

The line break here

Collaborator Author

Fixed
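
The thread does not show which line break the reviewer meant or how it was fixed. One plausible reading is the newline that immediately follows the opening `"""` of the prompt. The sketch below, using made-up sample values for `numbers` and `target`, shows how that newline ends up in the string and one common way to avoid it.

```python
numbers, target = [2, 3, 5], 16  # sample values, not taken from the dataset

# Opening the literal with a line break puts "\n" at the start of the prompt.
with_newline = f"""
Using the numbers {numbers}, create an equation that equals {target}."""

# A backslash right after the opening quotes suppresses that line break.
without_newline = f"""\
Using the numbers {numbers}, create an equation that equals {target}."""

print(repr(with_newline[:8]))
print(repr(without_newline[:8]))
```

Calling `.strip()` on the finished string is another option when leading and trailing whitespace should both be dropped.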

@mi804 mi804 merged commit dde4f09 into modelscope:main Feb 25, 2025
1 of 2 checks passed
tastelikefeet added a commit to tastelikefeet/swift that referenced this pull request Feb 26, 2025
…soth_fast_grpo

* commit 'df8939818d2b3694d14120d8fb07eea96e5b99a8': (24 commits)
  GRPO+LMDeploy 0.7 (modelscope#3277)
  fix lmdeploy (modelscope#3274)
  compat lmdeploy 0.7 (modelscope#3256)
  Fix typos (modelscope#3266)
  Support the base64 format of generated images for JanusPro (modelscope#3265)
  grpo_countdown & fix format reward (modelscope#3269)
  fix grpo compat transformers==4.47.* (modelscope#3252)
  save val_dataset (modelscope#3248)
  fix  grpo single gpu(modelscope#3246)
  fix grpo npu vllm (modelscope#3242)
  update docs (modelscope#3243)
  support muon optimizer (modelscope#3234)
  support moonlight (modelscope#3232)
  fix deepseek_vl2 (modelscope#3233)
  fix docs zh (modelscope#3231)
  Speed up GRPO (modelscope#3229)
  update docs (modelscope#3230)
  fix load args (modelscope#3226)
  Update the JanusPro-generation (modelscope#3221)
  Support the generation of JanusPro models (modelscope#3218)
  ...