Fix the random handoff and change default param #14

LovelyBuggies · 2025-09-24T16:55:08Z

No description provided.

* make random default
* reset train num and levelfeedback as default
* delete files no use
* fix che train too less
* Config and external overhaul:
  - Introduce unified external section (mode, sandbox_slice, original/previous, expert_model).
  - Default external.mode=level_feedback; sandbox_slice=1 (supports 0/None/'all').
  - Handoff handled in CoMLRL trainer with strict modes; expose magrpo/grpo.handoff.
  - Update HumanEval/CHE splits (HE train 33:163, eval :32; CHE train 16:, eval :16).
  - Set output.save_final_model=false by default.
  - Set wandb.dir and output.base_dir to storage paths by trainer/mode:
    * ST GRPO: output_st_grpo, ST MAGRPO: output_st_magrpo
    * MT GRPO: output_mt_grpo, MT MAGRPO: output_mt_magrpo
  - Rename expert model key to external.expert_model (used only for expert_edits).
  - Simplify YAML comments to section headers only.
  - Read external.* in train_magrpo.py and train_grpo.py; defaults adjusted.
  - README: clarify external keys and sandbox_slice semantics.
* Remove unnecessary try/except; robust sandbox_slice parsing without exceptions; minimal default tags
* Fix: define external_cfg before use; remove duplicate assignments in train_magrpo.py and train_grpo.py
* Fix: handle dataset load failure in train_magrpo.py (return early) to avoid UnboundLocalError
* Configs: reduce num_train_epochs by 20% (rounded) across all YAMLs
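For readers who have not opened the diff, the "robust sandbox_slice parsing without exceptions" item could look roughly like the sketch below. This is not the code from the PR: the function name `parse_sandbox_slice` is hypothetical, and the sketch assumes that 0, None, and 'all' all mean "no limit" and that unparseable values fall back to the default of 1.

```python
from typing import Optional, Union


def parse_sandbox_slice(
    value: Union[int, str, None], default: Optional[int] = 1
) -> Optional[int]:
    """Normalize a sandbox_slice setting without relying on try/except.

    Returns an int slice size, or None meaning "no limit".
    Assumption (not confirmed by the PR): 0, None, and 'all' mean "no limit".
    """
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is a subclass of int; treat it as an invalid setting.
        return default
    if isinstance(value, int):
        return None if value == 0 else value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in ("all", "none", ""):
            return None
        # Validate before converting, so int() cannot raise.
        digits = text[1:] if text[:1] in ("+", "-") else text
        if digits.isascii() and digits.isdigit():
            number = int(text)
            return None if number == 0 else number
    # Anything else (floats, lists, junk strings) falls back to the default.
    return default


# Hypothetical usage with a config dict shaped like the external section.
external_cfg = {"mode": "level_feedback", "sandbox_slice": "all"}
limit = parse_sandbox_slice(external_cfg.get("sandbox_slice", 1))
```

Checking the string with `isdigit()` before calling `int()` is one way to avoid exception handling; the actual trainer scripts may do this differently.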

LovelyBuggies added 9 commits September 22, 2025 17:21

make random default

8eceeec

reset train num and levelfeedback as default

abee26e

delete files no use

bae4a8d

fix che train too less

fc66a4c

Remove unnecessary try/except; robust sandbox_slice parsing without exceptions; minimal default tags

1c9996c

Fix: define external_cfg before use; remove duplicate assignments in train_magrpo.py and train_grpo.py

056dde1

Fix: handle dataset load failure in train_magrpo.py (return early) to avoid UnboundLocalError

be38668

Configs: reduce num_train_epochs by 20% (rounded) across all YAMLs

376ba9f
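The "return early" fix in be38668 addresses a common pattern: a variable assigned only inside a `try` block is referenced after a failed load, which raises UnboundLocalError. A minimal sketch of the pattern follows; `run_training` and `load_dataset_fn` are hypothetical names, not the actual contents of train_magrpo.py.

```python
from typing import Callable, Optional, Sequence


def run_training(load_dataset_fn: Callable[[], Sequence[dict]]) -> Optional[int]:
    """Load a dataset and return its size, or None if loading fails."""
    try:
        train_dataset = load_dataset_fn()
    except Exception as exc:
        print(f"Error loading dataset: {exc}")
        # Return early: without this, the code below would reference
        # train_dataset while it is still unassigned (UnboundLocalError).
        return None
    # Placeholder for the real training loop; here we only report the size.
    return len(train_dataset)


def failing_loader() -> Sequence[dict]:
    raise RuntimeError("dataset unavailable")


ok_result = run_training(lambda: [{"prompt": "def add(a, b):"}])
failed_result = run_training(failing_loader)
```

Returning a sentinel keeps the failure visible to the caller; re-raising the exception would be an equally reasonable design if the script should stop with a traceback.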

LovelyBuggies merged commit 9331ca2 into main Sep 24, 2025

LovelyBuggies deleted the new branch September 24, 2025 16:57
