v0.3.3
What's Changed
- Fix DDP for SFTTrainer by @kane-vln in #239
- Unify validation config by @Dinghow in #255
- Support of passing custom arguments to custom dataset script. by @foreverlms in #253
- fix: heartbeat opt for heavy cpu case. by @lfengad in #257
- Refactor: reset named_buffers in load_hf_weights by @kane-vln in #252
- fix:Remove transformer-engine dependence in requirement by @lfengad in #258
- [cleanup] Remove the old Deepseek-V3 implementation by @heslami in #260
- fix: reward filter in dynamic sampling + rollout outdate pause generation by @lfengad in #263
- [Fix] Deepseek V3 GRPO bug fix by @heslami in #259
- feat: Support epoch level save frequency by `save_freq_in_epoch` by @lfengad in #264
- InternVL sft support by @kane-vln in #254
- Revert "Support of passing custom arguments to custom dataset script.… by @gekurian in #267
- Fix the bug introduced in PR #252 by @gekurian in #268
- Support of passing custom arguments to custom dataset script by @foreverlms in #275
- Fallback to hfmodel pass if build model fails by @kane-vln in #276
- Fix: sync named_buffer for hfmodel in grpo mode by @kane-vln in #277
- fix: cpu intensive situation aware for controller and reward by @lfengad in #279
- fix: Lepton cross-node job sync for host preparation before start. by @lfengad in #278
- feat: SFT validation dataset and packer specification support by @lfengad in #281
Full Changelog: v0.3.2...v0.3.3