v0.3.2
What's Changed
- Use AutoModel in case architecture doesn't exist by @kane-vln in #236
- [1/n] Support DeepSeek V3 SFT by @heslami in #190
- [3/n] Support DeepSeek V3 GRPO / DeepSeek R1 by @heslami in #240
- fix: outdated import from leptonai by @xlu451 in #243
- Support trtllm-pytorch as the rollout backend by @foreverlms in #161
- feat: lora for grpo by @xlu451 in #222
- Fix: sync_model_vocab corner case by @kane-vln in #242
- feat: Restrict weight sync to trainable params by @lfengad in #238
- feat: custom logger support by @lfengad in #245
- fix: Refine logger so it is specified only in the data packer script by @lfengad in #246
- feat: data type control in weight transfer by @lfengad in #247
- fix: Remove cosmos-rl dependency in launch_all.py in normal mode by @lfengad in #248
- feat: sequence packing in training for optimization by @lfengad in #211
- fix: min_filter_prefix_tokens corner case by @jcao-ai in #251
New Contributors
Full Changelog: v0.3.1...v0.3.2