v0.2.2
New distillation recipes, multi-domain RL support, and Kimi K2 weight merging.
New features
- SDFT (Self-Distillation Fine-Tuning) recipe with top-K distillation (#524)
- Off-policy top-K distillation for multi-teacher knowledge merging (#572)
- Kimi K2 / K2.5 shard-by-shard merge with INT4 expert dequant/requant (#573)
- InterleavedRLDatasetBuilder for multi-domain RL training (#570)
- 22 marimo tutorials (101–503) (#562)
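
For readers new to the top-K distillation named in the two distillation entries above, the general idea can be sketched as follows. This is a generic illustration, not the code from #524 or #572: the function names `log_softmax` and `topk_distill_loss` are hypothetical, and the loss actually used by the recipes may differ. The sketch assumes the teacher's distribution is truncated to its K most likely tokens, renormalized, and used as the target of a cross-entropy loss on the student.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def topk_distill_loss(student_logits, teacher_logits, k):
    """Cross-entropy of the student against the teacher's top-k distribution.

    Hypothetical helper for illustration only; not an API of this library.
    """
    # Indices of the k tokens the teacher considers most likely.
    top = sorted(
        range(len(teacher_logits)),
        key=lambda i: teacher_logits[i],
        reverse=True,
    )[:k]
    # Renormalize the teacher's probability mass over those k tokens.
    teacher_top_logp = log_softmax([teacher_logits[i] for i in top])
    # Student log-probabilities are taken over the full vocabulary.
    student_logp = log_softmax(student_logits)
    return -sum(
        math.exp(t_lp) * student_logp[i]
        for t_lp, i in zip(teacher_top_logp, top)
    )


if __name__ == "__main__":
    teacher = [4.0, 2.0, 0.5, -1.0]
    print(topk_distill_loss([4.0, 2.0, 0.5, -1.0], teacher, k=2))
    print(topk_distill_loss([-1.0, 0.5, 2.0, 4.0], teacher, k=2))
```

Keeping only K teacher entries per position means the teacher's outputs can be stored as K (token id, log-probability) pairs instead of a full-vocabulary vector, which is the usual motivation for the truncation.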
Bug fixes
- Fix tokenizer_class corrupted to TokenizersBackend during export (#582)
- Fix experts-fp8 compression_config for non-DeepSeek models (#580)
- Fix InterleavedRLDataset crash on ragged last source batch (#574)
- Other minor bug fixes (#567, #577)
Other
- Extend SFT LR sweep to full Tinker model lineup (#575)
- Make image_processor optional for Qwen3VL renderers (#566)
See the full CHANGELOG.md for details.