v0.2.2

YujiaBao released this 02 Apr 19:33

41bccbf

New distillation recipes, multi-domain RL support, and Kimi K2 weight merging.

New features

SDFT (Self-Distillation Fine-Tuning) recipe with top-K distillation (#524)
Off-policy top-K distillation for multi-teacher knowledge merging (#572)
Kimi K2 / K2.5 shard-by-shard merge with INT4 expert dequant/requant (#573)
InterleavedRLDatasetBuilder for multi-domain RL training (#570)
22 marimo tutorials (101–503) (#562)
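
For readers new to the term, top-K distillation generally means training the student against only the teacher's K most probable tokens at each position, not against the full vocabulary distribution. The sketch below is a generic illustration and is not the recipe code from #524 or #572. The function name `topk_distill_loss`, its argument layout, and the choice to renormalize the truncated teacher distribution are all assumptions made for the example.

```python
import math

def topk_distill_loss(student_logits, teacher_topk):
    """Cross-entropy of a student against a teacher's renormalized top-K distribution.

    student_logits: list of floats, one per vocabulary id (hypothetical layout).
    teacher_topk:   dict {token_id: teacher_prob} with only the teacher's K best tokens.
    """
    # Stable log-partition over the student's full vocabulary.
    m = max(student_logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in student_logits))

    # Renormalize the truncated teacher distribution so it sums to 1 (an assumption).
    total = sum(teacher_topk.values())

    loss = 0.0
    for tok, p in teacher_topk.items():
        student_logprob = student_logits[tok] - log_z
        loss -= (p / total) * student_logprob
    return loss
```

With a uniform student over four tokens, the loss is `log(4)` regardless of which tokens the teacher keeps. It drops as the student moves probability mass onto the teacher's top-K tokens.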

Bug fixes

Fix tokenizer_class corrupted to TokenizersBackend during export (#582)
Fix experts-fp8 compression_config for non-DeepSeek models (#580)
Fix InterleavedRLDataset crash on ragged last source batch (#574)
Other minor bug fixes (#567, #577)
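
The release notes do not describe the cause of the crash fixed in #574, so the following is only a generic sketch of interleaving batches from several sources when they do not line up evenly. It is not the `InterleavedRLDataset` implementation. The helper name `interleave_batches` and the dict-of-lists input are hypothetical.

```python
from itertools import zip_longest

def interleave_batches(sources):
    """Round-robin over per-domain batch lists, skipping exhausted sources.

    sources: dict {domain_name: list of batches}, each batch a list of items.
    Yields (domain_name, batch) pairs. A shorter final batch is passed through
    unchanged rather than being padded or dropped.
    """
    names = list(sources)
    # zip_longest pads exhausted sources with None so the others keep going.
    for row in zip_longest(*(sources[name] for name in names)):
        for name, batch in zip(names, row):
            if batch is not None:
                yield name, batch
```

For example, a `math` source with batches `[[1, 2], [3]]` and a `code` source with one batch `[["a", "b"]]` yield the `math` batch, then the `code` batch, then the short final `math` batch.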

Other

Extend SFT LR sweep to full Tinker model lineup (#575)
Make image_processor optional for Qwen3VL renderers (#566)

See the full CHANGELOG.md for details.