v0.2.1
A patch release with weight-merging fixes and several smaller additions.
Bug fixes
- Fix `build_hf_model` for Nemotron fused Mamba projections and backbone prefix (#549)
- Fix `build_lora_adapter` for Nemotron empty expert LoRA and fused Mamba projections (#548)
- Fix `build_hf_model` for Qwen3.5 MoE with partial LoRA coverage (#554)
New features
- Auto-generated model cards for HuggingFace Hub publishing (#543)
- Interleaved multi-source SFT chat dataset builder (#557)
- Rolling checkpoints for cheaper training resume (#534)
Other
- Consistent Google-style docstrings across all public APIs (#559)
- End-to-end test for adapter upload to HuggingFace Hub (#544)
- Align synthetic expert adapters with real Tinker broadcast patterns (#556)
See the full CHANGELOG.md for details.