v0.2.1

@YujiaBao YujiaBao released this 29 Mar 04:11
· 123 commits to main since this release
5f4219e

A patch release with weight-merging fixes and several smaller additions.

Bug fixes

  • Fix build_hf_model for Nemotron fused Mamba projections and backbone prefix (#549)
  • Fix build_lora_adapter for Nemotron empty expert LoRA and fused Mamba projections (#548)
  • Fix build_hf_model for Qwen3.5 MoE with partial LoRA coverage (#554)
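For readers unfamiliar with the "partial LoRA coverage" case, here is a minimal sketch of the general idea of merging LoRA deltas into base weights when only some modules have an adapter. It is not the code of build_hf_model: the function merge_lora, its arguments, and the dict layout are hypothetical, and it uses plain Python lists rather than tensors to stay self-contained.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a small illustration."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(
    base: Dict[str, Matrix],
    adapters: Dict[str, Tuple[Matrix, Matrix]],
    scale: float,
) -> Dict[str, Matrix]:
    """Hypothetical helper: returns W + scale * (B @ A) for covered modules.

    Modules without an adapter entry (partial coverage) are copied unchanged.
    """
    merged: Dict[str, Matrix] = {}
    for name, weight in base.items():
        if name not in adapters:
            # No LoRA pair for this module: keep the base weight as-is.
            merged[name] = [row[:] for row in weight]
            continue
        lora_a, lora_b = adapters[name]  # A: (rank, in), B: (out, rank)
        delta = matmul(lora_b, lora_a)   # (out, in), same shape as weight
        merged[name] = [
            [w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)
        ]
    return merged
```

In this sketch a module missing from adapters is simply passed through, which is the behavior one would expect a merge routine to preserve when an adapter covers only part of the model.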

New features

  • Auto-generated model cards for HuggingFace Hub publishing (#543)
  • Interleaved multi-source SFT chat dataset builder (#557)
  • Rolling checkpoints for cheaper training resume (#534)
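As an illustration of what interleaving several SFT sources can look like, the sketch below does a simple round-robin over multiple iterables. This is an assumption about one possible interleaving strategy, not the dataset builder added in #557; the interleave function is hypothetical, and the actual builder may weight or shuffle sources differently.

```python
from itertools import zip_longest
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_MISSING = object()  # sentinel marking an exhausted source


def interleave(*sources: Iterable[T]) -> Iterator[T]:
    """Hypothetical helper: yield one item from each source in turn.

    Shorter sources drop out once exhausted; longer ones continue.
    """
    for group in zip_longest(*sources, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item
```

For example, `list(interleave(["a1", "a2", "a3"], ["b1"]))` yields the items in the order a1, b1, a2, a3.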

Other

  • Consistent Google-style docstrings across all public APIs (#559)
  • E2e test for adapter upload to HuggingFace Hub (#544)
  • Align synthetic expert adapters with real Tinker broadcast patterns (#556)

See the full CHANGELOG.md for details.