v0.4.0

@derek-tml released this 08 May 21:20
· 35 commits to main since this release
5306f45

What's Changed

  • Improve tic-tac-toe self-play recipe by @YujiaBao in #634
  • Extract tic-tac-toe opponents from env.py into tictactoe.py (#634 follow-up) by @YujiaBao in #647
  • Fix broken quickstart link in README by @YujiaBao in #649
  • Fix 8 broken tinker-docs links by @YujiaBao in #650
  • Consolidate 7 skills into 2 (research + debug) and fix plugin manifest by @YujiaBao in #651
  • Fix broken RL Environments link in tutorial 301 by @YujiaBao in #657
  • Bump transformers upper bound to 5.5.3, drop CI version matrix by @YujiaBao in #654
  • tutorials: add an optional text input for the Tinker API key by @akshayka in #656
  • fix(docs): update TokensWithLogprobs usage in tutorial to use maybe_logprobs by @tyfeng1997 in #658
  • fix: normalize SFT weights per-example for consistent gradient magnitudes by @bledden in #334
  • Add more runahead to supervised training loop by @sshleifer in #648
  • feat: add Nemotron-3 low thinking renderer for Super model by @YujiaBao in #667
  • fix(docs): correct TinkerAPI to Tinker API in recipes README by @Jah-yee in #669
  • feat: add support for Qwen3.6-35B-A3B by @YujiaBao in #674
  • feat: add Kimi-K2.6 model support by @YujiaBao in #675
  • sdft: reverse KL fix by @likenneth in #677
  • feat: add support for Qwen3.6-27B by @YujiaBao in #681
  • Unify checkpoint machinery by @danobi in #676
  • fix(pyright): disable reportPrivateImportUsage for torch by @YujiaBao in #686
  • [ckpt] Support fire-and-forget periodic checkpoints by @danobi in #687
  • Fix failing CI tests by @nealwu in #691
  • Updates for 101 Hello Tinker by @nealwu in #690
  • Replace parse_success bool with ParseTermination enum (fixes base-model eval regression) by @dphuang2 in #688
  • HarborReward._upload_tests: recurse into nested fixture dirs by @danqi in #693
  • evaluate_task: catch sandbox creation errors by @danqi in #694
  • Add cross-account checkpoint copy recipe by @derek-tml in #692
  • Empty commit to re-run CI by @nealwu in #696
  • Align Kimi K2.6 revision pin by @derek-tml in #697

Full Changelog: v0.3.0...v0.4.0