v0.2.0
229 commits since v0.1.0 — a major release spanning new model support, a weights lifecycle, RL robustness, and infrastructure maturation.
Highlights
Renderer overhaul
- Tool calling support across Qwen3, DeepSeek V3, Kimi K2 with new ToolSpec API
- Structured content model: ThinkingPart in content list replaces Message.thinking (breaking)
- Field renames: prefix/content/suffix → header/output/stop_overlap (breaking)
- Per-model module architecture; Renderer changed from Protocol to ABC
- New models: Nemotron-3, Qwen3.5, Kimi K2.5 (text + vision)
- Custom renderer/tokenizer registration
Weights lifecycle (new)
- New tinker_cookbook/weights/ subpackage: download, merge, publish
- Shard-by-shard merging for memory-efficient LoRA→base-model merge
- FP8 quantized export for MoE models
- PEFT-format adapter building for vLLM/HF serving
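
To illustrate why shard-by-shard merging keeps memory bounded, here is a minimal sketch of the general technique, not the code in tinker_cookbook/weights/. It assumes the standard LoRA update W' = W + scale * (B @ A), uses plain nested lists in place of tensors, and every name in it (merge_shard, merge_shards, the layer keys) is hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product; adequate for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_shard(
    shard: Dict[str, Matrix],
    lora: Dict[str, Tuple[Matrix, Matrix]],
    scale: float,
) -> Dict[str, Matrix]:
    """Return a new shard where each adapted weight is W + scale * (B @ A)."""
    merged: Dict[str, Matrix] = {}
    for name, weight in shard.items():
        if name in lora:
            a, b = lora[name]  # A is (rank x in), B is (out x rank)
            delta = matmul(b, a)
            weight = [
                [w + scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(weight, delta)
            ]
        merged[name] = weight
    return merged


def merge_shards(
    shards: Iterable[Dict[str, Matrix]],
    lora: Dict[str, Tuple[Matrix, Matrix]],
    scale: float,
) -> Iterator[Dict[str, Matrix]]:
    """Lazily merge one shard at a time so only one shard is resident."""
    for shard in shards:
        yield merge_shard(shard, lora, scale)


# Hypothetical two-shard checkpoint; only layer0 has an adapter.
shards = [
    {"layer0.weight": [[1.0, 0.0], [0.0, 1.0]]},
    {"layer1.weight": [[5.0]]},
]
lora = {"layer0.weight": ([[1.0, 2.0]], [[1.0], [0.0]])}  # (A, B), rank 1
merged = list(merge_shards(shards, lora, scale=2.0))
```

Because merge_shards is a generator, a caller could write each merged shard to disk before the next one is loaded, so peak memory tracks the largest shard rather than the full model.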
RL improvements
- Rollout error resilience — failures no longer crash the run
- Context limit handling in multi-turn environments
- Pluggable rollout executor for distributed rollouts
- ActionExtra for Env.step extensibility; EnvGroupBuilder.cleanup()
- Async training hang fix on data exhaustion
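
As a rough picture of what rollout error resilience means in practice, the sketch below collects failures instead of letting one of them abort the batch. It shows the general asyncio pattern only. The cookbook's actual mechanism is not described in these notes, and run_rollouts, ok_rollout, and failing_rollout are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def run_rollouts(
    rollout_fns: List[Callable[[], Awaitable[float]]],
) -> Tuple[List[float], List[BaseException]]:
    """Run all rollouts concurrently and separate rewards from failures."""
    # return_exceptions=True makes gather return raised exceptions as
    # results instead of propagating the first one to the caller.
    results = await asyncio.gather(
        *(fn() for fn in rollout_fns), return_exceptions=True
    )
    successes = [r for r in results if not isinstance(r, BaseException)]
    failures = [r for r in results if isinstance(r, BaseException)]
    return successes, failures


async def ok_rollout() -> float:
    """Hypothetical rollout that finishes and returns a reward."""
    return 1.0


async def failing_rollout() -> float:
    """Hypothetical rollout that fails partway through."""
    raise RuntimeError("sandbox timed out")


successes, failures = asyncio.run(
    run_rollouts([ok_rollout, failing_rollout, ok_rollout])
)
```

A training loop built this way can log the failures and continue with the surviving rollouts, which is the behavior the first bullet above describes.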
Supervised learning
- SFT hyperparameter sweep with published results for 3 models
- max_steps parameter, streaming dataset batch skip fix
Infrastructure & packaging
- hatch-vcs versioning + nightly builds
- Slimmed core dependencies (recipe extras separated)
- Centralized exception hierarchy with picklability
- Deprecation framework for API evolution
- PEP 561 py.typed marker; public API surface cleanup
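
For readers unfamiliar with the idea of a deprecation framework, the following is a minimal sketch of one common approach built on the standard warnings module. It is an assumption about the general shape, not the cookbook's actual API. The deprecated decorator, its parameters, and old_loader are all hypothetical.

```python
import functools
import warnings
from typing import Any, Callable


def deprecated(replacement: str, removed_in: str) -> Callable:
    """Mark a function as deprecated and point callers at its replacement."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            warnings.warn(
                f"{func.__name__} is deprecated and will be removed in "
                f"{removed_in}; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(replacement="new_loader", removed_in="a future release")
def old_loader(path: str) -> str:
    """Hypothetical legacy function kept for backward compatibility."""
    return f"loaded {path}"
```

The old function keeps working during the transition while each call emits a DeprecationWarning, which gives downstream code a release cycle to migrate before removal.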
New recipes
- Harbor RL (sandboxed terminal-bench), ifBench RLVR, tool-use agents library
- Multi-turn on-policy distillation, vision input, rubric-based eval
Environments & sandboxes
- Modal sandbox backend (warm pool, rate limiting, async)
- Configurable KL penalty reference model
- Pickle support for Renderer/Env (distributed execution)
Eval & logging
- Inspect AI improvements, renderer metadata persistence
- Logtree JSON + rollout summary JSONL exports
- Unified training telemetry with Wandb Gantt charts
- Per-iteration output subdirectories
Testing & CI
- Downstream API compatibility tests, weights e2e suite
- pytest markers, pyright CI, daily recipe smoke tests
See the full CHANGELOG.md for details.