@corbt corbt commented Jan 23, 2026

Summary

  • Add a separate TinkerNativeBackend that implements the native Tinker loss/checkpoint flow with renderer-based data conversion and an in-process OpenAI-compatible server.
  • Define a new tinker dependency group (fastapi/uvicorn/tinker/tinker-cookbook) and conditionally export the Tinker backends to keep base installs light (see the sketch after this list).
  • Align the LocalBackend model base_path with the backend path and add an integration test for the Tinker native flow.
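
For context, the conditional export mentioned above is the standard optional-dependency pattern. A minimal sketch, assuming a backends `__init__.py` and a `tinker_native` module name (both hypothetical; the PR's actual module layout may differ):

```python
# Hypothetical backends __init__.py -- names are illustrative, not from the diff.
from .local import LocalBackend

__all__ = ["LocalBackend"]

try:
    # Only importable when the `tinker` dependency group
    # (fastapi/uvicorn/tinker/tinker-cookbook) is installed.
    from .tinker_native import TinkerNativeBackend

    __all__.append("TinkerNativeBackend")
except ImportError:
    # Base install: swallow the error so `import art` keeps working
    # without the Tinker extras.
    pass
```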

This is an alternative to #523. That PR tried to combine native Tinker behavior and ART-native behavior in a single backend, which made the code too complex to reason about. Here we split the native behavior into a separate backend. Downside: the constructor arguments are less compatible with LocalBackend's. Upside: we can take advantage of Tinker's functionality more explicitly.
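
To make the tradeoff concrete, here is a rough usage comparison. Everything Tinker-specific below is an assumption: the import path, the parameter names (`base_model`, `renderer_name`), and the model string are illustrative, not the PR's actual signature.

```python
import art
from art.backends import TinkerNativeBackend  # hypothetical import path

# ART-native flow: LocalBackend configuration stays as before.
local = art.LocalBackend()

# Native Tinker flow: the backend takes Tinker-specific arguments
# (names below are invented for illustration), so configs don't
# transfer 1:1 from LocalBackend -- the "downside" noted above.
tinker = TinkerNativeBackend(
    base_model="Qwen/Qwen3-8B",  # assumed parameter name
    renderer_name="qwen3",       # drives renderer-based data conversion
)
```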

Experimental: not yet tested with multi-turn rollouts or tool calls, but the yes-no-maybe flow converges (avg reward ~0.955 by step 4).

Test plan

  • uv run pytest tests/integration/test_tinker_native_backend.py -v -s
  • Manual yes-no-maybe style loop (16 rollouts/prompt, converged by step 4).

Cursor Bot added 3 commits January 23, 2026 07:57
  • Separate native Tinker training/inference from LocalBackend to keep the API clear while enabling explicit loss/checkpoint behavior and config.
  • Align tinker native types with OpenAI tooling and update tests to avoid invalid type expressions under pyright.
  • Use merge_state for backend persistence to avoid clobbering model state, and fail fast on trajectories without Choice objects to prevent no-op training (see the guard sketch below). Expose policy version fields on trajectories for off-policy tracking.
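
A minimal sketch of the fail-fast guard from the last commit, assuming ART's `Trajectory.messages_and_choices` shape; the function name and error message are invented, not the code from the diff:

```python
from openai.types.chat.chat_completion import Choice

import art


def ensure_has_choices(trajectory: art.Trajectory) -> art.Trajectory:
    """Hypothetical guard illustrating the fail-fast commit above.

    A trajectory whose assistant turns are plain message dicts (rather
    than Choice objects carrying logprobs/token ids) gives the trainer
    nothing to compute a loss on, so training would silently be a no-op.
    """
    if not any(
        isinstance(item, Choice) for item in trajectory.messages_and_choices
    ):
        raise ValueError(
            "Trajectory has no Choice objects; refusing to run a "
            "no-op training step."
        )
    return trajectory
```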