v0.2.3
What's Changed
- Add token accounting for inference requests by @yogabbagabb in #14
- Make context-size budget configurable via SYNTHEFY_MAX_ELEMENTS_BUDGET by @sagarwal-atg in #15
- Add OpenAI-compatible usage block to inference output by @yogabbagabb in #16
- Enable HF download tracking; scrub stale classification + gated-repo docs by @sagarwal-atg in #18
- Fix Truss deploy: bundle token_accounting so production /predict stops returning HTTP 500 errors by @yogabbagabb in #19
- Add benchmark reproduction harness and pin cu128 torch build by @realPohanLi in #21
- Release 0.2.3 by @yogabbagabb in #24
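
As a rough illustration of the configurable budget from #15, the sketch below reads SYNTHEFY_MAX_ELEMENTS_BUDGET from the environment. Only the variable name comes from these notes. The helper name `read_max_elements_budget`, the fallback value, and the positive-integer validation are assumptions for illustration and may not match how the package actually parses the setting.

```python
import os

# Hypothetical fallback: the project's real default is not stated in these notes.
ASSUMED_DEFAULT_BUDGET = 4096


def read_max_elements_budget(env=None):
    """Return the budget from SYNTHEFY_MAX_ELEMENTS_BUDGET as a positive int.

    `env` is any mapping of variable names to strings; it defaults to
    os.environ so the function can be exercised without touching the
    real process environment.
    """
    env = os.environ if env is None else env
    raw = env.get("SYNTHEFY_MAX_ELEMENTS_BUDGET")
    if raw is None or raw.strip() == "":
        # Variable unset or empty: fall back to the assumed default.
        return ASSUMED_DEFAULT_BUDGET
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("SYNTHEFY_MAX_ELEMENTS_BUDGET must be a positive integer")
    return value
```

In a shell, the variable would typically be exported before the service starts, for example `export SYNTHEFY_MAX_ELEMENTS_BUDGET=8192`.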
New Contributors
- @realPohanLi made their first contribution in #21
Full Changelog: v0.2.2...v0.2.3