v0.2.3

yogabbagabb released this 12 Jun 22:32

f723d72

What's Changed

Add token accounting for inference requests by @yogabbagabb in #14
Make context-size budget configurable via SYNTHEFY_MAX_ELEMENTS_BUDGET by @sagarwal-atg in #15
Add OpenAI-compatible usage block to inference output by @yogabbagabb in #16
Enable HF download tracking; scrub stale classification + gated-repo docs by @sagarwal-atg in #18
Fix Truss deploy: bundle token_accounting so production /predict stops 500ing by @yogabbagabb in #19
Add benchmark reproduction harness and pin cu128 torch build by @realPohanLi in #21
Release 0.2.3 by @yogabbagabb in #24
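
For readers wiring these changes into a client, the following is a minimal sketch of how the configurable budget (#15) and the usage block (#16) might be consumed. Only the environment variable name `SYNTHEFY_MAX_ELEMENTS_BUDGET` comes from these notes. The helper names, the fallback value, and the validation rules are hypothetical, and the field names follow the standard OpenAI `usage` shape rather than anything confirmed for this project. How the project counts tokens is not stated here.

```python
import os

# Hypothetical fallback; the project's real default is not stated in these notes.
ASSUMED_DEFAULT_BUDGET = 100_000


def read_elements_budget(env=None):
    """Read SYNTHEFY_MAX_ELEMENTS_BUDGET from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    raw = env.get("SYNTHEFY_MAX_ELEMENTS_BUDGET")
    if raw is None or raw.strip() == "":
        # Variable unset or empty: fall back to the assumed default.
        return ASSUMED_DEFAULT_BUDGET
    budget = int(raw)  # raises ValueError on non-integer input
    if budget <= 0:
        raise ValueError("SYNTHEFY_MAX_ELEMENTS_BUDGET must be a positive integer")
    return budget


def build_usage(prompt_tokens, completion_tokens):
    """Build a usage dict with the three fields of the OpenAI usage shape."""
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


if __name__ == "__main__":
    print(read_elements_budget({"SYNTHEFY_MAX_ELEMENTS_BUDGET": "2048"}))
    print(build_usage(120, 30))
```

Passing the environment as a mapping keeps the helper easy to test without mutating `os.environ`.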

New Contributors

@realPohanLi made their first contribution in #21

Full Changelog: v0.2.2...v0.2.3

Contributors

yogabbagabb, sagarwal-atg, and realPohanLi
