v0.10.3a2

Pre-release

rebel-shshin released this 15 Apr 11:49

· 123 commits to dev since this release

4576272

What's Changed

other(test): improve worker test coverage by @rebel-jinhwan in #517
other(tests): tests for bucketing manager by @huijjj in #521
refactor(scheduler): delay caching instead of undoing by @rebel-jaehwang in #525
fix(encoder): use RBLNClassifierPooler for classification models by @rebel-jonghewk in #526
fix: remove unused util func by @rebel-kblee in #530
fix(config): ensure max_num_batched_tokens >= max_source_positions for enc-dec by @rebel-jonghewk in #527
fix(core): deduplicate KV cache inputs for torch.export compatibility by @rebel-chanheo in #524
fix(core): handle meta tensors in KV cache storage key computation by @rebel-chanheo in #533
fix: skip NUMA cpu affinity on bare metal to prevent 32x latency regression by @rebel-jonghewk in #529
fix(gemma3): use non-negative sentinel for IMG_PAD_TOKEN_ID by @rebel-jonghewk in #528
fix(model): fix moe model attribute by @rebel-kblee in #531
fix: use num_tokens_no_spec in optimum model runner by @rebel-seinpark in #536
fix(encoder): fix T5EncoderModel scoring mismatch after v0.18.1 bump by @rebel-jonghewk in #532
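
The constraint named in #527 (max_num_batched_tokens >= max_source_positions for encoder-decoder models) can be illustrated with a minimal sketch. The `SchedulerLimits` class and `ensure_encoder_fits` helper below are hypothetical names chosen for illustration and are not taken from the project. Raising the batch budget rather than rejecting the config is also an assumption; the actual fix may handle the case differently.

```python
from dataclasses import dataclass


@dataclass
class SchedulerLimits:
    # Hypothetical container; the field names mirror the changelog entry only.
    max_num_batched_tokens: int
    max_source_positions: int


def ensure_encoder_fits(limits: SchedulerLimits) -> SchedulerLimits:
    """Return limits in which one batch can hold a full-length encoder input."""
    if limits.max_num_batched_tokens >= limits.max_source_positions:
        # Constraint already satisfied; nothing to change.
        return limits
    # Assumption: raise the token budget to the encoder's maximum input length.
    return SchedulerLimits(
        max_num_batched_tokens=limits.max_source_positions,
        max_source_positions=limits.max_source_positions,
    )
```

For example, limits of 1024 batched tokens and 1500 source positions would come back with a batch budget of 1500, while limits that already satisfy the constraint are returned unchanged.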

New Contributors

@rebel-chanheo made their first contribution in #524

Full Changelog: v0.10.3a1...v0.10.3a2

Contributors

huijjj, rebel-kblee, and 5 other contributors
