v0.0.10
What's Changed
- feat(sglang): test and fix bugs with GQA by @jimmy-evo in #53
- feat(sglang): sync load tokens between TP and PP ranks by @jimmy-evo in #54
- ci(cz): add commitizen to check commit message format by @jimmy-evo in #55
- core: fix pin leak when num_computed_tokens is not None by @wz1qqx in #57
- feat(ssd): pipeline prefetch with dispatcher+worker for cross-batch parallelism by @xiaguan in #58
- fix(sglang): remove pp sync in sglang by @jimmy-evo in #60
- feat(sglang): remove cuda sync, support save async threads by @jimmy-evo in #62
- chore(python): add .pyi type stubs for PyO3 bindings by @jimmy-evo in #64
- docs: add version-bump skill and commitizen format notes by @jimmy-evo in #63
- chore(deps): cleanup and upgrade dependencies by @xiaguan in #65
- feat(server): support multi-device initialization with auto-detection by @jimmy-evo in #67
- feat(ci): add pre-commit hooks for code quality checks by @jimmy-evo in #66
- chore: bump version to 0.0.10 by @jimmy-evo in #68
Full Changelog: v0.0.9...v0.0.10