1.3.0.dev20260603004028
Installation
Via PyPI
pip install pjrt-plugin-tt==1.3.0.dev20260603004028 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
pip install vllm-tt==1.3.0.dev20260603004028 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
Via Docker
docker pull ghcr.io/tenstorrent/tt-xla-slim:1.3.0.dev20260603004028
What's Changed
- Add streaming inference for DeepSeek-V4-Flash by @sshonTT in #4811
- [vLLM] Improve test diagnostics by enabling basic logs by @mmanzoorTT in #4879
- Uplift third_party/tt_forge_models to 363958eba679bef0cf12fe6ed39e22e917048851 2026-06-02 by @vmilosevic in #5049
- Bump version to 1.3.0 by @vvukomanTT in #4991
- [wheel] Support bundling libtt-alchemist-lib.so into manylinux wheel by @svuckovicTT in #5050
- Set default kv cache dtype to bfp_bf8 by @kdimicTT in #4613
- [vLLM] Pin input shardings in the execution path to match warmup by @sshonTT in #5035
Full Changelog: 1.2.0.dev20260602003545...1.3.0.dev20260603004028