Releases: togethercomputer/xorl-wheels

TransformerEngine 2.11.0

26 Mar 20:13
ee86957

TransformerEngine 2.11.0+c188b533

Built with:

  • Python 3.12
  • PyTorch 2.10.0
  • CUDA 13.0
  • CUDA architectures: sm_75, sm_80, sm_89, sm_90, sm_100, sm_120
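
Whether a given GPU is covered by this architecture list can be checked before installing. The following is a minimal sketch, assuming the wheel ships only device binaries for the listed targets (no PTX fallback) and that the usual CUDA binary-compatibility rule applies: code built for a target runs on devices with the same major version and an equal or higher minor version. The helper names `parse_arch` and `is_arch_covered` are hypothetical and not part of TransformerEngine.

```python
# Architectures listed in the release notes for the 2.11.0 wheel.
BUILT_ARCHS = ["sm_75", "sm_80", "sm_89", "sm_90", "sm_100", "sm_120"]


def parse_arch(arch: str) -> tuple[int, int]:
    """Turn 'sm_89' into (8, 9) and 'sm_100' into (10, 0)."""
    digits = arch.removeprefix("sm_")
    # The last digit is the minor version; everything before it is the major.
    return int(digits[:-1]), int(digits[-1])


def is_arch_covered(capability: tuple[int, int], built_archs: list[str]) -> bool:
    """Hypothetical helper: True if some built target can run on `capability`.

    Assumption: binary compatibility holds only within a major version
    (same major, built minor <= device minor), with no PTX JIT fallback.
    """
    major, minor = capability
    for arch in built_archs:
        built_major, built_minor = parse_arch(arch)
        if built_major == major and built_minor <= minor:
            return True
    return False
```

On a machine with PyTorch installed, the capability tuple could come from `torch.cuda.get_device_capability()`, for example `is_arch_covered(torch.cuda.get_device_capability(), BUILT_ARCHS)`.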

TransformerEngine 2.10.0

26 Mar 21:40
ee86957

TransformerEngine 2.10.0+769ed778

Built with:

  • Python 3.12
  • PyTorch 2.10.0
  • CUDA 12.9
  • CUDA architectures: sm_70, sm_80, sm_89, sm_90, sm_100, sm_120

TileLang 0.1.10 + #2303 (CUDA 13.1, PyTorch 2.10)

31 May 20:55
ee86957

TileLang built from stock upstream tile-ai/tilelang at commit a8d93798 (post-v0.1.10). This commit includes PR #2303 ([CUDA] Support preferred copy instruction lowering, d0937562), which adds the T.copy(..., prefer_instruction="tma") API. The released PyPI tilelang==0.1.10 predates #2303 and lacks that API.

  • Source: tile-ai/tilelang@a8d93798 (unmodified upstream)
  • Build: CUDA 13.1, for PyTorch 2.10 (cu129 runtime); cp38-abi3 → installs on CPython ≥3.10
  • Why hosted: consumed by xorl's FlashQLA GDN backend (XORL_GDN_BACKEND=flashqla). FlashQLA needs both the tl_gemm builtin (in stock ≥0.1.10) and prefer_instruction="tma" (#2303). The fast gemm_v1 path is re-added in-repo by xorl's tilelang_gemm_v1 shim (no tilelang source fork); this wheel is plain upstream.

Replace with PyPI tilelang>=0.1.11 once a release carrying #2303 ships.
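
The switch-over condition above can be checked programmatically. This is a minimal sketch, assuming that a PyPI release numbered 0.1.11 or later carries #2303, which is what the note anticipates but has not yet confirmed. The helper names are hypothetical; only `importlib.metadata` from the standard library is used. The parser is deliberately simple and treats pre-release strings such as `0.1.11rc1` as not meeting the threshold.

```python
from importlib import metadata
from typing import Optional


def release_tuple(version: str) -> tuple[int, ...]:
    """Parse the leading numeric release segments, e.g. '0.1.10+abc' -> (0, 1, 10)."""
    public = version.split("+", 1)[0]  # drop any local version suffix
    parts = []
    for piece in public.split("."):
        if not piece.isdigit():
            break  # stop at non-numeric segments such as 'dev0' or '11rc1'
        parts.append(int(piece))
    return tuple(parts)


def pypi_release_suffices(version: str, minimum: tuple[int, ...] = (0, 1, 11)) -> bool:
    """Hypothetical check: does `version` meet the tilelang>=0.1.11 threshold?"""
    return release_tuple(version) >= minimum


def installed_tilelang_version() -> Optional[str]:
    """Return the installed tilelang version string, or None if it is not installed."""
    try:
        return metadata.version("tilelang")
    except metadata.PackageNotFoundError:
        return None
```

Note that this only compares version numbers. It cannot tell whether a build reporting 0.1.10 is the hosted wheel (which includes #2303) or the PyPI release (which does not).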

Mamba SSM 2.3.1 + Causal Conv1d 1.6.1 (CUDA 12.9, PyTorch 2.10)

10 Apr 22:28
ee86957

Pre-compiled wheels for mamba-ssm 2.3.1 and causal-conv1d 1.6.1.

FlashAttention 3.0.0b1

26 Mar 20:13
ee86957

FlashAttention-3 v3.0.0b1

  • Requires CUDA 12.3+
  • Python 3.9+ (stable ABI)
  • Hopper (SM90) optimized

DeepGEMM 2.3.0

26 Mar 20:12
ee86957

DeepGEMM wheel for CUDA 13, PyTorch 2.10, Python 3.12

DeepEP 1.2.1 (CUDA 13.0)

27 Mar 04:45
ee86957

DeepEP 1.2.1+567632d

Built with:

  • Python 3.12
  • PyTorch 2.11.0
  • CUDA 13.0

DeepEP 1.2.1

26 Mar 20:12
ee86957

Pre-built DeepEP wheel (commit 567632d, torch 2.9, CUDA 12, Python 3.12)
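
Since two DeepEP 1.2.1 builds are listed, one way to choose between them is to match the runtime against the stated build environments. The following is a rough sketch under the assumption that agreement on the Python minor version, the PyTorch major.minor version, and the CUDA major version is enough for compatibility; the release notes do not state this. The function names are hypothetical, and the dictionary keys simply reuse the release titles above.

```python
from typing import Optional

# Build environments as stated in the two DeepEP 1.2.1 release entries.
DEEPEP_BUILDS = {
    "DeepEP 1.2.1 (CUDA 13.0)": {"python": (3, 12), "torch": (2, 11), "cuda_major": 13},
    "DeepEP 1.2.1": {"python": (3, 12), "torch": (2, 9), "cuda_major": 12},
}


def major_minor(version: str) -> tuple[int, int]:
    """'2.11.0+cu130' -> (2, 11)."""
    major, minor = version.split("+", 1)[0].split(".")[:2]
    return int(major), int(minor)


def pick_deepep_build(
    python: tuple[int, ...], torch_version: str, cuda_version: str
) -> Optional[str]:
    """Hypothetical selector: return the matching release title, or None."""
    wanted = {
        "python": tuple(python[:2]),
        "torch": major_minor(torch_version),
        "cuda_major": int(cuda_version.split(".")[0]),
    }
    for title, env in DEEPEP_BUILDS.items():
        if env == wanted:
            return title
    return None
```

With PyTorch installed, the inputs would typically be `sys.version_info`, `torch.__version__`, and `torch.version.cuda`.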