Releases · togethercomputer/xorl-wheels · GitHub

26 Mar 20:13

qywu

transformer_engine_2.11.0

TransformerEngine 2.11.0

TransformerEngine 2.11.0+c188b533

Built with:

Python 3.12
PyTorch 2.10.0
CUDA 13.0
CUDA architectures: sm_75, sm_80, sm_89, sm_90, sm_100, sm_120
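A quick way to see whether a given GPU is covered by this build is to compare its compute capability against the architecture list above. The sketch below is illustrative only: the helper names `parse_archs` and `is_listed` are hypothetical, it assumes purely numeric `sm_XY` tokens, and it does an exact-match lookup, which is stricter than CUDA's actual compatibility rules.

```python
# Architecture list copied from the TransformerEngine 2.11.0 notes above.
BUILT_ARCHS = "sm_75, sm_80, sm_89, sm_90, sm_100, sm_120"


def parse_archs(spec: str) -> set:
    """Turn 'sm_75, sm_100' into {(7, 5), (10, 0)}.

    Assumes every token is 'sm_' followed by digits only (hypothetical helper).
    """
    archs = set()
    for token in spec.split(","):
        digits = token.strip().removeprefix("sm_")
        # The last digit is the minor version; everything before it is the major.
        archs.add((int(digits[:-1]), int(digits[-1])))
    return archs


def is_listed(capability: tuple, spec: str = BUILT_ARCHS) -> bool:
    """Exact-match check of a (major, minor) capability against the list."""
    return capability in parse_archs(spec)
```

With PyTorch installed, the `(major, minor)` pair for the current device can be obtained from `torch.cuda.get_device_capability()` and passed to `is_listed`.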


26 Mar 21:40

qywu

transformer_engine_2.10.0

TransformerEngine 2.10.0

TransformerEngine 2.10.0+769ed778

Built with:

Python 3.12
PyTorch 2.10.0
CUDA 12.9
CUDA architectures: sm_70, sm_80, sm_89, sm_90, sm_100, sm_120


31 May 20:55

tilelang_0.1.10_cu131

TileLang 0.1.10 + #2303 (CUDA 13.1, PyTorch 2.10)

Latest

TileLang built from stock upstream tile-ai/tilelang at commit a8d93798 (post-v0.1.10), which includes PR #2303 ([CUDA] Support preferred copy instruction lowering, d0937562). That PR adds the T.copy(..., prefer_instruction="tma") API. The released PyPI tilelang==0.1.10 predates #2303 and lacks it.

Source: tile-ai/tilelang@a8d93798 (unmodified upstream)
Build: CUDA 13.1, for PyTorch 2.10 (cu129 runtime); cp38-abi3 → installs on CPython ≥3.10
Why hosted: consumed by xorl's FlashQLA GDN backend (XORL_GDN_BACKEND=flashqla). FlashQLA needs both the tl_gemm builtin (in stock ≥0.1.10) and prefer_instruction="tma" (#2303). The fast gemm_v1 path is re-added in-repo by xorl's tilelang_gemm_v1 shim (no tilelang source fork); this wheel is plain upstream.

Replace with PyPI tilelang>=0.1.11 once a release carrying #2303 ships.
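To decide when the hosted wheel can be dropped, an environment can compare the installed tilelang version against the 0.1.11 threshold mentioned above. This is a minimal sketch under stated assumptions: the helper names are hypothetical, only the leading numeric release is compared (so local suffixes are ignored and pre-release tags are not distinguished), and a version number alone cannot prove that #2303 is present.

```python
import re
from importlib import metadata
from typing import Optional


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '0.1.10+cu131' -> (0, 1, 10)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def pypi_release_suffices(installed: str, minimum: str = "0.1.11") -> bool:
    """True if the installed release is at least the assumed threshold."""
    return release_tuple(installed) >= release_tuple(minimum)


def installed_tilelang_version() -> Optional[str]:
    """Return the installed tilelang version, or None if it is not installed."""
    try:
        return metadata.version("tilelang")
    except metadata.PackageNotFoundError:
        return None
```

For example, `pypi_release_suffices(installed_tilelang_version() or "0")` returns False both when tilelang is missing and when it is older than the threshold.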


10 Apr 22:28

qywu

mamba_ssm_2.3.1_cu129

Mamba SSM 2.3.1 + Causal Conv1d 1.6.1 (CUDA 12.9, PyTorch 2.10)

Pre-compiled wheels for mamba-ssm 2.3.1 and causal-conv1d 1.6.1.

Python 3.12, Linux x86_64
Built against PyTorch 2.10.0+cu129, CUDA 12.9
Source: https://github.com/state-spaces/mamba


26 Mar 20:13

qywu

FlashAttention 3.0.0b1

FlashAttention-3 v3.0.0b1

Requires CUDA 12.3+
Python 3.9+ (stable ABI)
Hopper (SM90) optimized
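The three requirements above can be turned into a small preflight check before installing the wheel. The function below is a hypothetical sketch, not part of FlashAttention: it treats the CUDA and Python minimums as hard errors and, since the notes only say the build is Hopper-optimized, reports a non-SM90 GPU as a warning rather than a failure.

```python
import sys


def flash_attn3_preflight(cuda_version, capability, python_version=None):
    """Return (errors, warnings) for the requirements listed above.

    cuda_version: string such as "12.9"
    capability:   (major, minor) compute capability tuple
    python_version: (major, minor); defaults to the running interpreter
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    errors, warnings = [], []

    # Compare only major.minor of the CUDA version.
    cuda = tuple(int(part) for part in cuda_version.split(".")[:2])
    if cuda < (12, 3):
        errors.append(f"CUDA {cuda_version} is older than 12.3")

    if tuple(python_version) < (3, 9):
        errors.append("Python is older than 3.9")

    # Assumption: other GPUs are flagged, not rejected.
    if tuple(capability) != (9, 0):
        warnings.append("GPU is not SM90 (Hopper)")

    return errors, warnings
```

A caller would typically abort on any entry in `errors` and merely log the entries in `warnings`.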


26 Mar 20:12

qywu

DeepGEMM 2.3.0

DeepGEMM wheel for CUDA 13, PyTorch 2.10, Python 3.12


27 Mar 04:45

qywu

deepep_1.2.1_cu130

DeepEP 1.2.1 (CUDA 13.0)

DeepEP 1.2.1+567632d

Built with:

Python 3.12
PyTorch 2.11.0
CUDA 13.0


26 Mar 20:12

qywu

DeepEP 1.2.1

Pre-built DeepEP wheel (commit 567632d, torch 2.9, CUDA 12, Python 3.12)
