Insights: pytorch/ao
Overview
19 Pull requests merged by 12 people
- Enable {conv3d, conv_transpose3d} + bn fusion in pt2e (#2212, merged May 15, 2025)
- Add CI for Arm Linux (#2211, merged May 15, 2025)
- ROCm mxfp4 Skips (#2209, merged May 14, 2025)
- Add support for KleidiAI int4 kernels on aarch64 Linux (#2169, merged May 14, 2025)
- unbreak CI by fixing MX tests (#2208, merged May 14, 2025)
- Update __init__.py (#2206, merged May 14, 2025)
- Add mx_fp4 path (#2201, merged May 13, 2025)
- Arm_inductor_quantizer for Pt2e quantization (#2139, merged May 13, 2025)
- [float] document e2e training -> inference flow (#2190, merged May 13, 2025)
- Remove sparsity/prototype/blocksparse (#2205, merged May 13, 2025)
- Skips for ROCm (X86 Inductor Tests) (#2202, merged May 13, 2025)
- Add blockwise fp8 gemm benchmarks to README (#2203, merged May 12, 2025)
- Feat: Implementation of the DeepSeek blockwise quantization for fp8 tensors (#1763, merged May 12, 2025)
- Add noindex to 0.10 and 0.9 docs (#2194, merged May 12, 2025)
- Add subclass based method for inference w/ MXFP8 (#2132, merged May 12, 2025)
- unpin torch to unbreak mac tests (#2198, merged May 12, 2025)
- 2:4 activation sparsity packing kernels (#2012, merged May 12, 2025)
- Forward fix lint (#2197, merged May 12, 2025)
- Skip ROCm MoE Quantization (#2191, merged May 12, 2025)
9 Pull requests opened by 9 people
- Update GemLite to support vLLM V1 (#2199, opened May 12, 2025)
- [WIP] Enable Int4WeightOnlyGPTQQuantizer on Intel GPU (#2200, opened May 12, 2025)
- [reland2][ROCm] preshuffled weight mm (#2207, opened May 13, 2025)
- primitive scale fix (#2210, opened May 14, 2025)
- Add selective weight loading decode kernel for activation sparsity (#2213, opened May 15, 2025)
- Fixes MX formats build for blackwell (#2214, opened May 15, 2025)
- Re-land the PR of "Add INT8 SDPA path for CPU" (#2215, opened May 16, 2025)
- Convert Pytest to Unittest for tests under test/dtypes/ (#2216, opened May 16, 2025)
- Update temp_build.py (#2218, opened May 17, 2025)
2 Issues closed by 2 people
- KleidiAI int4 kernels not loading properly on aarch64 Linux (#2143, closed May 16, 2025)
- New test files will likely fail on ROCM (#2204, closed May 13, 2025)
1 Issue opened by 1 person
- Add MXFP casting kernels from triton Repro (#2217, opened May 16, 2025)
15 Unresolved conversations
Conversations sometimes continue on older items that are not yet closed. Below is a list of all Issues and Pull Requests with unresolved conversations.
- Remove preserve_zero and zero_point_domain from choose_qparams_affine (#2149, commented on May 17, 2025 • 13 new comments)
- [PT2E] Fix per-tensor observer issue with varying shape & rank (#2177, commented on May 17, 2025 • 9 new comments)
- [CPU] Add a new layout for int8_dynamic_activation_int4_weight on CPU (#2128, commented on May 16, 2025 • 5 new comments)
- Enhance test_autoquant_compile to support ROCm (#2100, commented on May 14, 2025 • 2 new comments)
- Eval hf models using lm_eval (#2179, commented on May 14, 2025 • 1 new comment)
- Can FP8 GEMM be enabled via module hooks instead of module swapping? (#1887, commented on May 12, 2025 • 0 new comments)
- Dynamo error with large mesh + AdamWFp8 + bf16 stochastic rounding (#2074, commented on May 12, 2025 • 0 new comments)
- [float8] Add support for blockwise fp8 quantization scheme used in DeepSeek v3 (#1594, commented on May 13, 2025 • 0 new comments)
- [QAT] Linear layer's weight quantization granularity can only be per_group (#2189, commented on May 14, 2025 • 0 new comments)
- How does this work with ONNX export and quantization? (#777, commented on May 14, 2025 • 0 new comments)
- [feature request] np.packbits / np.unpackbits, general BitTensors (maybe can be just tensors with dtype torch.bits8 or have a new dtype torch.bits introduced) and bit packed tensors utilities for saving memory / accesses, support for BitTensors wherever BoolTensors are used (#292, commented on May 15, 2025 • 0 new comments)
- [draft] add all_gather_into_tensor (#1737, commented on May 16, 2025 • 0 new comments)
- Fix wrong scale eps applied (#1770, commented on May 13, 2025 • 0 new comments)
- [sparsity] Add PartialLinear module for structured sparsity (#1982, commented on May 15, 2025 • 0 new comments)
- Implement dtensor.shard_dim_alltoall, aten.contiguous, aten.chunk (#2154, commented on May 13, 2025 • 0 new comments)