Insights: pytorch/ao
Overview
19 Pull requests merged by 12 people
- Enable {conv3d, conv_transpose3d} + bn fusion in pt2e (#2212, merged May 15, 2025)
- Add CI for Arm Linux (#2211, merged May 15, 2025)
- ROCm mxfp4 Skips (#2209, merged May 14, 2025)
- Add support for KleidiAI int4 kernels on aarch64 Linux (#2169, merged May 14, 2025)
- unbreak CI by fixing MX tests (#2208, merged May 14, 2025)
- Update __init__.py (#2206, merged May 14, 2025)
- Add mx_fp4 path (#2201, merged May 13, 2025)
- Arm_inductor_quantizer for Pt2e quantization (#2139, merged May 13, 2025)
- [float] document e2e training -> inference flow (#2190, merged May 13, 2025)
- Remove sparsity/prototype/blocksparse (#2205, merged May 13, 2025)
- Skips for ROCm (X86 Inductor Tests) (#2202, merged May 13, 2025)
- Add blockwise fp8 gemm benchmarks to README (#2203, merged May 12, 2025)
- Feat: Implementation of the DeepSeek blockwise quantization for fp8 tensors (#1763, merged May 12, 2025)
- Add noindex to 0.10 and 0.9 docs (#2194, merged May 12, 2025)
- Add subclass based method for inference w/ MXFP8 (#2132, merged May 12, 2025)
- unpin torch to unbreak mac tests (#2198, merged May 12, 2025)
- 2:4 activation sparsity packing kernels (#2012, merged May 12, 2025)
- Forward fix lint (#2197, merged May 12, 2025)
- Skip ROCm MoE Quantization (#2191, merged May 12, 2025)
9 Pull requests opened by 9 people
- Update GemLite to support vLLM V1 (#2199, opened May 12, 2025)
- [WIP] Enable Int4WeightOnlyGPTQQuantizer on Intel GPU (#2200, opened May 12, 2025)
- [reland2][ROCm] preshuffled weight mm (#2207, opened May 13, 2025)
- primitive scale fix (#2210, opened May 14, 2025)
- Add selective weight loading decode kernel for activation sparsity (#2213, opened May 15, 2025)
- Fixes MX formats build for blackwell (#2214, opened May 15, 2025)
- Re-land the PR of "Add INT8 SDPA path for CPU" (#2215, opened May 16, 2025)
- Convert Pytest to Unittest for tests under test/dtypes/ (#2216, opened May 16, 2025)
- Update temp_build.py (#2218, opened May 17, 2025)
2 Issues closed by 2 people
- KleidiAI int4 kernels not loading properly on aarch64 Linux (#2143, closed May 16, 2025)
- New test files will likely fail on ROCM (#2204, closed May 13, 2025)
1 Issue opened by 1 person
- Add MXFP casting kernels from triton Repro (#2217, opened May 16, 2025)
15 Unresolved conversations
Conversations sometimes continue on older items that are not yet closed. Below is a list of all Issues and Pull Requests with unresolved conversations.
- Remove preserve_zero and zero_point_domain from choose_qparams_affine (#2149, commented on May 17, 2025 • 13 new comments)
- [PT2E] Fix per-tensor observer issue with varying shape & rank (#2177, commented on May 17, 2025 • 9 new comments)
- [CPU] Add a new layout for int8_dynamic_activation_int4_weight on CPU (#2128, commented on May 16, 2025 • 5 new comments)
- Enhance test_autoquant_compile to support ROCm (#2100, commented on May 14, 2025 • 2 new comments)
- Eval hf models using lm_eval (#2179, commented on May 14, 2025 • 1 new comment)
- Can FP8 GEMM be enabled via module hooks instead of module swapping? (#1887, commented on May 12, 2025 • 0 new comments)
- Dynamo error with large mesh + AdamWFp8 + bf16 stochastic rounding (#2074, commented on May 12, 2025 • 0 new comments)
- [float8] Add support for blockwise fp8 quantization scheme used in DeepSeek v3 (#1594, commented on May 13, 2025 • 0 new comments)
- [QAT] Linear layer's weight quantization granularity can only be per_group (#2189, commented on May 14, 2025 • 0 new comments)
- How does this work with ONNX export and quantization? (#777, commented on May 14, 2025 • 0 new comments)
- [feature request] np.packbits / np.unpackbits, general BitTensors (maybe can be just tensors with dtype torch.bits8 or have a new dtype torch.bits introduced) and bit packed tensors utilities for saving memory / accesses, support for BitTensors wherever BoolTensors are used (#292, commented on May 15, 2025 • 0 new comments)
- [draft] add all_gather_into_tensor (#1737, commented on May 16, 2025 • 0 new comments)
- Fix wrong scale eps applied (#1770, commented on May 13, 2025 • 0 new comments)
- [sparsity] Add PartialLinear module for structured sparsity (#1982, commented on May 15, 2025 • 0 new comments)
- Implement dtensor.shard_dim_alltoall, aten.contiguous, aten.chunk (#2154, commented on May 13, 2025 • 0 new comments)