Fused quant linear kernel (#19490)
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19490
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: There is 1 currently active SEV. If your PR is affected, please view it below.
❌ 6 New Failures as of commit c987558 with merge base 8020fe0.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@DrJessop has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103754853.
Summary: Fused quant linear kernel (out = inp @ weight^T + bias) with optional dequantize/quantize. Supports 4 sets of qparams (inp, weight, bias, out), optional bias, and per-tensor/per-channel quantization. Reviewed By: mvartani-meta Differential Revision: D103754853
Force-pushed from c57549a to 1d27ab1.
Summary: Fused quant hardswish kernel with optional dequantize/quantize. Unary op that applies x * min(max(x+3, 0), 6) / 6. Supports per-tensor and per-channel quantization. Reviewed By: mvartani-meta Differential Revision: D103754780
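For reviewers who want a reference to compare against, the hardswish formula in the summary above can be sketched in plain Python. This is only an illustrative per-tensor model, not the kernel's code: the function name `quant_hardswish_reference`, the int8 range, affine quantization of the form `(q - zero_point) * scale`, and Python's round-half-to-even rounding are all assumptions.

```python
from typing import List


def hardswish(x: float) -> float:
    # Formula from the summary: x * min(max(x + 3, 0), 6) / 6
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def quant_hardswish_reference(
    inp: List[int],
    inp_scale: float,
    inp_zp: int,
    out_scale: float,
    out_zp: int,
    qmin: int = -128,  # assumed int8 range
    qmax: int = 127,
) -> List[int]:
    """Hypothetical per-tensor reference: dequantize, apply hardswish, requantize."""
    out = []
    for q in inp:
        x = (q - inp_zp) * inp_scale           # dequantize the input element
        y = hardswish(x)                       # unary op in real-valued space
        q_out = round(y / out_scale) + out_zp  # requantize (rounding mode assumed)
        out.append(max(qmin, min(qmax, q_out)))  # clamp to the quantized range
    return out


print(quant_hardswish_reference([-8, 0, 2, 12], 0.5, 0, 0.5, 0))  # prints [0, 0, 1, 12]
```

A per-channel variant would look up `scale` and `zero_point` by channel index instead of using a single pair.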
Summary: Fused quant batch matrix multiply kernel with optional dequantize/quantize. Binary op on 3D tensors [B,M,K] x [B,K,N] -> [B,M,N]. Supports per-tensor and per-channel quantization. Reviewed By: mvartani-meta Differential Revision: D103754815
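The shape contract in the summary above ([B,M,K] x [B,K,N] -> [B,M,N]) can be illustrated with a small pure-Python reference. This is a sketch under stated assumptions rather than the kernel itself: `quant_bmm_reference` is a hypothetical name, only per-tensor qparams are modeled, and the int8 range and rounding mode are assumed.

```python
from typing import List

Tensor3D = List[List[List[int]]]


def quant_bmm_reference(
    a: Tensor3D,  # [B][M][K] quantized values
    b: Tensor3D,  # [B][K][N] quantized values
    a_scale: float, a_zp: int,
    b_scale: float, b_zp: int,
    out_scale: float, out_zp: int,
    qmin: int = -128,  # assumed int8 range
    qmax: int = 127,
) -> Tensor3D:
    """Hypothetical per-tensor reference: dequantize, batched matmul, requantize."""
    out = []
    for a_mat, b_mat in zip(a, b):  # iterate over the batch dimension B
        n_cols = len(b_mat[0])
        out_mat = []
        for a_row in a_mat:  # M rows
            out_row = []
            for n in range(n_cols):  # N columns
                # Accumulate over K in real-valued space.
                acc = sum(
                    (a_q - a_zp) * a_scale * (b_mat[k][n] - b_zp) * b_scale
                    for k, a_q in enumerate(a_row)
                )
                q = round(acc / out_scale) + out_zp
                out_row.append(max(qmin, min(qmax, q)))
            out_mat.append(out_row)
        out.append(out_mat)
    return out


a = [[[2, 4]], [[0, 2]]]                     # B=2, M=1, K=2
b = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]     # B=2, K=2, N=2
print(quant_bmm_reference(a, b, 0.5, 0, 1.0, 0, 1.0, 0))  # prints [[[7, 10]], [[0, 1]]]
```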
Force-pushed from 1d27ab1 to c987558.
Summary:
Fused quant linear kernel (out = inp @ weight^T + bias) with optional dequantize/quantize. Supports 4 sets of qparams (inp, weight, bias, out), optional bias, and per-tensor/per-channel quantization.
Reviewed By: mvartani-meta
Differential Revision: D103754853
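To make the semantics in the summary concrete, here is a minimal pure-Python reference for out = inp @ weight^T + bias with four sets of qparams and an optional bias. It is a sketch, not the kernel's implementation. The following are assumptions: the name `quant_linear_reference`, the `(scales, zero_points)` qparams layout, per-tensor input qparams, per-output-channel indexing for weight, bias, and output, the int8 range, and round-half-to-even rounding.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical qparams layout: (scales, zero_points).
# Length-1 sequences mean per-tensor; longer ones are indexed per output channel.
QParams = Tuple[Sequence[float], Sequence[int]]


def _pick(values, channel: int):
    """Return the per-tensor value, or the entry for this channel."""
    return values[0] if len(values) == 1 else values[channel]


def dequantize(q: int, qp: QParams, channel: int = 0) -> float:
    return (q - _pick(qp[1], channel)) * _pick(qp[0], channel)


def quantize(x: float, qp: QParams, channel: int = 0,
             qmin: int = -128, qmax: int = 127) -> int:
    q = round(x / _pick(qp[0], channel)) + _pick(qp[1], channel)
    return max(qmin, min(qmax, q))  # clamp to the assumed int8 range


def quant_linear_reference(
    inp: List[List[int]],        # [M][K]
    weight: List[List[int]],     # [N][K], so the product is inp @ weight^T
    bias: Optional[List[int]],   # [N] or None
    inp_qp: QParams,
    weight_qp: QParams,
    bias_qp: Optional[QParams],
    out_qp: QParams,
) -> List[List[int]]:
    out = []
    for row in inp:
        out_row = []
        for n, w_row in enumerate(weight):
            acc = 0.0
            for x_q, w_q in zip(row, w_row):
                acc += dequantize(x_q, inp_qp) * dequantize(w_q, weight_qp, n)
            if bias is not None:  # bias is optional
                acc += dequantize(bias[n], bias_qp, n)
            out_row.append(quantize(acc, out_qp, n))
        out.append(out_row)
    return out


inp = [[2, 4]]                       # dequantizes to [1.0, 2.0]
weight = [[1, 1], [2, 0]]            # per-channel scales below
weight_qp = ([1.0, 0.5], [0, 0])
bias = [4, 8]                        # dequantizes to [1.0, 2.0]
print(quant_linear_reference(inp, weight, bias, ([0.5], [0]), weight_qp,
                             ([0.25], [0]), ([0.5], [0])))  # prints [[8, 6]]
```

Passing `bias=None` (and `bias_qp=None`) skips the bias term. A reference like this is mainly useful for checking a fused kernel's output within a tolerance of one quantization step, since rounding behavior may differ.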