Sync Python-side quantized_softmax schema with C++ kernel (add mask_type and pos args) (#18495)
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18495
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 2 Unrelated Failures as of commit 4127576 with merge base 59838fc.
NEW FAILURES: the following jobs have failed.
BROKEN TRUNK: the following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@mvartani-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98145095.
Force-pushed 97477f0 to ac00fe2
Sync Python-side quantized_softmax schema with C++ kernel (add mask_type and pos args) (pytorch#18495)

Summary: D88997196 added `mask_type` (int) and `pos` (Tensor) parameters to the C++ `cadence::quantized_softmax` kernels and custom_ops.yaml, but missed updating the Python-side op registrations, reference implementations, quantizer fusion pass, and tests. This caused an argument count mismatch at runtime ("Expected 11 args received 9") when running quantized softmax on the Xtensa ISS.

This diff completes the schema sync by updating:
- ops_registrations.py: updated all 4 `lib.define()` schemas and both `register_fake` meta functions to include `int mask_type, Tensor pos` after `dim`.
- ref_implementations.py: added `mask_type` and `pos` params to `quantized_softmax_per_tensor_common`, `quantized_softmax_per_tensor`, and `quantized_softmax`, with an `assert mask_type == 0` guard consistent with the existing `assert mask is None`.
- quantizer/fusion_pass.py: updated `get_args_and_kwargs_softmax` to emit `mask_type=0` (no masking) and a dummy `pos` tensor (`full([1], 0, dtype=int64)`), matching the default behavior for standard softmax quantization.
- tests/test_ref_implementations.py: updated the `test_quantized_softmax_per_tensor` and `test_quantized_softmax` call sites with the new args.

Differential Revision: D98145095
Force-pushed 4509b49 to 3e3a1ed
Force-pushed 3e3a1ed to 2cc1f22
Force-pushed 2cc1f22 to c7c9d70
Force-pushed c7c9d70 to 4127576
Reviewed By: hsharma35
Differential Revision: D98145095