
Fix nightly operator test failures across nxp_rt600 and DLA_V130 backends #18951

Merged
meta-codesync[bot] merged 1 commit into pytorch:main from ethansfng:export-D100873709 on Apr 17, 2026

Conversation

@ethansfng
Contributor

Summary:
Fix 6 categories of nightly operator test failures by addressing FACTO test generation constraints and kernel bugs:

FACTO constraint fixes (facto_util.py):

  • div.Tensor_mode: Remove int64 from dtype constraints — nxp_rt600 lacks native int64 support, causing off-by-1 rounding errors via _to_copy fallback
  • permute_copy.default: Restrict dtypes to float32/int32 — int8/uint8 cause ISS crashes since xa_nn_transpose doesn't handle sub-word integer types
  • pow.Tensor_Scalar: Add Value.Ge(0) constraint — a negative base raised to a fractional exponent yields NaN per IEEE 754, a special case the DSP backends don't implement
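
To see why the pow constraint matters, here is a minimal, self-contained C++ check (illustrative only, not code from this PR): per IEEE 754 / C99 pow semantics, a finite negative base with a non-integer exponent is a domain error, so the reference implementation returns NaN while a DSP path that skips this special case returns something else and the comparison fails.

```cpp
#include <cmath>
#include <cstdio>

int main() {
  // IEEE 754 / C99 pow semantics: a finite negative base with a
  // non-integer exponent is a domain error and yields NaN.
  float r = std::pow(-2.0f, 0.5f);
  std::printf("pow(-2, 0.5) = %f, isnan = %d\n", r, (int)std::isnan(r));
  // Constraining generated inputs to base >= 0 (Value.Ge(0)) keeps the
  // test cases inside the domain both implementations agree on.
  return 0;
}
```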

Kernel bug fixes (op_add.cpp, op_sub.cpp):

  • Fix || vs && logic error in broadcast type dispatch that caused int32 data to be reinterpreted as float32, producing garbage output on DLA_V130
  • Add missing broadcast dispatch cases for Int+Int, Long+Long, Int+Long, Long+Int
  • Change static_cast<float> to static_cast<double> to avoid precision loss for large int32 values
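
A simplified reconstruction of the dispatch bug (names are illustrative, not the actual op_add.cpp code): with `||`, a mixed int32/float32 pair satisfies the "float" check, so the int32 tensor's raw buffer is reinterpreted as float32.

```cpp
#include <iostream>

// Simplified reconstruction of the dispatch bug (illustrative only).
enum class ScalarType { Int, Long, Float };

// Buggy: a mixed Int/Float pair passes this check, so the int32
// tensor's buffer gets reinterpreted as float32 -> garbage output.
bool both_float_buggy(ScalarType a, ScalarType b) {
  return a == ScalarType::Float || b == ScalarType::Float;
}

// Fixed: take the all-float fast path only when both inputs are float;
// Int+Int, Long+Long, Int+Long, Long+Int get their own dispatch cases.
bool both_float_fixed(ScalarType a, ScalarType b) {
  return a == ScalarType::Float && b == ScalarType::Float;
}

int main() {
  std::cout << both_float_buggy(ScalarType::Int, ScalarType::Float)   // 1 (wrong)
            << both_float_fixed(ScalarType::Int, ScalarType::Float);  // 0
}
```

The related change from static_cast<float> to static_cast<double> matters because float's 24-bit significand cannot represent every int32 exactly (integers above 2^24 may round), while double's 53-bit significand can hold any int32 value.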

HiFi kernel guard fixes:

  • op_where.cpp: Disable optimized nnlib path when condition tensor needs broadcasting — xa_nn_elm_select_broadcast_4D only computes strides for inp1/inp2, not the condition tensor
  • op_permute_copy.cpp: Disable xa_nn_transpose_32_32 for Float — the nnlib function crashes the ISS for certain tensor shapes; fall back to correct generic implementation
  • op_softmax.cpp: Disable optimized nnlib path when softmax dim is not the last dimension — the permuted path allocates temp memory exceeding the budget on resource-constrained targets
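
All three guards follow the same pattern: test the optimized path's preconditions up front and fall back to the portable generic kernel when they do not hold. A schematic sketch with placeholder predicate names (these helpers are illustrative, not the actual ExecuTorch or nnlib API):

```cpp
// Schematic of the guard pattern (placeholder names, not the real
// ExecuTorch/nnlib signatures).

// op_where: xa_nn_elm_select_broadcast_4D only computes broadcast
// strides for inp1/inp2, so a broadcasting condition must go generic.
bool can_use_nnlib_where(bool cond_needs_broadcast) {
  return !cond_needs_broadcast;
}

// op_permute_copy: xa_nn_transpose_32_32 crashes the ISS for certain
// float tensor shapes, so Float always takes the generic path.
bool can_use_nnlib_transpose(bool is_float) {
  return !is_float;
}

// op_softmax: the permuted nnlib path allocates temporary memory that
// can exceed the budget on resource-constrained targets, so only take
// it when the softmax dim is already the last dimension.
bool can_use_nnlib_softmax(int dim, int last_dim) {
  return dim == last_dim;
}
```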

Differential Revision: D100873709

@pytorch-bot

pytorch-bot Bot commented Apr 16, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18951

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (3 Unrelated Failures)

As of commit 450cdbf with merge base a489707:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed, but the failures were already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label (managed by the Facebook bot; authors must sign the CLA before a PR can be reviewed) on Apr 16, 2026
@meta-codesync
Contributor

meta-codesync Bot commented Apr 16, 2026

@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100873709.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-codesync bot merged commit 4618b80 into pytorch:main on Apr 17, 2026
161 of 173 checks passed

Labels

CLA Signed, fb-exported, meta-exported
