Fix nightly operator test failures across nxp_rt600 and DLA_V130 backends #18951
Summary:
Fix 6 categories of nightly operator test failures by addressing FACTO test generation constraints and kernel bugs:

**FACTO constraint fixes (facto_util.py):**
- div.Tensor_mode: Remove int64 from dtype constraints. nxp_rt600 lacks native int64 support, causing off-by-1 rounding errors via the _to_copy fallback.
- permute_copy.default: Restrict dtypes to float32/int32. int8/uint8 cause ISS crashes since xa_nn_transpose doesn't handle sub-word integer types.
- pow.Tensor_Scalar: Add Value.Ge(0) constraint. Negative inputs produce NaN via negative^fractional, which DSP backends don't implement per IEEE 754.

**Kernel bug fixes (op_add.cpp, op_sub.cpp):**
- Fix || vs && logic error in broadcast type dispatch that caused int32 data to be reinterpreted as float32, producing garbage output on DLA_V130.
- Add missing broadcast dispatch cases for Int+Int, Long+Long, Int+Long, Long+Int.
- Change static_cast<float> to static_cast<double> to avoid precision loss for large int32 values.

**HiFi kernel guard fixes:**
- op_where.cpp: Disable optimized nnlib path when the condition tensor needs broadcasting. xa_nn_elm_select_broadcast_4D only computes strides for inp1/inp2, not the condition tensor.
- op_permute_copy.cpp: Disable xa_nn_transpose_32_32 for Float. The nnlib function crashes the ISS for certain tensor shapes; fall back to the correct generic implementation.
- op_softmax.cpp: Disable optimized nnlib path when the softmax dim is not the last dimension. The permuted path allocates temp memory exceeding the budget on resource-constrained targets.

Differential Revision: D100873709
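The kernel bug fixes above can be illustrated with a small standalone sketch. This is not the code from op_add.cpp or op_sub.cpp: the `DType` enum and every function name below are hypothetical, and the exact shape of the original dispatch condition is an assumption. The sketch only shows the class of bug (an `||` where `&&` was intended lets a mixed-type pair reach a float-only path) and why widening through `double` rather than `float` preserves large int32 values.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the runtime's scalar type enum (illustration only).
enum class DType { Float, Int, Long };

// Assumed shape of the bug: with ||, a mixed Int/Float pair is routed to the
// float-only path, where an int32 buffer would be read as float32.
inline bool float_path_buggy(DType a, DType b) {
  return a == DType::Float || b == DType::Float;
}

// With &&, only Float/Float pairs take the float-only path.
inline bool float_path_fixed(DType a, DType b) {
  return a == DType::Float && b == DType::Float;
}

// What "reinterpreted as float32" means: the same 32 bits, read as a float.
inline float bits_as_float(std::int32_t v) {
  static_assert(sizeof(float) == sizeof(std::int32_t), "expects 32-bit float");
  float f;
  std::memcpy(&f, &v, sizeof(f));
  return f;
}

// float has a 24-bit significand, so not every int32 above 2^24 survives.
// Illustration only: inputs close to INT32_MAX would round out of int32 range.
inline std::int32_t roundtrip_via_float(std::int32_t v) {
  return static_cast<std::int32_t>(static_cast<float>(v));
}

// double has a 53-bit significand, so every int32 value survives.
inline std::int32_t roundtrip_via_double(std::int32_t v) {
  return static_cast<std::int32_t>(static_cast<double>(v));
}
```

For instance, `roundtrip_via_float(16777217)` yields 16777216 (2^24 + 1 is not representable in float32), while `roundtrip_via_double(16777217)` returns the input unchanged.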
🔗 Helpful Links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18951

✅ You can merge normally! (3 Unrelated Failures)
As of commit 450cdbf with merge base a489707:
- FLAKY: a job failed, likely due to flakiness present on trunk.
- BROKEN TRUNK: jobs failed that were also failing on the merge base. Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100873709.