Fix flaky ReplaceTrivialConvWithLinear pass validation tolerance (#18482)
Merged
meta-codesync[bot] merged 1 commit into pytorch:main on Mar 25, 2026
Conversation
Contributor
@hsharma35 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98001101.
DrJessop approved these changes on Mar 25, 2026
Fix flaky ReplaceTrivialConvWithLinear pass validation tolerance (pytorch#18482)
Force-pushed 5c7745a to 3360b25
Summary:
The `test_replace_conv2d_with_linear` and `test_replace_conv1d_with_linear` tests validate that replacing trivial convolutions with linear ops produces
numerically equivalent outputs. Both operations compute the same dot product
(sum of element-wise products), but conv accumulates across spatial dimensions
(C,H,W) while linear accumulates over a flattened K dimension. With K=294
(conv2d: 6*7*7) or K=672 (conv1d: 96*7) fp32 terms, different accumulation
orders produce diffs up to ~1.2e-05 due to non-associativity of floating-point
addition.
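For illustration, here is a minimal sketch (not the actual test code) of the equivalence the pass relies on: a conv2d whose kernel spans the full spatial extent reduces to a linear op on the flattened input, and the two paths typically differ by roughly 1e-6 to 1e-5 in fp32 because they accumulate the same products in different orders. The dimensions mirror the conv2d case above; names such as `out_ch` are illustrative.

```python
import torch

torch.manual_seed(0)
N, C, H, W, out_ch = 4, 6, 7, 7, 8            # K = C*H*W = 294 fp32 terms
x = torch.randn(N, C, H, W)

# Trivial conv: the kernel covers the whole HxW input, so each output element
# is a single dot product over C*H*W terms.
conv = torch.nn.Conv2d(C, out_ch, kernel_size=(H, W))

# Equivalent linear layer built from the same weights, flattened over (C, H, W).
linear = torch.nn.Linear(C * H * W, out_ch)
with torch.no_grad():
    linear.weight.copy_(conv.weight.reshape(out_ch, -1))
    linear.bias.copy_(conv.bias)

y_conv = conv(x).reshape(N, out_ch)
y_lin = linear(x.reshape(N, -1))

# Same math, different fp32 accumulation order: max abs diff is tiny but nonzero.
print((y_conv - y_lin).abs().max())
```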
This is not a correctness issue — the mathematical operation is identical.
Relax rtol from 1e-05 to 2e-05 to accommodate fp32 accumulation order
differences while remaining tight enough to catch real bugs.
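A hedged sketch of what the relaxed comparison amounts to; the test's actual helper and argument names may differ:

```python
import torch

def outputs_match(ref: torch.Tensor, out: torch.Tensor) -> bool:
    # rtol loosened from 1e-05 to 2e-05 to absorb fp32 accumulation-order
    # differences; atol is left at torch.allclose's default.
    # e.g. assert outputs_match(y_conv, y_lin) for the sketch above.
    return torch.allclose(ref, out, rtol=2e-5)
```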
Reviewed By: DrJessop
Differential Revision: D98001101