
Move optimization passes from opt_level=0 to opt_level=1 (#18206)

Merged
meta-codesync[bot] merged 4 commits into main from export-D96766073 on Mar 19, 2026

Conversation

@mcremon-meta (Contributor) commented Mar 16, 2026

Summary:

Many passes in the cadence backend were incorrectly placed at opt_level=0,
with outdated comments claiming that ops like `mm`, `repeat`, `scalar_tensor`, and
`full_like` were 'not supported'. These ops have portable kernel fallbacks, so the
passes are optimizations, not correctness requirements (see the opt_level gating
sketch at the end of this summary).

This diff:

1. Moves 18 passes from opt_level=0 to opt_level=1:
   - replace_ops.py: ReplaceLogicalNotBooleanWhereWithWherePass,
     ReplaceSafeSoftmaxWithSoftmax, ReplaceSqueezeAndUnsqueezeWithViewPass,
     ReplaceFunctionallyEquivalentOpTargets, ReplaceMMWithAddMMPass,
     ReplaceConvolutionOptionalArgsWithConcreteArgsPass, ReplaceRepeatWithCatPass,
     ReplaceScalarTensorWithFullPass, ReplaceFullLikeWithFullPass,
     ReplaceInfArgInFullWithValuePass, ReplaceMatmulWithTransposedMatmulPass
   - remove_ops.py: RemoveCloneOpsTransformImported, RemoveDetachCopyPass,
     RemoveZeroSizedCatArgsPass, RemoveNopExpandOpPass, RemoveToOpsPass,
     RemoveAliasCopyOpPass
   - decompose_ops.py: DecomposeAtenApproxGeluPass
   - simplify_ops.py: SimplifySliceOpPass
2. Updates docstrings to remove incorrect 'not supported' claims and
   clarify these are optimizations with portable fallbacks available.

Reviewed By: ethansfng

Differential Revision: D96766073
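
To make the split concrete, here is a minimal sketch of opt_level-gated pass
selection. The pass names come from the list above, but `PASS_REGISTRY` and
`get_passes` are hypothetical illustrations, not the actual cadence backend API:

```python
# Hypothetical sketch of opt_level-gated pass selection; PASS_REGISTRY and
# get_passes are illustrative, not the real ExecuTorch cadence backend API.

PASS_REGISTRY = [
    # Optimization passes: the ops they rewrite have portable kernel
    # fallbacks, so they only need to run when optimizing (opt_level >= 1).
    ("ReplaceMMWithAddMMPass", 1),
    ("ReplaceRepeatWithCatPass", 1),
    ("ReplaceScalarTensorWithFullPass", 1),
    ("RemoveDetachCopyPass", 1),
    ("DecomposeAtenApproxGeluPass", 1),
    ("SimplifySliceOpPass", 1),
]

def get_passes(opt_level: int) -> list[str]:
    """Return the names of all passes enabled at the requested level."""
    return [name for name, level in PASS_REGISTRY if level <= opt_level]

# At opt_level=0 none of the moved passes run; portable kernels for mm,
# repeat, scalar_tensor, full_like, etc. handle those ops instead.
assert get_passes(0) == []
assert "ReplaceMMWithAddMMPass" in get_passes(1)
```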

@pytorch-bot commented Mar 16, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18206

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 0626c21 with merge base 569cf41:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label on Mar 16, 2026.
meta-codesync bot (Contributor) commented Mar 16, 2026

@mcremon-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96766073.

@github-actions commented:

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@digantdesai (Contributor) left a comment:

Any OSS visible impact from this? Stamping to unblock you.

Summary:
Pull Request resolved: #18239

As titled. Should perform better, and should also allow removing some permutes when convolutions are also moved to channel last (see the sketch after this commit message).

Differential Revision: D96869747

Reviewed By: hsharma35
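
A quick sketch of why those permutes become removable: once the op between two
channels-last convolutions is itself channels-last, the surrounding layout
permutes sit back to back and compose to the identity. The `compose` helper
below is illustrative, not backend code:

```python
# Illustrative dimension-order arithmetic; compose() is a hypothetical
# helper, not part of the cadence backend.

NCHW_TO_NHWC = [0, 2, 3, 1]  # permute emitted after a channels-last conv
NHWC_TO_NCHW = [0, 3, 1, 2]  # permute emitted before a channels-first op

def compose(p: list[int], q: list[int]) -> list[int]:
    """Permutation equivalent to applying p first, then q."""
    return [p[i] for i in q]

# Once the middle op is also channels-last, the two permutes around it
# become adjacent and cancel, so a cleanup pass can delete both.
assert compose(NCHW_TO_NHWC, NHWC_TO_NCHW) == [0, 1, 2, 3]
```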
Summary:
Pull Request resolved: #18240

As titled. Calls into nnlib directly.

Differential Revision: D96874522

Reviewed By: hsharma35

…better (#18256)

Summary:
Pull Request resolved: #18256

As titled. It currently does not clean up as much as it should, and the pass can only handle single-input cases.

Result: permute count drops from 9 to 1 (the minimum by construction) on Wake Gesture; a toy sketch of the merging idea follows this commit message.

Differential Revision: D96940254

Reviewed By: abeakkas
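
A toy sketch of the merging idea, under the assumption that the pass walks
chains of consecutive permutes; the real pass operates on the export graph,
and per the summary it currently handles only single-input cases:

```python
# Toy model of merging a chain of consecutive permutes into at most one;
# the list-of-permutations representation is hypothetical.

def merge_permutes(chain: list[list[int]]) -> list[list[int]]:
    """Collapse consecutive permutes, dropping the chain if it is a no-op."""
    merged = None
    for p in chain:
        merged = p if merged is None else [merged[i] for i in p]
    if merged is None or merged == list(range(len(merged))):
        return []        # composed to the identity: every permute is removable
    return [merged]      # one residual permute, the minimum by construction

# Nine alternating layout permutes collapse to a single one, mirroring
# the 9 -> 1 result quoted above.
chain = [[0, 2, 3, 1], [0, 3, 1, 2]] * 4 + [[0, 2, 3, 1]]
assert merge_permutes(chain) == [[0, 2, 3, 1]]
```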
meta-codesync bot changed the title from 'Move optimization passes from opt_level=0 to opt_level=1' to 'Move optimization passes from opt_level=0 to opt_level=1 (#18206)' on Mar 19, 2026.
meta-codesync bot pushed a commit that referenced this pull request on Mar 19, 2026 (commit message identical to the summary above).
meta-codesync bot pushed a commit that referenced this pull request on Mar 19, 2026 (commit message identical to the summary above).

The merged commit message repeats the summary above, adding 'Pull Request resolved: #18206'.
meta-codesync bot merged commit bf2243a into main on Mar 19, 2026.
142 of 145 checks passed.
meta-codesync bot deleted the export-D96766073 branch on March 19, 2026 at 11:08.
mcremon-meta added a commit that referenced this pull request Mar 19, 2026

Labels

CLA Signed, fb-exported, meta-exported