
Add MLX op handler for aten.isinf #18936

Open

Ai-chan-0411 wants to merge 2 commits into pytorch:main from Ai-chan-0411:fix/mlx-isinf-handler

Conversation

@Ai-chan-0411

Summary

While looking at the MLX backend coverage for numerical-stability ops, I noticed aten.isinf was missing — any model using torch.isinf would silently fall back to CPU, breaking the GPU acceleration pipeline.

This PR adds a decomposed handler that expresses isinf(x) as abs(x) == inf, reusing the existing AbsNode and EqualNode infrastructure. Both positive and negative infinity are correctly detected through the abs step.
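As a rough illustration of the decomposition (a plain-PyTorch sketch of the handler's semantics, not the actual MLX node graph), the rewrite is equivalent to:

```python
import torch

def isinf_via_abs_eq(x: torch.Tensor) -> torch.Tensor:
    # abs() folds -inf onto +inf, so a single equality comparison
    # against +inf detects both signs of infinity.
    # NaN compares unequal to everything, so it correctly yields False.
    return torch.abs(x) == float("inf")
```

On finite, infinite, and NaN inputs this produces the same boolean mask as `torch.isinf`.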

Changes:

  • backends/mlx/ops.py — new _isinf_handler registered for torch.ops.aten.isinf.default
  • backends/mlx/test/test_ops.py — added _inf_input_fn (generates tensors with scattered ±inf values) and an isinf entry in _UNARY_OP_TESTS

Closes #18922

Decompose isinf(x) into abs(x) == inf using existing AbsNode and
EqualNode, so the op runs on the Metal GPU via MLX instead of falling
back to CPU execution.

Closes pytorch#18922

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pytorch-bot

pytorch-bot bot commented Apr 16, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18936

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

⚠️ 12 Awaiting Approval

As of commit 7647a7e with merge base ec8d70b:

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla

meta-cla bot commented Apr 16, 2026

Hi @Ai-chan-0411!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Comment thread backends/mlx/test/test_ops.py

"""Return a callable(shape, dtype) that generates inputs with some inf values."""

def fn(shape, dtype):
    x = torch.randn(shape, dtype=dtype)
Contributor


Can we add some nans to this generated test input as well?

Author


Done — added NaN values to _inf_input_fn. The generated inputs now include float("nan") alongside inf/-inf, so the test verifies that isinf correctly returns False for NaN. Pushed in 7647a7e.
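For reference, a self-contained sketch of what such an input factory might look like (the function name, scatter positions, and shapes here are illustrative, not the exact code in the PR):

```python
import torch

def make_inf_nan_input_fn():
    """Hypothetical factory: returns a callable(shape, dtype) whose
    outputs mix inf, -inf, and nan into otherwise random data."""
    def fn(shape, dtype):
        x = torch.randn(shape, dtype=dtype)
        flat = x.flatten()
        n = flat.numel()
        # Scatter special values at fixed positions (assumes n >= 3),
        # so isinf must return True exactly twice and False for NaN.
        flat[0] = float("inf")
        flat[n // 2] = float("-inf")
        flat[n - 1] = float("nan")
        return flat.reshape(shape)
    return fn
```

A test driver comparing the MLX result against `torch.isinf` on such inputs then exercises both infinity signs and the NaN edge case in one pass.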

Comment thread backends/mlx/ops.py
    return out


@REGISTRY.register(target=[torch.ops.aten.isinf.default])
Contributor


Are there any other variants of this torch op? E.g., isinf.Tensor, etc.

There may not be, just curious.

Author


I checked the PyTorch ATen operator registry and the executorch codebase — the only variant is aten.isinf.default. There is no isinf.Tensor or isinf.out overload defined. The other backends (MPS, Qualcomm) also only register aten.isinf.default, so this handler covers the complete surface.
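One way to verify this claim locally, using PyTorch's standard overload introspection (not part of this PR):

```python
import torch

# OpOverloadPacket.overloads() lists the registered overload names
# for an ATen op; for aten.isinf this is expected to contain only
# "default" (no .Tensor or .out variants).
print(torch.ops.aten.isinf.overloads())
```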

@metascroy
Contributor

Looks great @Ai-chan-0411! Can we add nans to the test case (to make sure isinf behaves correctly when its input is nan)?

Include NaN in the generated test data alongside inf/-inf to ensure
isinf correctly returns False for NaN inputs.
@Ai-chan-0411
Author

Thanks for the review! You're right about adding NaN test cases. Since isinf returns False for NaNs (they're neither inf nor -inf), we should include them in the generated test inputs to ensure that edge case is properly covered. Let me update _inf_input_fn to mix in some NaN values alongside the current ±inf cases.

@Ai-chan-0411
Author

Good question about variants. In PyTorch's aten ops, aten.isinf is the primary op. The .default suffix we register for is the standard overload. There's no separate isinf.Tensor variant — torch.isinf() works on both Tensors and scalars through operator overloading at the higher level, but at the aten level it maps to the single aten.isinf.default op. I'll verify this is the only variant we need to handle.

@Ai-chan-0411
Author

I noticed the MLX workflow run (24492915727) failed, but the failure was due to infrastructure issues — specifically, network connectivity errors when cloning submodules (NVIDIA/cutlass and google/flatbuffers). This is not related to the code changes in this PR. The implementation itself should be sound. A retry of the workflow run should succeed.



Development

Successfully merging this pull request may close these issues.

Good First Issue: Add MLX Op Handler for aten.isinf

3 participants