
[CoreML EP] Add HardSigmoid support #28182

Merged
yuslepukhin merged 1 commit into microsoft:main from maxwbuckley:coreml-hardsigmoid
Apr 24, 2026

Conversation

@maxwbuckley
Contributor

@maxwbuckley maxwbuckley commented Apr 22, 2026

Description

Adds HardSigmoid to the CoreML Execution Provider's activation op builder. Both MLProgram (sigmoid_hard) and NeuralNetwork (ActivationSigmoidHard) code paths are implemented; the op's ONNX definition matches CoreML MIL's sigmoid_hard exactly, so no decomposition is required.

Adds a dedicated CoreML-EP test (CoreMLExecutionProviderTest.HardSigmoidTest) that builds a single-node HardSigmoid model with non-default alpha/beta and uses RunAndVerifyOutputsWithEP with ExpectedEPNodeAssignment::All to confirm (a) the entire graph is claimed by the CoreML EP in both NN and MLProgram formats, and (b) the output matches the CPU reference. I verified the test is not trivially passing by temporarily unregistering HardSigmoid from the activation builder — the test fails with VerifyEPNodeAssignment emitting a fatal failure, proving it genuinely exercises the CoreML path. (The existing multi-EP test in activation_op_test.cc silently falls back to CPU when an EP rejects the node, so it does not give CoreML coverage on its own.)

Also updates coreml_supported_mlprogram_ops.md.

Motivation and Context

Fixes #28181.

On a DWPose pose-estimation model (dw-ll_ucoco_384.onnx), 4 HardSigmoid ops were each forcing a CoreML → CPU → CoreML round-trip, and also caused downstream ops to be rejected with "unsupported inputs" because their producers had been sent to CPU. Adding HardSigmoid collapses the graph from 5 CoreML subgraphs to 1, and drops inference from 9.22 ms to 6.92 ms (−25%) on Apple Silicon with MLProgram + ComputeUnits=ALL.

@maxwbuckley
Contributor Author

@microsoft-github-policy-service agree

@maxwbuckley
Contributor Author

Amended the branch to add a dedicated CoreML-EP test and correct the earlier claim about test coverage.

My original PR description asserted that the existing TEST_F(ActivationOpTest, HardSigmoid) in onnxruntime/test/providers/cpu/activation/activation_op_test.cc would automatically exercise the CoreML path via TestActivationOp. That turned out to be wrong: when CoreML EP rejects a node, the ORT session silently falls back to CPU and OpTester::Run still sees the correct output, so the test passes regardless of CoreML coverage. I confirmed this empirically by building with my patch reverted — the existing multi-EP test still passed.

The new test in coreml_basic_test.cc uses RunAndVerifyOutputsWithEP with ExpectedEPNodeAssignment::All, which asserts that every graph node is actually assigned to the CoreML EP (not just that the final output matches CPU). Verified that this test genuinely catches the regression: temporarily removing "HardSigmoid" from the activation op builder's op_types vector causes the test to fail with VerifyEPNodeAssignment emitting a fatal failure, as expected. With the patch applied, both NN-format and MLProgram-format sub-cases pass.

This pattern is worth keeping in mind more broadly: the recent Softplus/Elu addition (#26462) also relies on the multi-EP CPU test and may not be catching CoreML-side regressions either.

Contributor

Copilot AI left a comment


Pull request overview

Adds HardSigmoid operator coverage to the CoreML Execution Provider so models using this activation no longer fall back to CPU (avoiding CoreML↔CPU graph breaks) while maintaining output parity with the CPU reference.

Changes:

  • Implement HardSigmoid in CoreML EP activation builder for both MLProgram (sigmoid_hard) and NeuralNetwork (ActivationSigmoidHard) paths, including alpha/beta wiring.
  • Register HardSigmoid in the CoreML op builder factory.
  • Add a dedicated CoreML EP test that verifies full-node assignment and output correctness in both NN and MLProgram formats; update the supported-ops doc list.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Files changed:

  • tools/ci_build/github/apple/coreml_supported_mlprogram_ops.md: Documents ai.onnx:HardSigmoid as supported for MLProgram.
  • onnxruntime/test/providers/coreml/coreml_basic_test.cc: Adds a single-node HardSigmoid model test verifying full CoreML assignment and correct outputs (NN + MLProgram).
  • onnxruntime/core/providers/coreml/builders/op_builder_factory.cc: Registers HardSigmoid with the activation op builder.
  • onnxruntime/core/providers/coreml/builders/impl/activation_op_builder.cc: Implements HardSigmoid conversion for MLProgram and NeuralNetwork model formats and lists it as a supported activation.


@yuslepukhin
Member

The PR may require a rebase onto main once the pipelines are fixed.

Adds `HardSigmoid` to the CoreML Execution Provider's activation op
builder. Both MLProgram (`sigmoid_hard`) and NeuralNetwork
(`ActivationSigmoidHard`) code paths are implemented; the op's ONNX
definition matches CoreML MIL's `sigmoid_hard` exactly, so no
decomposition is required.

Adds a dedicated CoreML-EP test `CoreMLExecutionProviderTest.HardSigmoidTest`
that verifies the entire graph is placed on the CoreML EP (both NN and
MLProgram formats) via `ExpectedEPNodeAssignment::All`, and that the output
matches the CPU reference. The existing multi-EP test in
`activation_op_test.cc` silently falls back to CPU for unsupported-on-EP
ops, so a dedicated test is required to genuinely verify the CoreML path.

Also updates coreml_supported_mlprogram_ops.md.

Fixes microsoft#28181.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yuslepukhin
Member

Force push makes it hard to review the changes.

@maxwbuckley
Contributor Author

Apologies — I force-pushed after rebasing on main and amending. Won't repeat that pattern; I'll stack follow-up commits instead so the review-since-last diff stays usable.

For this round, what changed on top of the original commit (72940ee5ab) is exactly two things:

  1. Dtype gate for HardSigmoid, addressing your inline comment on line 182. New branch in ActivationOpBuilder::IsOpSupportedImpl:

    if (op_type == "HardSigmoid") {
      const auto input_dtype = node.InputDefs()[0]->TypeAsProto()->tensor_type().elem_type();
      if (input_dtype != ONNX_NAMESPACE::TensorProto_DataType_FLOAT &&
          input_dtype != ONNX_NAMESPACE::TensorProto_DataType_FLOAT16) {
        LOGS(logger, VERBOSE) << ...;
        return false;
      }
    }

    Double / bfloat16 inputs now fall back to CPU at GetCapability time instead of silently being narrowed to fp16 via the else branch of AddToModelBuilderImpl. Scoped to HardSigmoid only — LeakyRelu / Elu kept as-is to match your "pre-existing pattern" note; happy to open a separate PR if you want the same fix for those.

  2. Rebase onto current main — no code changes, just a fast-forward over 4265122712.

Range-diff if helpful: git range-diff 72940ee5ab..81c4421ecd against the forked branch.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



Member

@yuslepukhin yuslepukhin left a comment


LGTM

@yuslepukhin yuslepukhin enabled auto-merge (squash) April 23, 2026 22:38
@yuslepukhin
Member

/azp run Win_TRT_Minimal_CUDA_Test_CI, Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@yuslepukhin yuslepukhin merged commit 5dd7f15 into microsoft:main Apr 24, 2026
95 of 100 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature Request] CoreML EP: add HardSigmoid support
