[CoreML EP] Implement Unary & Reduce operators #15532
skottmckay merged 49 commits into microsoft:main
## Description Implements support for LeakyReLU in ActivationOpBuilder for CoreML's EP. This speeds up inference on macOS significantly for models using LeakyReLU.
…duceMean ## Description Implements support for mentioned operators in ActivationOpBuilder for CoreML's EP.
edgchen1
left a comment
Thanks for your contribution!
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/unary_op_builder.cc
|
Thanks for your review, @edgchen1! I've followed up on all of your comments and suggestions. Please let me know how it looks now. |
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
onnxruntime/core/providers/coreml/builders/impl/reduction_op_builder.cc
|
/azp run MacOS CI Pipeline |
|
Azure Pipelines successfully started running 1 pipeline(s). |
…uilder.cc Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
…uilder.cc Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
|
Hi @ShukantPal, thank you for this contribution! We're interested in learning more about your use case. Can you tell us a little bit about how you are using ONNX Runtime? Or if you would prefer you can reach me at nakersha@microsoft.com |
|
Hi @natke, I'm developing a goofy macOS virtual camera that uses different video filters like FaceMesh, CenterFace, DFL, etc. To get real-time frame rates, executing on CoreML / ANE is necessary on M1 MacBooks. I wanted to keep my application cross-platform so it can run on Intel Macs (with discrete GPUs) and Windows in the future. That's why I chose ONNX Runtime, but having full CoreML support is still necessary for satisfactory performance on M1. |
|
Sounds pretty awesome. Is it published anywhere? I'd love to see a demo |
|
@natke Haha, not yet, I've been working on it. I can send you a beta build when it's ready :-) |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 10 pipeline(s). |
But I need help understanding why the test_layer_normalization* tests are failing on CoreML/macOS. The ReduceMean op tests pass, but there's still some mismatch with downstream "Reshape" layers in the test models. I don't have this issue when testing the ReduceMean layers in my own models. |
|
Given the testing complications, I've commented out the line registering ReduceMean. Hopefully, the rest of the PR provides enough value for it to be merged :-) |
|
Can you clarify which tests are failing? I pulled your changes, uncommented the ReduceMean line in onnxruntime/core/providers/coreml/builders/op_builder_factory.cc, did a build on a Mac with CoreML enabled, and the unit tests passed. |
|
/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline |
|
/azp run orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 10 pipeline(s). |
|
@skottmckay Here were the failing tests: https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1012534&view=logs&j=07136875-e4b3-5a15-ff50-026d13f57d01&t=b16e8f5d-fb27-5dd3-278d-cb1fb09fb81a They come from the --build_wheel option. |
|
Definitely want to get it added, but I think we need to figure out the root cause. We're calling AddReductionParams the same way for both ReduceMean and ReduceSum, so if there's an issue I would expect it to potentially apply to both operator types, and it's possible that there's simply no test showing the issue for ReduceSum. The failing tests are ONNX test cases. We create a binary called onnx_test_runner that can be used to execute/debug them. The ONNX tests are in ./cmake/external/onnx/onnx/backend/test/data/node. Specify the EP with '-e'. CPU passes, CoreML fails. I don't think this is the root cause, but the last param for the CoreML params is reduceAll, which might not be 1:1 with noop_with_empty_axes. https://apple.github.io/coremltools/mlmodel/Format/NeuralNetwork.html#reducemeanlayerparams CoreML's reduceAll says to ignore the axes parameter and reduce all dimensions. So if axes is empty, reduceAll is true, but if noop_with_empty_axes is also true the ONNX spec says to do nothing, which doesn't seem to have an equivalent in CoreML. In that case we could probably drop the node in CoreML or insert an Identity node to do the value rename, given it's turned into a no-op. |
|
Sorry, I overlooked the early exit for noop_with_empty_axes. It would be good to clarify the implementation in ReductionOpBuilder::AddToModelBuilderImpl though, and set a reduceAll value instead of passing noop_with_empty_axes into the AddReductionParams call. |
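For readers following the axes / noop_with_empty_axes / reduceAll discussion, here is a minimal C++ sketch of the three cases described above. It is not the code from this PR: `ReduceAction`, `ReducePlan`, and `PlanReduction` are hypothetical names invented for illustration, and the sketch only assumes the ONNX and CoreML semantics as stated in the comments above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not onnxruntime's actual API.
enum class ReduceAction {
  kNoOp,        // empty axes and noop_with_empty_axes=true: output equals input
  kReduceAll,   // empty axes otherwise: would correspond to CoreML reduceAll=true
  kReduceAxes   // explicit axes: reduce only those dimensions
};

struct ReducePlan {
  ReduceAction action;
  std::vector<int64_t> axes;  // only meaningful for kReduceAxes
};

// Decide what to emit for an ONNX Reduce* node, keeping the "no-op" case
// separate from the "reduce everything" case so the two are never conflated.
inline ReducePlan PlanReduction(const std::vector<int64_t>& axes,
                                bool noop_with_empty_axes) {
  if (axes.empty()) {
    if (noop_with_empty_axes) {
      return {ReduceAction::kNoOp, {}};
    }
    return {ReduceAction::kReduceAll, {}};
  }
  return {ReduceAction::kReduceAxes, axes};
}
```

Computing an explicit three-way result like this, rather than forwarding the `noop_with_empty_axes` flag into a parameter that means "reduce all", is one way to make the intent of each branch visible at the call site.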
|
And this is the issue: a rogue space in the attribute name, so it wasn't reading the node's actual value of 0. I'll add a test and create a separate PR for that fix, but please test out your changes with just the fix ("axis " -> "axis") so we can check that all CIs pass with it. |
|
Should be addressed by #16046. |
|
I can confirm that patching the rogue space in Flatten fixes this! Amazing.
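To illustrate why a trailing space in an attribute name can go unnoticed, here is a small self-contained C++ sketch. It assumes a lookup helper that silently falls back to a default when the name is not found; `IntAttributes`, `GetIntAttr`, `FlattenAxisBuggy`, and `FlattenAxisFixed` are hypothetical stand-ins, not onnxruntime's real helpers. The default of 1 matches the ONNX Flatten operator's default `axis`.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a node's integer attributes.
using IntAttributes = std::unordered_map<std::string, int64_t>;

// Returns the attribute value, or default_value when the name is absent.
// A misspelled name is indistinguishable from a missing attribute here.
inline int64_t GetIntAttr(const IntAttributes& attrs, const std::string& name,
                          int64_t default_value) {
  auto it = attrs.find(name);
  return it == attrs.end() ? default_value : it->second;
}

// Buggy lookup: "axis " (trailing space) never matches, so the default wins.
inline int64_t FlattenAxisBuggy(const IntAttributes& attrs) {
  return GetIntAttr(attrs, "axis ", 1);
}

// Fixed lookup: reads the value actually stored on the node.
inline int64_t FlattenAxisFixed(const IntAttributes& attrs) {
  return GetIntAttr(attrs, "axis", 1);
}
```

With a node that sets `axis` to 0, the buggy lookup yields the default 1 while the fixed lookup yields 0, which is consistent with the failure only appearing in test models whose Flatten nodes use a non-default axis.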
|
|
/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline |
|
/azp run orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 10 pipeline(s). |
|
/azp run Windows ARM64 QNN CI Pipeline,Linux QNN CI Pipeline |
|
Azure Pipelines successfully started running 2 pipeline(s). |

Description
This change is a follow-up to #15327. It adds Unary operators (Sqrt, Reciprocal) and Reduce operators (ReduceSum, ReduceMean). I've tried to follow existing patterns in the code :-)
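As a rough illustration of what a unary op builder has to decide, the sketch below maps ONNX op types to a unary operation kind. It is not taken from this PR: `UnaryOperation` and `MapUnaryOp` are hypothetical, and the enum values only echo the SQRT and INVERSE operations listed for the unary function layer in CoreML's NeuralNetwork format.

```cpp
#include <string>

// Local stand-in enum; names echo CoreML's unary function operations,
// but this is not the generated protobuf type.
enum class UnaryOperation { kSqrt, kInverse };

// Returns true and sets `out` when the ONNX op type has a mapping;
// returns false for anything this sketch does not handle.
inline bool MapUnaryOp(const std::string& onnx_op_type, UnaryOperation& out) {
  if (onnx_op_type == "Sqrt") {
    out = UnaryOperation::kSqrt;
    return true;
  }
  if (onnx_op_type == "Reciprocal") {
    out = UnaryOperation::kInverse;  // reciprocal corresponds to an inverse op
    return true;
  }
  return false;
}
```

Returning `false` for unknown op types leaves the caller free to report the node as unsupported so it can fall back to another execution provider.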
Motivation and Context
This reduces fragmentation across EPs when using CoreML on macOS, thereby speeding up execution.