Skip to content

[CoreML EP] Implement Unary & Reduce operators#15532

Merged
skottmckay merged 49 commits intomicrosoft:mainfrom
ShukantPal:main
May 24, 2023
Merged

[CoreML EP] Implement Unary & Reduce operators#15532
skottmckay merged 49 commits intomicrosoft:mainfrom
ShukantPal:main

Conversation

@ShukantPal
Copy link
Contributor

Description

This change is a follow-up to #15327. It adds Unary operators (Sqrt, Reciprocal) and Reduce operators (ReduceSum, ReduceMean). I've tried to follow existing patterns in the code :-)

Motivation and Context

This reduces fragmentation across EPs when using CoreML on macOS, thereby speeding up execution.

ShukantPal and others added 7 commits April 2, 2023 15:10
## Description

Implements support for LeakyReLU in ActivationOpBuilder for CoreML's EP. This speeds up inference on macOS significantly for models using LeakyReLU.
…duceMean

## Description

Implements support for mentioned operators in ActivationOpBuilder for CoreML's EP.
Copy link
Contributor

@edgchen1 edgchen1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

@ShukantPal
Copy link
Contributor Author

Thanks for your review @edgchen1 ! I've followed up on all of your comments/suggestions. Please let me know how it looks now.

@ShukantPal ShukantPal requested a review from edgchen1 April 23, 2023 02:43
@edgchen1
Copy link
Contributor

/azp run MacOS CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

ShukantPal and others added 2 commits April 25, 2023 15:48
…uilder.cc

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
…uilder.cc

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
@natke
Copy link
Contributor

natke commented Apr 26, 2023

Hi @ShukantPal, thank you for this contribution! We're interested in learning more about your use case. Can you tell us a little bit about how you are using ONNX Runtime? Or if you would prefer you can reach me at nakersha@microsoft.com

@ShukantPal
Copy link
Contributor Author

Hi @natke, I'm developing a goofy macOS virtual camera that uses different video filters like FaceMesh, CenterFace, DFL, etc. To get real-time frame rates, executing on CoreML / ANE is necessary on M1 MacBooks. I wanted to keep my application cross-platform to run on Intel Macs (with discrete GPUs) and Windows in the future. That's why I chose ONNX runtime, but having full CoreML support is still necessary for satisfactory performance on M1.

@natke
Copy link
Contributor

natke commented Apr 26, 2023

Sounds pretty awesome. Is it published anywhere? I'd love to see a demo

@ShukantPal
Copy link
Contributor Author

@natke Haha, not yet − I've been working on it. Can send you a beta build when ready :-)

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 10 pipeline(s).

@ShukantPal
Copy link
Contributor Author

ShukantPal commented May 20, 2023

  • I ran the linter on Windows and fixed the linting issues.
  • It seems like TensorRT, OpenVINO, and DNN don't support ReduceMean opset 18 with axes input, so I disabled those providers in the new tests (copying that from similar tests for ReduceMean).

But I need help understanding why the test_layer_normalization* tests are failing on CoreML/macOS. The ReduceMean op-tests pass, but still there's some mismatch with downstream "Reshape" layers in the test models.

I don't have this issue when testing the ReduceMean layers in my own models.

@ShukantPal
Copy link
Contributor Author

@skottmckay

Given the testing complications, I've commented out the line registering ReduceMean. Hopefully, rest of the PR provides enough value for it to be merged :-)

@skottmckay
Copy link
Contributor

Can you clarify which tests are failing?

I pulled your changes, uncommented the ReduceMean line in onnxruntime/core/providers/coreml/builders/op_builder_factory.cc and did a build on a mac with CoreML enabled and the unit tests passed.

@skottmckay
Copy link
Contributor

/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline

@skottmckay
Copy link
Contributor

/azp run orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 10 pipeline(s).

@ShukantPal
Copy link
Contributor Author

@skottmckay
Copy link
Contributor

Definitely want to get it added, but I think we need to figure out the root cause given we're calling AddReductionParams the same way for both ReduceMean and ReduceSum, so if there's an issue I would expect it potentially applies to both operator types and it's possibly that there's no test showing the issue for ReduceSum.

The failing tests are ONNX test cases. We create a binary called onnx_test_runner that can be used to execute/debug. The ONNX tests are in ./cmake/external/onnx/onnx/backend/test/data/node. Specify the EP with '-e'. CPU passes, CoreML fails.

e.g. ./build/MacOS/Debug/onnx_test_runner -e coreml ./cmake/external/onnx/onnx/backend/test/data/node/test_layer_normalization_2d_axis0_expanded

I don't think this is the root cause, but the last param for the CoreML parms is reduceAll which might not be 1:1 with noop_with_empty_axes.

https://apple.github.io/coremltools/mlmodel/Format/NeuralNetwork.html#reducemeanlayerparams

CoreML reduceAll says to ignore the axes parameter and reduce all.

So if axes is empty, reduceAll is true, but if noop_with_empty_axes is also true the ONNX spec says do nothing which doesn't seem to have an equivalent in CoreML. In that case we could probably drop the node in CoreML or insert an Identity node to do the value rename given it's turned into a no-op.

@skottmckay
Copy link
Contributor

Sorry - I overlooked the early exit for noop_with_empty_axes. It would be good to clarify the implementation in ReductionOpBuilder::AddToModelBuilderImpl though and set a reduceAll value instead of passing noop_with_empty_axes into the AddReductionParms call.

@skottmckay
Copy link
Contributor

Might be an issue with Flatten. There's not a lot of places in this particular test where this combination of shapes is possible to get wrong. One path is the Flatten of the model input, which should result in (1, 4) not (4, 1)

2023-05-19T06:58:33.0585720Z ERROR: test_layer_normalization_2d_axis1_expanded_cpu

2023-05-19T06:58:33.0593450Z onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Error compiling model compiler error: Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:
2023-05-19T06:58:33.0593910Z (3, 4)
2023-05-19T06:58:33.0594210Z (4, 1)

I ran a couple of tools to a) add shape info to all the nodes, and b) run the 'basic' level optimizations like constant folding so we can see what nodes are around when the CoreML EP is creating the model.

python -m onnxruntime.tools.symbolic_shape_infer --input D:\src\github\ort\cmake\external\onnx\onnx\backend\test\data\node\test_layer_normalization_2d_axis1_expanded\model.onnx --output D:\src\github\ort\cmake\external\onnx\onnx\backend\test\data\node\test_layer_normalization_2d_axis1_expanded\model.ssi.onnx
python -m onnxruntime.tools.optimize_onnx_model --opt_level basic D:\src\github\ort\cmake\external\onnx\onnx\backend\test\data\node\test_layer_normalization_2d_axis1_expanded\model.ssi.onnx D:\src\github\ort\cmake\external\onnx\onnx\backend\test\data\node\test_layer_normalization_2d_axis1_expanded\model.ssi.basic.onnx
image

If I comment out Flatten in op_builder_factory.cc the tests pass.

@skottmckay
Copy link
Contributor

And this is the issue:

const int64_t axis = helper.Get("axis ", 1);

Rogue space in the attribute name so it wasn't reading the actual value from the node of 0.

I'll add a test and create a separate PR for that fix, but please test out your changes with just the fix ("axis " -> "axis") so we can check all CIs pass with that.

@skottmckay
Copy link
Contributor

Should be addressed by #16046.

@ShukantPal
Copy link
Contributor Author

ShukantPal commented May 23, 2023 via email

@skottmckay
Copy link
Contributor

/azp run Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline

@skottmckay
Copy link
Contributor

/azp run orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-python-checks-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 10 pipeline(s).

@skottmckay
Copy link
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Linux QNN CI Pipeline

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@skottmckay skottmckay merged commit f316bc5 into microsoft:main May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants