
Update XNNPACK to latest version #18038

Merged
merged 28 commits into from
Nov 3, 2023
Conversation

skottmckay
Contributor

@skottmckay skottmckay commented Oct 20, 2023

Description

Update XNNPACK to latest version

  • adds fp16 kernels and various other improvements
  • requires pthreadpool update as well

Most code updates in the XNNPACK EP are to adjust to the new XNNPACK API

  • 'setup' is split into 'reshape' and 'setup'
  • some ops use a workspace buffer
    • copied workspace allocation from XNNPACK unit test code
  • some suffixes changed
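The split of 'setup' into 'reshape' and 'setup' can be modeled as a minimal sketch. The types and function names below are illustrative stand-ins for explanation only, not the real XNNPACK C API: reshape binds the current input shape and reports any workspace requirement, and setup then binds I/O pointers plus a caller-allocated workspace.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an XNNPACK-style operator in the new two-phase lifecycle:
// create once, then per shape: Reshape -> Setup -> run.
struct FakeOp {
  size_t elem_count = 0;  // bound by Reshape for the current input shape
};

// "reshape": bind the input shape and report the workspace size the op
// will need for it (may be zero for ops that need no scratch space).
size_t Reshape(FakeOp& op, const std::vector<size_t>& dims) {
  size_t n = 1;
  for (size_t d : dims) n *= d;
  op.elem_count = n;
  return n * sizeof(float);  // pretend one float of scratch per element
}

// "setup": bind I/O pointers plus the caller-allocated workspace.
void Setup(FakeOp&, const float* in, float* out, void* workspace,
           size_t workspace_size) {
  (void)in;
  (void)out;
  // a null workspace is only acceptable when none was requested
  assert(workspace != nullptr || workspace_size == 0);
}
```

The key design point is that shape-dependent planning (and its workspace requirement) is now visible to the caller between creation and execution, which is why the EP copies XNNPACK's unit-test approach to allocating the workspace.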

Added wrapper for XNNPACK caches to base XNNPACK EP kernel

  • simplifies usage
  • XNNPACK split out the code and weights caches, but the code cache isn't currently usable via the public API
    • we could use the internal types if we think it's required for performance reasons. non-trivial though as we'd need to propagate ifdef values from the XNNPACK build up to the ORT build.
    • using XNNPACK internals would also mean we could not support using a pre-built XNNPACK package
      • not an issue currently
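The cache wrapper on the base EP kernel can be sketched as an RAII type owning a C-style cache handle. The handle type and create/destroy functions below are stand-ins, not the real XNNPACK weights-cache API; the point is only that individual kernels no longer manage create/delete themselves.

```cpp
#include <cassert>

// Stand-in for a C-style cache handle (e.g. what a weights cache exposes).
using cache_handle = int*;

inline cache_handle create_cache() { return new int(0); }
inline void destroy_cache(cache_handle h) { delete h; }

// RAII wrapper owned by the base XNNPACK EP kernel: the cache is created
// once, shared via get(), and released automatically on destruction.
class WeightsCacheWrapper {
 public:
  WeightsCacheWrapper() : handle_(create_cache()) {}
  ~WeightsCacheWrapper() { destroy_cache(handle_); }
  WeightsCacheWrapper(const WeightsCacheWrapper&) = delete;
  WeightsCacheWrapper& operator=(const WeightsCacheWrapper&) = delete;
  cache_handle get() const { return handle_; }

 private:
  cache_handle handle_;
};
```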

Fixed opset registration for internal NHWC domain

  • was not being tied to the ONNX version, so nodes inserted by layout transformation had the incorrect opset
  • a number of other places needed updating once this issue was fixed

Remove support for NCHW Resize from XNNPACK EP so it's NHWC only

  • we only supported NCHW for fp32
    • doing so adds complexity in multiple places (XNNPACK EP kernel implementation, layout transformation and transpose optimization)
    • unclear if that complexity provides any benefit. can add back if required by production scenario

Motivation and Context

We're looking at enabling fp16 support for CoreML and NNAPI. If we do that, we need a good fallback story when the CPU EP is used. The XNNPACK fp16 kernels will hopefully provide that.

NOTE: This PR doesn't add fp16 support to the XNNPACK EP kernels. That can be done as required in separate EPs and should be relatively simple to do.

Commits (titles truncated in the timeline view):

  • Pending some updates to the cmake config there to work with FetchContent so patch file is WIP.
  • …internal cache.h
  • Fix Resize registrations
  • Fix opset of internal NHWC domain not matching the ONNX opset for the model
  • …hanges that required extra patching.
  • Update dependency artifacts in az.
  • …s. As that EP explicitly registers ops in the old opsets I'm assuming they need these parallel schemas.
@skottmckay skottmckay marked this pull request as ready for review October 25, 2023 05:17
@skottmckay skottmckay requested a review from a team as a code owner October 25, 2023 05:17
@snnn
Member

snnn commented Oct 26, 2023

/azp run Windows CPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@skottmckay
Contributor Author

I'll address the deps.txt conflict once everyone is happy with the changes as it requires updating the dependencies package in the CI to match the latest main.

@wejoncy
Contributor

wejoncy commented Oct 27, 2023

LGTM!

- changes to check whether an EP had a kernel (to speed up unit tests, presumably) didn't take into account that some EPs only have NHWC versions of kernels
- update XNNPACK kernels to cover earlier opsets
  - had to add schemas for earlier versions for jsep, and the operator unit tests also only have good coverage for earlier schemas
    - without this we would lose a lot of test coverage
- move some files to the correct directories for the operator
- fix usage of workspace in a few places
- allow zero-size allocation to not throw; sometimes the workspace has a size of zero (e.g. Resize) when it's not needed
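The zero-size allocation fix in the last bullet can be sketched as follows. `AllocWorkspace` is an illustrative helper, not ORT's actual allocator API: a requested size of zero (as with Resize, which needs no workspace) is treated as a valid request that yields a null buffer rather than an error.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Allocate an operator workspace. A size of zero is legitimate (the op
// needs no scratch space) and returns nullptr instead of throwing; only
// a genuine out-of-memory failure throws.
void* AllocWorkspace(size_t size) {
  if (size == 0) return nullptr;  // valid: op requested no workspace
  void* p = std::malloc(size);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}
```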
@snnn snnn merged commit 4f2096b into main Nov 3, 2023
89 of 91 checks passed
@snnn snnn deleted the skottmckay/UpdateXNNPACK branch November 3, 2023 16:04
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request May 9, 2024