[aarch64] set the build flag to "-mcpu=generic" #1731

snadampal · 2023-09-30T12:39:30Z

Description

this change is required to support binary compatibility across all armv8 platforms.

Fixes # (github issue)
This fixes issue on PyTorch repo:
pytorch/pytorch#109312

Checklist

General

Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
[ x] Have you formatted the code using clang-format?

Performance improvements

Have you submitted performance data that demonstrates performance improvements?

New features

Have you published an RFC for the new feature?
Was the RFC approved?
Have you added relevant tests?

Bug fixes

[ x] Have you included information on how to reproduce the issue (either in a github issue or in this PR)?
Have you added relevant regression tests?

RFC PR

Does RFC document follow the template?
Have you added a link to the rendered document?

cmake/platform.cmake

nSircombe · 2023-10-01T20:35:42Z

The PR setting this flag originally is quite old now - the original intent was to make sure the build picked the right march and mtune for the machine it was built on "out of the box". That was before any work to leverage oneDNN in PyTorch (or TF (where the CMake's not relevant of course)) on AArch64.

I suppose it make sense to be consistent with the other builds which (I believe) set a minimum (e.g. setting -xSSE4.1 on x86), in which case defaulting to optimise for the host arch should go. @dzarukin - its that right? what's oneDNN's stance on this, should the default build aim for portability or performance*?

However, I'm not sure setting -march= is the way to go. -mcpu=generic should be equivalent to setting -mtune=generic and -march=armv8-a. However I think this is the default for both GCC and Clang? So it may be sufficienct to just remove -mcpu=native here entirely (i.e. take these Cmake IF blocks out).

Looking at the PyTorch issue, is it necessary to patch the Cmake during the build? I thought passing Cmake -DDNNL_ARCH_OPT_FLAGS="-mcpu=generic" at build time would trump the DEF_ARCH_OPT_FLAGS set in the Cmake source and get you a 'generic' build without the faff of patching oneDNN, but I may be wrong.

performance - optimised (ACL based) kernels will (assuming ACL has been built with multi_isa set) detect the available HW features and use the appropriate kernel irrespective of what march oneDNN was built with, so the performance there will not be impacted. Possibly building generic will have an impact on the performance of ref. C++ kernels where they are used.

snadampal · 2023-10-09T20:22:19Z

Thanks everyone for the review. I have updated the PR to use "-mcpu=generic" and added a comment on how to override this. Please review and merge if it looks good. Thank you!

nSircombe · 2023-10-09T20:40:41Z

Thanks for updating @snadampal.
I've had a go with swapping native for generic and not found any issues with/without ACL on various targets - the change looks OK to me.

Nit-picking, but the comment does slightly muddle tune and arch maybe something more like...

Defaults to a generic cpu target, equivalent to setting -mtune=generic -march=armv8-a.
This ensures no implementation specific tuning, or architectural features beyond armv8-a are used,
for portability across AArch64 systems.
The DNNL_ARCH_OPT_FLAGS build option can be used to override these defaults to optimise for a specific cpu, or revision of the Armv8 architecture.

This is equivalent to setting -mtune=generic -march=armv8-a. This ensures no implementation specific tuning, or architectural features beyond armv8-a are used,for portability across AArch64 systems.The DNNL_ARCH_OPT_FLAGS build option can be used to override these defaults to optimise for a specific cpu, or revision of the Armv8 architecture.

snadampal · 2023-10-10T13:25:46Z

thanks @nSircombe , I have updated the comments.

nSircombe · 2023-10-12T16:25:19Z

Looks good to me!

dzarukin · 2023-11-13T06:37:52Z

Thank you for the contribution!

dzarukin reviewed Oct 1, 2023

View reviewed changes

cmake/platform.cmake Outdated Show resolved Hide resolved

vpirogov added this to the v3.4 milestone Oct 4, 2023

snadampal force-pushed the aarch64_build_flag branch from 74f8878 to 220c3c7 Compare October 9, 2023 20:17

snadampal force-pushed the aarch64_build_flag branch from 220c3c7 to d129496 Compare October 10, 2023 13:24

snadampal changed the title ~~[aarch64] set the build flag to "-march=armv8-a"~~ [aarch64] set the build flag to "-mcpu=generic" Oct 10, 2023

dzarukin approved these changes Nov 6, 2023

View reviewed changes

dzarukin merged commit d9ae2e9 into oneapi-src:master Nov 13, 2023
10 checks passed

snadampal mentioned this pull request Nov 21, 2023

[cmake] set 'mcpu=generic' as the default build flag for mkldnn on aarch64 pytorch/pytorch#113820

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[aarch64] set the build flag to "-mcpu=generic" #1731

[aarch64] set the build flag to "-mcpu=generic" #1731

snadampal commented Sep 30, 2023

nSircombe commented Oct 1, 2023

snadampal commented Oct 9, 2023

nSircombe commented Oct 9, 2023

snadampal commented Oct 10, 2023

nSircombe commented Oct 12, 2023

dzarukin commented Nov 13, 2023

[aarch64] set the build flag to "-mcpu=generic" #1731

[aarch64] set the build flag to "-mcpu=generic" #1731

Conversation

snadampal commented Sep 30, 2023

Description

Checklist

General

Performance improvements

New features

Bug fixes

RFC PR

nSircombe commented Oct 1, 2023

snadampal commented Oct 9, 2023

nSircombe commented Oct 9, 2023

snadampal commented Oct 10, 2023

nSircombe commented Oct 12, 2023

dzarukin commented Nov 13, 2023