Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aarch64] set the build flag to "-mcpu=generic" #1731

Merged
merged 1 commit into from
Nov 13, 2023

Conversation

snadampal
Copy link
Contributor

Description

this change is required to support binary compatibility across all armv8 platforms.

Fixes # (github issue)
This fixes issue on PyTorch repo:
pytorch/pytorch#109312

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • [ x] Have you formatted the code using clang-format?

Performance improvements

  • Have you submitted performance data that demonstrates performance improvements?

New features

  • Have you published an RFC for the new feature?
  • Was the RFC approved?
  • Have you added relevant tests?

Bug fixes

  • [ x] Have you included information on how to reproduce the issue (either in a github issue or in this PR)?
  • Have you added relevant regression tests?

RFC PR

  • Does RFC document follow the template?
  • Have you added a link to the rendered document?

cmake/platform.cmake Outdated Show resolved Hide resolved
@nSircombe
Copy link
Contributor

The PR setting this flag originally is quite old now - the original intent was to make sure the build picked the right march and mtune for the machine it was built on "out of the box". That was before any work to leverage oneDNN in PyTorch (or TF (where the CMake's not relevant of course)) on AArch64.

I suppose it make sense to be consistent with the other builds which (I believe) set a minimum (e.g. setting -xSSE4.1 on x86), in which case defaulting to optimise for the host arch should go. @dzarukin - its that right? what's oneDNN's stance on this, should the default build aim for portability or performance*?

However, I'm not sure setting -march= is the way to go. -mcpu=generic should be equivalent to setting -mtune=generic and -march=armv8-a. However I think this is the default for both GCC and Clang? So it may be sufficienct to just remove -mcpu=native here entirely (i.e. take these Cmake IF blocks out).

Looking at the PyTorch issue, is it necessary to patch the Cmake during the build? I thought passing Cmake -DDNNL_ARCH_OPT_FLAGS="-mcpu=generic" at build time would trump the DEF_ARCH_OPT_FLAGS set in the Cmake source and get you a 'generic' build without the faff of patching oneDNN, but I may be wrong.

  • performance - optimised (ACL based) kernels will (assuming ACL has been built with multi_isa set) detect the available HW features and use the appropriate kernel irrespective of what march oneDNN was built with, so the performance there will not be impacted. Possibly building generic will have an impact on the performance of ref. C++ kernels where they are used.

@vpirogov vpirogov added this to the v3.4 milestone Oct 4, 2023
@snadampal
Copy link
Contributor Author

Thanks everyone for the review. I have updated the PR to use "-mcpu=generic" and added a comment on how to override this. Please review and merge if it looks good. Thank you!

@nSircombe
Copy link
Contributor

Thanks for updating @snadampal.
I've had a go with swapping native for generic and not found any issues with/without ACL on various targets - the change looks OK to me.

Nit-picking, but the comment does slightly muddle tune and arch maybe something more like...

Defaults to a generic cpu target, equivalent to setting -mtune=generic -march=armv8-a.
This ensures no implementation specific tuning, or architectural features beyond armv8-a are used,
for portability across AArch64 systems.
The DNNL_ARCH_OPT_FLAGS build option can be used to override these defaults to optimise for a specific cpu, or revision of the Armv8 architecture.

This is equivalent to setting -mtune=generic -march=armv8-a.
This ensures no implementation specific tuning, or architectural
features beyond armv8-a are used,for portability across
AArch64 systems.The DNNL_ARCH_OPT_FLAGS build option can be used
to override these defaults to optimise for a specific cpu,
or revision of the Armv8 architecture.
@snadampal snadampal changed the title [aarch64] set the build flag to "-march=armv8-a" [aarch64] set the build flag to "-mcpu=generic" Oct 10, 2023
@snadampal
Copy link
Contributor Author

thanks @nSircombe , I have updated the comments.

@nSircombe
Copy link
Contributor

Looks good to me!

@dzarukin dzarukin merged commit d9ae2e9 into oneapi-src:master Nov 13, 2023
10 checks passed
@dzarukin
Copy link
Contributor

Thank you for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants