Use proper unaligned access attributes #2881
hm, i think this doesn't quite work on msvc. gcc "guarantees" (mumblemumble) that |
oh, also for icc, i don't know why this was implemented in the first place since icc only supports intel x86 processors anyways.
What is the motivation for this patch? I'm not against merging it, just wondering why you want it. Does this measurably improve performance on some compiler/architecture? Does this fix build errors?
This is actually a no-op, since the kernel uses its own implementation of |
Well, originally it was to fix armv6, but that was already "fixed" by #2633. It should still help performance, but I guess it is not that important, since packed structure access is slow on gcc, but only for armv6: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55218.
also I think #2633 left legacy broken?
Thanks for the PR @Hello71! Could you please separate it into two PRs:
|
after extensive investigation (https://gcc.godbolt.org/z/8xYbczn8e), i think this change is important on armv7 gcc too, because gcc produces ok code for packed version on armv7, but terrible code for inlined packed version on armv7 with -O3 |
fb97ce4 to 0b38761
0b38761 to c289681
Instead of using packed attribute hack, just use aligned attribute. It improves code generation on armv6 and armv7, and slightly improves code generation on aarch64. GCC generates identical code to regular aligned access on ARMv6 for all versions between 4.5 and trunk, except GCC 5 which is buggy and generates the same (bad) code as packed access: https://gcc.godbolt.org/z/hq37rz7sb
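For readers unfamiliar with the two idioms being compared, here is a minimal sketch, assuming GCC or Clang attribute syntax. The type and function names (`packed_u32`, `unaligned_u32`, `read32_*`) are hypothetical and chosen for illustration; they are not the identifiers used in the patch. The `memcpy` variant is included only as the portable baseline.

```c
#include <stdint.h>
#include <string.h>

/* "Packed attribute hack": wrap the value in a packed struct so the
 * compiler assumes no alignment for the member. (Hypothetical names.) */
typedef struct { uint32_t v; } __attribute__((packed)) packed_u32;

static uint32_t read32_packed(const void *p)
{
    return ((const packed_u32 *)p)->v;
}

/* "Aligned attribute": a typedef whose alignment is lowered to 1 byte.
 * GCC and Clang allow aligned() on a typedef to decrease alignment. */
typedef uint32_t unaligned_u32 __attribute__((aligned(1)));

static uint32_t read32_aligned1(const void *p)
{
    return *(const unaligned_u32 *)p;
}

/* Portable baseline: let the compiler lower memcpy to a load. */
static uint32_t read32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

All three functions return the same value for the same bytes; the difference discussed in this PR is only in the machine code the compiler emits for each form on a given target.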
c289681 to d2e402f
Thanks for the PR @Hello71! Sorry we forgot to merge it, but it will be part of the next release :)
Instead of using packed attribute hack, just use aligned attribute. It
is supported back to at least GCC 3, and __declspec(align) is apparently
supported by all MSVC versions. GCC generates identical code to regular
aligned access on ARMv6 for all versions between 4.5 and trunk, except
GCC 5 which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
Also enable unaligned memory access in kernel using proper macros.
One problem with this approach is that some users may have already set MEM_FORCE_MEMORY_ACCESS=0 based on the original (wrong) implementation, and now with the correct implementation they will be using unnecessarily slow access. Possibly the macro should have a new name to reflect the new implementation, or the choice should just be automatic with no macro.