Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use proper unaligned access attributes #2881

Merged
merged 1 commit into from
Dec 15, 2022

Conversation

Hello71
Copy link
Contributor

@Hello71 Hello71 commented Nov 29, 2021

Instead of using packed attribute hack, just use aligned attribute. It
is supported back to at least GCC 3, and __declspec(align) is apparently
supported by all MSVC versions. GCC generates identical code to regular
aligned access on ARMv6 for all versions between 4.5 and trunk, except
GCC 5 which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb

Also enable unaligned memory access in kernel using proper macros.

One problem with this approach is that some users may have already set MEM_FORCE_MEMORY_ACCESS=0 based on the original (wrong) implementation, and now with the correct implementation they will be using unnecessarily slow access. Possibly the macro should have new name to reflect new implementation, or it should just be automatic with no macro.

@Hello71
Copy link
Contributor Author

Hello71 commented Nov 29, 2021

hm, i think this doesn't quite work on msvc. gcc "guarantees" (mumblemumble) that __attribute__((aligned(1))) will work as expected: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69502, but msvc doesn't work: https://gcc.godbolt.org/z/EMYn9s6b3. fortunately, memcpy generates correct and fast code, so i think it can be used for msvc. maybe someone can test msvc before 2015, since it's not available on compiler explorer.

@Hello71
Copy link
Contributor Author

Hello71 commented Nov 29, 2021

oh, also for icc, i don't know why this was implemented in the first place since icc only supports intel x86 processors anyways.

@terrelln terrelln self-assigned this Nov 30, 2021
@terrelln
Copy link
Contributor

What is the motivation for this patch? I'm not against merging it, just wondering why you want it. Does this measurably improve performance on some compiler/architecture? Does this fix build errors?

Also enable unaligned memory access in kernel using proper macros.

This is actually a no-op, since the kernel uses its own implementation of mem.h located here. I'm fine removing it, but please put it in its own commit.

@Hello71
Copy link
Contributor Author

Hello71 commented Nov 30, 2021

Well, originally it was to fix armv6, but that was already "fixed" by #2633. It should still help performance, but I guess it is not that important, since packed structure access is slow on gcc, but only for armv6: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55218.

@Hello71
Copy link
Contributor Author

Hello71 commented Nov 30, 2021

also I think #2633 left legacy broken?

@terrelln
Copy link
Contributor

terrelln commented Dec 3, 2021

Thanks for the PR @Hello71!

Could you please separate it into two PRs:

  1. Apply the fix from Avoid SIGBUS on armv6 #2633 to legacy. I can merge this as soon as you put it up and the tests pass.
  2. The aligned attribute change. I'd like to wait until after our next release (which we're preparing for next week) to merge this. Since I want to give it some time to bake. In case there are any strange performance / correctness regressions.

Hello71 added a commit to Hello71/zstd that referenced this pull request Dec 6, 2021
@Hello71
Copy link
Contributor Author

Hello71 commented Dec 6, 2021

after extensive investigation (https://gcc.godbolt.org/z/8xYbczn8e), i think this change is important on armv7 gcc too, because gcc produces ok code for packed version on armv7, but terrible code for inlined packed version on armv7 with -O3

Instead of using packed attribute hack, just use aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
@terrelln
Copy link
Contributor

Thanks for the PR @Hello71! Sorry we forgot to merge it, but it will be part of the next release :)

@terrelln terrelln merged commit a78c91a into facebook:dev Dec 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants