Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP] Align up the size when calling aligned_alloc #65525

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Sep 6, 2023

Based on https://en.cppreference.com/w/c/memory/aligned_alloc, the size is supposed
to be a multiple of alignment, and it is implementation defined behavior if not.
We have a non-conformant use in kmp_barrier.h when allocating distribute barrier.
The size of the barrier is 576 and the alignment is 4*CACHE_LINE, which is 256
on most systems. Apparently it works perfectly fine for Linux and Intel-based Mac,
but not for Apple Silicon based Mac.

Fix #63194.

Based on https://en.cppreference.com/w/c/memory/aligned_alloc, the `size` is supposed
to be a multiple of `alignment`, and it is implementation defined behavior if not.
We have a non-conformant use in `kmp_barrier.h` when allocating distribute barrier.
The size of the barrier is 576 and the alignment is `4*CACHE_LINE`, which is 64
on most systems. Apparently it works perfectly fine for Linux and Intel-based Mac,
but not for Apple Silicon based Mac.

Fix llvm#63194.
@shiltian shiltian requested review from TerryLWilmarth and a team September 6, 2023 20:10
Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@shiltian shiltian merged commit 99d67fb into llvm:main Sep 6, 2023
1 of 2 checks passed
@shiltian shiltian deleted the shiltian/issue63194 branch September 6, 2023 20:28
avillega pushed a commit to avillega/llvm-project that referenced this pull request Sep 11, 2023
Based on https://en.cppreference.com/w/c/memory/aligned_alloc, the
`size` is supposed
to be a multiple of `alignment`, and it is implementation defined
behavior if not.
We have a non-conformant use in `kmp_barrier.h` when allocating
distribute barrier.
The size of the barrier is 576 and the alignment is `4*CACHE_LINE`,
which is 256
on most systems. Apparently it works perfectly fine for Linux and
Intel-based Mac,
but not for Apple Silicon based Mac.

Fix llvm#63194.
@TerryLWilmarth
Copy link
Contributor

Was this performance tested on Intel hardware? Distributed barrier is designed specifically for use on Intel hardware.

@jdoerfert
Copy link
Member

If this causes issues we need to distinguish apple here or catch the nullptr that is supposed to be returned https://open-std.org/JTC1/SC22/WG14/www/docs/summary.htm#dr_460

@shiltian
Copy link
Contributor Author

shiltian commented Sep 11, 2023

First, I don't think it will affect the performance of Intel hardware since it just "wastes" some memory. Next, as pointed by the two links in this PR, it is UB if the size is not a multiple of the alignment. The fact that it works on Intel hardware doesn't make it a right implementation. Lastly, if this feature is designed specifically for Intel, we need to restrict its use to Intel only hardware and disable the test on other platforms. However, on the other hand, if we only restrict it to Intel, then we can keep the previous implementation since it works on Intel.

@jdoerfert
Copy link
Member

FWIW, I think the last change to the standard clarified it's not ub. I might have read it wrong but that is what my link was about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

check-openmp: some test cases failed on arm64/MacOS
4 participants