Limited to single CPU core on v16.x+ #91098

Closed
kmpanilla opened this issue May 5, 2024 · 7 comments · Fixed by #91391


kmpanilla commented May 5, 2024

We run ImageMagick with OpenMP linked to leverage multiple CPUs. Have done it for years without issue.
We recently upgraded some of our servers to FreeBSD 13.3 or 14.0 which leverage llvm-16.0.6 (freebsd 14.0) and llvm-17.0.6 (freebsd 13.3). Previously, we were on FreeBSD 13.2 which leveraged llvm-14.0.5.

After upgrading to llvm-16 or llvm-17, we're noticing that ImageMagick, when linked with OpenMP, is stuck on a single CPU core. We tracked it down to the libomp.so version provided by the OS, /usr/lib/libomp.so. If we manually install llvm15 or llvm14 and copy the older libomp.so from either over it, the problem goes away.

Question is, what changed between llvm15 and llvm16 that would cause us to be limited to a single CPU core?

Our application is called from PHP using pecl-imagick and ImageMagick7 (compiled with OpenMP). The PHP version doesn't matter (we've reproduced it on 8.1, 8.2, and 8.3). We only have the issue when LLVM is upgraded from 14 to 16 or 17. In fact, hot-copying the v15 or v14 libomp.so back to /usr/lib gets us working as expected again.

Here's the breakdown of working vs broken:
llvm14-14.0.6 - WORKS
llvm15-15.0.7 - WORKS
llvm16-16.0.6 - BROKE
llvm17-17.0.6 (13.3 default) - BROKE
llvm19-19.0.d20240426 - BROKE

Anything I can supply to help narrow this down? I'd love to get something pushed to all the branches of 16 and later, if we can solve the issue.

Thanks.

@llvmbot
Collaborator

llvmbot commented May 5, 2024

@llvm/issue-subscribers-openmp


@jpeyton52
Contributor

@kmpanilla, do you mind running the app with both of these environment variables set:

KMP_SETTINGS=1
KMP_AFFINITY=verbose 

for both a broken OpenMP runtime and a working OpenMP runtime and attaching the OpenMP runtime outputs here?
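One way to collect these logs reproducibly is to wrap the launch in a small script. The following is only a sketch: the helper names `omp_diagnostic_env` and `capture_runtime_output` are made up for illustration, and it assumes the application can be started as a plain command. It sets the two variables requested above and saves whatever the process prints (stdout and stderr merged) to a file.

```python
import os
import subprocess


def omp_diagnostic_env(base_env=None):
    """Return a copy of the environment with the two diagnostic variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KMP_SETTINGS"] = "1"        # ask the runtime to print its settings
    env["KMP_AFFINITY"] = "verbose"  # ask the runtime to print affinity details
    return env


def capture_runtime_output(cmd, log_path, base_env=None):
    """Run cmd with the diagnostic variables set and save its output to log_path.

    stdout and stderr are merged so the log holds everything the process printed.
    Returns the process exit code.
    """
    result = subprocess.run(
        cmd,
        env=omp_diagnostic_env(base_env),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(log_path, "w") as fh:
        fh.write(result.stdout)
    return result.returncode
```

For example, `capture_runtime_output(["php", "resize.php"], "17-output.txt")` would produce one log per runtime, where `resize.php` is a hypothetical script name standing in for the real application.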

@kmpanilla
Author

kmpanilla commented May 6, 2024

> Do you mind running the app with both these environment variables set:
>
> KMP_SETTINGS=1
> KMP_AFFINITY=verbose
>
> for both a broken OpenMP runtime and a working OpenMP runtime and attaching the OpenMP runtime outputs here?

@jpeyton52, see the attached outputs: 15-output.txt is from 15.x (working) and 17-output.txt is from 17.x (broken). We ran these on a single-CPU (4-core) server to minimize the output; the behavior is the same on a multi-CPU server.

Let me know what else I can supply.

Thank you!

15-output.txt
17-output.txt

@jpeyton52
Contributor

Thanks for the logs. Can you try setting KMP_AFFINITY=none with one of the broken OpenMP runtimes and seeing if the desired behavior is restored?

@kmpanilla
Author

Setting KMP_AFFINITY=none as suggested made all the CPU cores start working as expected on the previously broken versions! Why would this be required for proper functionality on FreeBSD with v16+, and what would a permanent fix look like?

@jpeyton52
Contributor

It's a bug in the atfork() handler on Unix systems, combined with the logic that reinitializes the child process. The current library incorrectly sets the child process's affinity to compact, which roughly translates to "pin consecutive threads to consecutive cores", even when the user hasn't set KMP_AFFINITY to anything. So every child process was pinned to the first core instead of being allowed to use the entire system. I should have a PR fix shortly.
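To make the effect concrete, here is a toy model of the two policies. It is not the runtime's actual code: the function names are invented, and "compact" is simplified to "thread i gets core i", ignoring real topology details. It only considers the single thread each forked child starts with.

```python
def thread_masks(policy, num_threads, allowed_cores):
    """Return the set of cores each thread may run on under a simplified policy.

    "none":    no pinning, every thread may use every allowed core.
    "compact": simplified here to pinning thread i to the i-th allowed core.
    """
    cores = sorted(allowed_cores)
    if policy == "none":
        return [set(cores) for _ in range(num_threads)]
    if policy == "compact":
        return [{cores[i % len(cores)]} for i in range(num_threads)]
    raise ValueError(f"unknown policy: {policy!r}")


def forked_children_masks(policy, num_children, allowed_cores):
    """Model the initial thread of each forked child.

    A forked child starts with one thread, which is thread 0 in that process,
    so each child receives the mask that thread 0 gets under the policy.
    """
    return [thread_masks(policy, 1, allowed_cores)[0] for _ in range(num_children)]


if __name__ == "__main__":
    cores = {0, 1, 2, 3}
    print("compact:", forked_children_masks("compact", 4, cores))
    print("none:   ", forked_children_masks("none", 4, cores))
```

In this model, four children on a 4-core machine all end up with the mask `{0}` under compact, which matches the single-core symptom reported above, while under none each child keeps all four cores.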

jpeyton52 added a commit to jpeyton52/llvm-project that referenced this issue May 7, 2024
When a child process is forked with OpenMP already initialized, the
child process resets its affinity mask and sets proc-bind-var to false
so that the entire original affinity mask is used. This patch corrects
an issue with the affinity initialization code setting affinity to
compact instead of none for this special case of forked children.

Fixes: llvm#91098
@kmpanilla
Author

Your PR fix works great for me when added to v18.x. Thanks!

jpeyton52 added a commit that referenced this issue May 8, 2024
When a child process is forked with OpenMP already initialized, the
child process resets its affinity mask and sets proc-bind-var to false
so that the entire original affinity mask is used. This patch corrects
an issue with the affinity initialization code setting affinity to
compact instead of none for this special case of forked children.

The test trying to catch this only tested an explicit setting of
KMP_AFFINITY=none. Add a test run with no KMP_AFFINITY setting.

Fixes: #91098
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue May 8, 2024
(cherry picked from commit 73bb8d9)
tstellar pushed a commit to llvmbot/llvm-project that referenced this issue May 10, 2024
(cherry picked from commit 73bb8d9)