Compilation failure for OpenMP on older family #133075

Closed
KaruroChori opened this issue Mar 26, 2025 · 6 comments
Labels
openmp question A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!

Comments

@KaruroChori

I tried compiling with clang++-21 test.cpp -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64 -march=sm_60

#include <iostream>
#include <cstdio>
using namespace std;

int main(){
        #pragma omp target parallel
        {
                printf("Hello\n");
        }
        return 0;
}

on an older card which is sm_60.
The compiler failed with this message:

fatal error: error in backend: PTX does not support "atomic" for orderings different than"NotAtomic" or "Monotonic" for sm_60 or older, but order is: "seq_cst".
clang++-21: error: clang frontend command failed with exit code 70 (use -v to see invocation)
Debian clang version 21.0.0 (++20250324101504+0adc672ed4a2-1~exp1~20250324221624.2732)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-21/bin
clang++-21: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang++-21: note: diagnostic msg: /tmp/test-sm_60-c20c2d.cpp
clang++-21: note: diagnostic msg: /tmp/test-c40dad.cpp
clang++-21: note: diagnostic msg: /tmp/test-sm_60-c20c2d.sh

I personally don't think it is a "real" bug, just that some features are not available on older cards, but since the compiler asked for a report, I am posting it here.

@llvmbot
Member

llvmbot commented Mar 26, 2025

@llvm/issue-subscribers-openmp

Author: None (KaruroChori)


@jhuber6
Contributor

jhuber6 commented Mar 26, 2025

Yeah, that's just older cards not being able to use the features we need in the runtime. However, this most likely comes from atomics generated by the atomic wrappers, so it's possible that they'll get removed if you enable optimizations.
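
For readers unfamiliar with the ordering names in the error message: LLVM's "Monotonic" ordering corresponds to C++ `std::memory_order_relaxed`, and "seq_cst" to `std::memory_order_seq_cst`. The following is a minimal host-side sketch of the two orderings, not the OpenMP device runtime's actual wrappers; the helper names `add_relaxed` and `add_seq_cst` are hypothetical and exist only for illustration.

```cpp
#include <atomic>

// Hypothetical helper: atomic add with relaxed ordering
// (LLVM calls this ordering "Monotonic"). Returns the previous value.
int add_relaxed(std::atomic<int> &counter, int value) {
    return counter.fetch_add(value, std::memory_order_relaxed);
}

// Hypothetical helper: atomic add with sequentially consistent ordering
// (LLVM "seq_cst"), the ordering named in the backend error above.
// Returns the previous value.
int add_seq_cst(std::atomic<int> &counter, int value) {
    return counter.fetch_add(value, std::memory_order_seq_cst);
}
```

On the host both functions compile and behave the same for a single thread. The difference only matters for how the operation is ordered relative to other memory accesses, which is the part the backend reports it cannot express for sm_60.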

@jhuber6
Contributor

jhuber6 commented Mar 26, 2025

Your file compiled just fine for me with clang++ test.cpp -fopenmp --offload-arch=sm_52. I can't actually run it since I don't have access to a card that old, but I'm not sure why it doesn't work for your case.

@KaruroChori
Author

KaruroChori commented Mar 27, 2025

Not for me :/. This time I have the latest nvidia tools, drivers (570) and cuda sdk (12.8.1) available in their official debian repository. And llvm-21 straight from here.

I just tested on the workstation I am decommissioning, and there it compiles down to sm_52 as you stated. Maybe a regression on the nvidia side?

@jhuber6
Copy link
Contributor

jhuber6 commented Mar 27, 2025

Does it work if you pass -O3 -foffload-lto? Sometimes that optimizes out the offending instructions. Otherwise, I don't think there's much we can do from the LLVM side.

@KaruroChori
Author

The very basic example at the top does; a full application does not, as there are scenarios in which the offending atomics cannot be optimized out.
No worries! After tomorrow, that card will go back in the closet as it was just a temporary solution while setting up the new workstation.

@KaruroChori closed this as not planned Mar 27, 2025
@EugeneZelenko added the question label Mar 27, 2025