clang: CUDA error: compiling relocatable device code with dynamically allocated shared memory passed through a lambda expression will cause an error. #65806
Comments
Interesting. So, we end up creating a static variable which is supposed to keep references to external data, and that does not work with pointers to shared memory, because each SM gets its own instance and we don't know the value at compile time.
We use that to preserve things that may be referred to from the host, and shared memory objects can't be accessed that way. We should not put them on that list. @yxsamliu I assume this is also true for AMD GPUs. Is it?
Right. It was a clang bug introduced by my change. I will fix it.
Fixes: llvm#65806 Currently clang puts extern shared vars ODR-used by host device functions in the global var __clang_gpu_used_external. This behavior was due to https://reviews.llvm.org/D123441. However, clang should not do that for extern shared vars since their addresses are per warp and therefore cannot be accessed by host code.
Overview
Compiling relocatable device code (`-fcuda-rdc`) with dynamically allocated shared memory (`extern __shared__`) passed through a lambda expression will cause a `fatal: Variable used as initial value not in .global or .const state space` error in the clang-18 compiler. The workaround is to pass an object that points to the dynamically allocated shared memory object.
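To make the pattern and the workaround concrete, here is a minimal sketch. It is not the code attached to this report: the names `apply`, `direct_use`, `via_pointer`, `run_workaround`, and the `SHOW_FAILING_PATTERN` macro are all hypothetical, and whether the guarded kernel triggers the error on a particular clang build is an assumption rather than something verified here.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper (not from the report): invokes a callable on the device.
template <typename F>
__device__ void apply(F f) { f(); }

#ifdef SHOW_FAILING_PATTERN
// Pattern described above: the lambda body names the extern __shared__
// array directly. Assumption: built with -fcuda-rdc on an affected clang,
// this is the kind of code that hits the reported error.
__global__ void direct_use(int *out) {
  extern __shared__ int smem[];
  apply([&] { smem[threadIdx.x] = static_cast<int>(threadIdx.x); });
  __syncthreads();
  out[threadIdx.x] = smem[threadIdx.x];
}
#endif

// Workaround described above: take a pointer to the shared array first and
// hand that pointer to the lambda, so the lambda never names smem itself.
__global__ void via_pointer(int *out) {
  extern __shared__ int smem[];
  int *p = smem;
  apply([p] { p[threadIdx.x] = static_cast<int>(threadIdx.x); });
  __syncthreads();
  out[threadIdx.x] = p[threadIdx.x];
}

// Launches the workaround kernel with n ints of dynamic shared memory and
// checks that every thread wrote its own index.
bool run_workaround() {
  constexpr int n = 32;
  int *d_out = nullptr;
  if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return false;
  via_pointer<<<1, n, n * sizeof(int)>>>(d_out);
  int h_out[n] = {};
  bool ok = cudaMemcpy(h_out, d_out, sizeof(h_out),
                       cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(d_out);
  for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == i);
  return ok;
}
```

Calling `run_workaround()` from a host `main` exercises only the pointer-based kernel; defining `SHOW_FAILING_PATTERN` adds the direct-use kernel so the two forms can be compared under `-fcuda-rdc`.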
Error
Isolated example that reproduces the error
Supporting files
The code above, requested files for the bug report, and supporting files for reproducing the issue in a Docker container are attached.
clang_rdc_issue.zip