clang does not add !fpmath data to OpenCL sqrt calls #64264
Comments
Attaching the lit test sqrt-fpmath.cl.txt
OpenCL sqrt functions are declared in clang/lib/Headers/opencl-c.h. How about adding a macro (e.g. __CLANG_OPENCL_USE_BULTIN_SQRT) that conditionally defines them as inline functions calling the clang sqrt builtins? The AMDGPU target could then choose to enable the macro in the clang driver. If all targets are able to handle the sqrt intrinsic, the macro is unnecessary.
That could work: an available_externally inline definition containing the intrinsic call. The implementation would still need to provide a definition in the linked library, though. I guess we could use the macro in case there are alternative implementation choices.
On second thought, the intrinsic should always work in theory. We don't need to speculatively work around buggy implementations.
I tried this and it doesn't work with -fdeclare-opencl-builtins: https://reviews.llvm.org/D156743
An alternative could be to take the
It should be only 1 function times 6 vector overloads, so I'm not too worried about bloating it. There are some pluses to having a separate inline definition compared to fixing up call sites: it maintains inlining flexibility and avoids needing to introduce a non-standard name for the two different variants. All the fast-math flag and attribute handling also comes for free that way. However, I'm seeing weird behavior when I move just the f32 variant in there: the double and half variants all try to resolve to calling the f32 sqrt.
I think this has more in common with printf, which is declared in the base header
The
We want the !fpmath metadata to be attached to the sqrt intrinsic to make it to the backend lowering. Emit an available_externally definition which uses the builtin, which emits the !fpmath. Fixes llvm#64264 https://reviews.llvm.org/D156743
With the changes implemented in https://reviews.llvm.org/D156743 and discussed in llvm/llvm-project#64264, upstream LLVM added the definition of sqrt(float) and its variants in clang/lib/Headers/opencl-c-base.h.
XFAILS new test: clang/test/CodeGenOpenCL/sqrt-fpmath.cl. We want the !fpmath metadata to be attached to the sqrt intrinsic to make it to the backend lowering. Emit an available_externally definition which uses the builtin, which emits the !fpmath. Fixes llvm#64264 https://reviews.llvm.org/D156743 Change-Id: I590c7595d8df38ccbf0a3b458c255b44aa68255c
By default (i.e. without -cl-fp32-correctly-rounded-divide-sqrt), calls to sqrt on float or vectors of float should be annotated with !fpmath metadata. Currently they are not, which is interfering with fixing the library so that it no longer uses a global option corresponding to the flag.
I tried working towards this in bac2a07, but that is of limited use for the end use case. The actual user call to the mangled function needs to be annotated (so that a backend pass can swap the call for the intrinsic with the metadata preserved) so we can lower it appropriately.
I'm not sure how to really do this. I tried hacking up LANGBUILTINs for all the mangled forms, but they don't seem to be recognized as builtins at that point for interception in CGBuiltin. Another strategy I considered was adding a macro for the correctly rounded option and setting an attribute on the declarations.
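For readers unfamiliar with what the !fpmath annotation permits: in LLVM it states the maximum acceptable error of a floating-point result in ULPs, which is what lets a backend pick a faster, less precise sqrt lowering. The plain C sketch below is only an illustration of counting that kind of error. The helper names (ordered_bits, ulp_distance, within_ulp) are hypothetical and not part of clang or any OpenCL library, and the code approximates ULP error as the number of representable float steps between a computed value and a reference value, assuming both are finite.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a float's bit pattern onto an integer line
   that increases monotonically with the float's value. */
static int64_t ordered_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    /* Negative floats sort in reverse bit order, so mirror them below zero. */
    return (u & 0x80000000u) ? -(int64_t)(u & 0x7fffffffu) : (int64_t)u;
}

/* Hypothetical helper: number of representable float steps between two
   finite values (an approximation of the error in ULPs). */
static int64_t ulp_distance(float a, float b) {
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

/* Hypothetical check: is `got` within `max_ulp` steps of `reference`? */
static int within_ulp(float got, float reference, int64_t max_ulp) {
    return ulp_distance(got, reference) <= max_ulp;
}
```

A test harness could compare a fast sqrt result against a correctly rounded reference with within_ulp and a bound taken from the metadata. Note that the real metadata operand is a floating-point value, so an integer step count like this is only a coarse stand-in.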