With the following code, clang (17.0.1) is able to optimize (with -fno-math-errno -O3) each sin/cos pair into a single call to sincos:
    #include <cmath>

    struct Point3D {
        double x;
        double y;
        double z;
    };

    Point3D f(double lat, double lon) noexcept {
        double lat_sin = sin(lat), lat_cos = cos(lat);
        double lon_sin = sin(lon), lon_cos = cos(lon);
        return {lat_cos * lon_cos, lat_cos * lon_sin, lat_sin};
    }
but for some reason it does not use the output pointers efficiently:
    lea rdi, [rsp + 40]
    lea rsi, [rsp + 32]
    call sincos@PLT
    movsd xmm0, qword ptr [rsp + 40] # xmm0 = mem[0],zero
    movsd qword ptr [rsp + 8], xmm0 # 8-byte Spill
    lea rdi, [rsp + 24]
    lea rsi, [rsp + 16]
    movsd xmm0, qword ptr [rsp] # 8-byte Reload
    call sincos@PLT
clang/LLVM passes pointers to sincos via [rsp + 40] and [rsp + 32], then moves the stored values to [rsp + 8] and [rsp].
If it could use [rsp + 8] and [rsp] directly, that would remove at least two extra stores and loads.
godbolt.org link
@llvm/issue-subscribers-backend-x86
Author: None (davemilter)
@dtcxzyw, I'm not sure this is specific to backend:x86.
On armv8-a, using sincos directly also helps: clang generates only 22 instructions instead of 25 and uses less stack space:
armv8-a godbolt link
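For reference, here is a minimal sketch of what "using sincos directly" could look like in source. It assumes glibc, where `sincos` is a GNU extension that g++ and clang++ expose by default; the name `f_direct` is hypothetical and is not taken from the linked godbolt example.

```cpp
#include <cmath>  // on glibc, C++ compilers define _GNU_SOURCE, which exposes ::sincos

struct Point3D {
    double x;
    double y;
    double z;
};

// Hypothetical hand-written variant of f(): sincos writes its results
// straight into the local variables through the output pointers.
Point3D f_direct(double lat, double lon) noexcept {
    double lat_sin, lat_cos, lon_sin, lon_cos;
    ::sincos(lat, &lat_sin, &lat_cos);  // assumption: GNU extension, not standard C++
    ::sincos(lon, &lon_sin, &lon_cos);
    return {lat_cos * lon_cos, lat_cos * lon_sin, lat_sin};
}
```

This is only meant for comparing codegen against the sin/cos version above. Whether it yields the same instruction counts quoted in this comment depends on the compiler version and flags.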