strange usage of pointers in sin/cos optimization #76152

Open
davemilter opened this issue Dec 21, 2023 · 2 comments

Comments

@davemilter

With the following code, clang (17.0.1) is able to optimize (with -fno-math-errno -O3)
each sin/cos pair into a single call to sincos:

#include <cmath>

struct Point3D {
    double x;
    double y;
    double z;
};

Point3D f(double lat, double lon) noexcept
{
   double lat_sin = sin(lat), lat_cos = cos(lat);
   double lon_sin = sin(lon), lon_cos = cos(lon);
   
   return {lat_cos * lon_cos , lat_cos * lon_sin, lat_sin};
}

but for some reason it does not use the output pointers efficiently:

        lea     rdi, [rsp + 40]
        lea     rsi, [rsp + 32]
        call    sincos@PLT
        movsd   xmm0, qword ptr [rsp + 40]      # xmm0 = mem[0],zero
        movsd   qword ptr [rsp + 8], xmm0       # 8-byte Spill
        lea     rdi, [rsp + 24]
        lea     rsi, [rsp + 16]
        movsd   xmm0, qword ptr [rsp]           # 8-byte Reload
        call    sincos@PLT

clang/llvm passes pointers to [rsp + 40] and [rsp + 32] to sincos,
then moves the stored value to [rsp + 8] and [rsp].

If it could use [rsp + 8] and [rsp] directly,
this would remove at least two extra stores and loads.

godbolt.org link

@llvmbot
Collaborator

llvmbot commented Dec 21, 2023

@llvm/issue-subscribers-backend-x86

Author: None (davemilter)

@davemilter
Author

@dtcxzyw, I am not sure about the backend:x86 label:

for armv8-a, calling sincos directly also helps.
clang generates only 22 instructions instead of 25,
and uses less stack space:

armv8-a godbolt link
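
For reference, a minimal sketch of what "direct usage of sincos" could look like for the function above. It assumes glibc, where sincos is a GNU extension declared in <math.h>; the name f_sincos is only illustrative and does not come from the linked examples.

```cpp
// Assumption: glibc on Linux. sincos() is a GNU extension, not standard C++;
// g++ and clang++ define _GNU_SOURCE by default there, which exposes it.
#include <math.h>

struct Point3D {
    double x;
    double y;
    double z;
};

// Hypothetical variant of f() that calls sincos() explicitly, so the
// results are written straight into the named locals.
Point3D f_sincos(double lat, double lon) noexcept
{
    double lat_sin, lat_cos;
    double lon_sin, lon_cos;

    sincos(lat, &lat_sin, &lat_cos);  // one call yields both sin(lat) and cos(lat)
    sincos(lon, &lon_sin, &lon_cos);  // likewise for lon

    return {lat_cos * lon_cos, lat_cos * lon_sin, lat_sin};
}
```

This should return the same values as f(); the only difference is that the sincos calls are written out in the source instead of being formed by the optimizer.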
