[patch] Generate sqrtsd opcode instead of external call to sqrt on amd64 #5795
Original bug ID: 5795
The sqrt function can be optimised by emitting the processor's sqrtsd instruction directly, avoiding an external call to the C library's sqrt.
The patch has not been tested on Windows.
Comment author: @alainfrisch
Has anyone done any performance comparison? I'm curious about the performance gains to be expected, to see if it's worth the trouble extending the 32-bit (x87) backend as well (this backend is still very much useful for Windows...).
Comment author: @xavierleroy
But only if -ffast-math is selected. Not using "fsin", "fcos", etc by default is justified because those instructions are not quite 100% IEEE754 compliant. But it could make sense to always generate "fsqrt" since, AFAIK, this instruction implements proper IEEE754 behavior. This would have the advantage of working around bugs in Win32 CRT's sqrt() function, cf. #6020.
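As a rough way to spot-check the IEEE 754 behaviour mentioned above, the following C sketch tests a few special cases of the hardware square root. It is only a sanity check under stated assumptions, not a proof of compliance: `hw_sqrt` and `sqrt_special_cases_ok` are hypothetical names, and the sketch uses the SSE2 `_mm_sqrt_sd` intrinsic (sqrtsd) where available rather than the x87 fsqrt instruction.

```c
#include <math.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: hardware square root where SSE2 is available,
   C library sqrt otherwise. */
static double hw_sqrt(double x)
{
#if defined(__SSE2__)
    __m128d v = _mm_set_sd(x);
    return _mm_cvtsd_f64(_mm_sqrt_sd(v, v));
#else
    return sqrt(x);
#endif
}

/* Returns 1 if hw_sqrt matches the IEEE 754 results for a few
   special inputs, 0 otherwise. */
int sqrt_special_cases_ok(void)
{
    if (hw_sqrt(4.0) != 2.0) return 0;            /* exact square */
    if (hw_sqrt(0.0) != 0.0) return 0;            /* sqrt(+0) is +0 */
    if (signbit(hw_sqrt(0.0))) return 0;
    if (!signbit(hw_sqrt(-0.0))) return 0;        /* sqrt(-0) is -0 */
    if (hw_sqrt(INFINITY) != INFINITY) return 0;  /* sqrt(+inf) is +inf */
    if (!isnan(hw_sqrt(-1.0))) return 0;          /* negative input gives NaN */
    return 1;
}
```

The same checks could be run against a C runtime's sqrt() to look for discrepancies of the kind this comment attributes to the Win32 CRT.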