Perf: Implement branchless compare #13187
Some IL tests are failing due to the change. There are also some build errors on my machine (they seem to come from new F# features like [x..] indexing). How can I build this locally? |
Build scripts (from the devguide) should work just fine. You can use it with noVisualStudio switch. Devguide also describes how to update IL baselines (if test uses baseline files). |
I'm curious if we know that this is the fastest implementation. Is there any information on this on the web e.g. for C++ or assembly code? |
I'd like to see a systematic correctness test matrix for comparisons on all the basic types affected by this, including
Also don't forget the |
There could be some improvements in the jit. |
I think the fastest version would be in x64. There is a 4-instruction version found on SO (sub %1, %0), but it involves a conditional jump, which will be slower when branch prediction fails. Moreover, I see no way to make the JIT compile something like this. |
I was also looking at cmovcc (conditional move) instructions, but the source must be a register or memory; it cannot be a constant. So it requires extra instructions to load the constants and won't be shorter. |
After a check, the requirement on IComparable.CompareTo is:
for shorter types (byte, short, char) it is implemented as:
This is valid since there is no risk of overflow, but the result is not necessarily 0, 1 or -1. For longer types, it uses conditional jumps and always returns 0, 1 or -1. Do we have that constraint on the compare function? |
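The overflow argument can be sketched in F#. The snippet the comment refers to is missing from this page, so this assumes it is a plain subtraction. `compareBySubtraction` and `intBySubtraction` are hypothetical helpers written for illustration, not the BCL source.

```fsharp
// Hypothetical helper, not the BCL source: compare two bytes by subtraction.
// Widening to int first means the difference always fits, so it cannot overflow.
let compareBySubtraction (x: byte) (y: byte) : int =
    int x - int y

// The same trick on full-width ints (also hypothetical). F# arithmetic is
// unchecked by default, so the difference can wrap around and flip its sign.
let intBySubtraction (x: int) (y: int) : int =
    x - y

// The sign is correct for bytes, but the magnitude is not limited to -1/0/1.
printfn "bytes 200 vs 3: %d" (compareBySubtraction 200uy 3uy)   // 197

// Int32.MinValue is smaller than 1, yet the wrapped difference is positive.
printfn "ints MinValue vs 1: %d" (intBySubtraction System.Int32.MinValue 1)
```

This is why the subtraction form is only safe for types narrower than `int`, and why callers should look only at the sign of the result.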
Since compare falls back to IComparable.CompareTo for other types, there is no guarantee that the output is always 0, 1 or -1. |
This would not be a breaking change at the specification level, but it would be a breaking change at the implementation level, for code like: Same thing for code like: But such code does not follow the specification of compare. |
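The code samples in the comment above are missing from this page. As a purely illustrative sketch (every name here is hypothetical), this is the kind of code that depends on the exact values -1/0/1, next to a version that relies only on the sign:

```fsharp
// Hypothetical comparer that honours the sign contract but returns
// magnitudes other than -1/0/1 (the raw difference of two lengths).
let compareByLength (a: string) (b: string) : int =
    a.Length - b.Length

// Fragile: assumes the result is exactly -1, 0 or 1.
let describeFragile (c: int) : string =
    match c with
    | -1 -> "less"
    | 0 -> "equal"
    | 1 -> "greater"
    | _ -> "unexpected"

// Robust: only inspects the sign of the result.
let describeRobust (c: int) : string =
    if c < 0 then "less"
    elif c = 0 then "equal"
    else "greater"

let c = compareByLength "branchless" "cmp"   // 10 - 3 = 7
printfn "fragile: %s, robust: %s" (describeFragile c) (describeRobust c)
```

The fragile version falls through to its catch-all case for any comparer that returns a wider range, which is the implementation-level break being discussed.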
@@ -0,0 +1,381 @@ | |||
namespace FSharp.Compiler.UnitTests |
It's fine in this PR, but for the future, we suggest creating new tests in FSharp.Compiler.ComponentTests. It has nicer APIs and better support for IL baseline tests (instead of having inline IL, the baseline can be a separate file, which does not require recompiling the suite when it changes, and it is easier to regenerate using the env variable).
There are a few tests that compare generated IL to expected IL, and they fail due to the new code generation. How do you usually regenerate the expected IL? |
So, if they are inline (i.e. the IL itself is the string in the test case itself), then, unfortunately, only one-by-one.
// on Windows
set TEST_UPDATE_BSL=1
// or, on macOS/Linux
export TEST_UPDATE_BSL=1
// and then
build.cmd -testCoreClr
// or
build.sh --testcoreclr
(https://github.com/dotnet/fsharp/blob/main/DEVGUIDE.md#updating-baselines-in-tests) |
There are a lot of culture-dependent tests that fail on my French machine. Fixed it. |
Some of the builds fail, but it doesn't seem related to the change. Any idea? |
The latest run failed because of the tests: |
These are baseline tests, but they don't seem to be updated with the instructions above... |
I've updated them and updated the guide. If I may ask, which shell were you using? |
Great, now there are even more failures... I will take a look. |
I was using PowerShell, so I used |
Hm, interesting. I will probably add a separate switch to the build scripts which runs tests and updates baselines. |
@thinkbeforecoding I have updated 2 remaining baselines for net472 framework + updated devguide describing how it works. The rest of the tests which are failing are checking the compare result. |
When I used the command it said: |
There are still some cancelled builds... strange |
These canceled builds again. |
Nice, looks good. Thanks
Yeah, some infra hiccups, I suppose. |
@dsyme , I think this just needs your review, but should be good. |
@dsyme, this is good to go. Do you want to review your change request? |
@thinkbeforecoding Thank you so much for this improvement! |
Wohooo! It landed. Awesome work! |
NOTE: as of .NET 7.0 (or was it 8.0), the JIT is expected to use branchless instructions even when F#/C# emits branches (it recognizes cmov-like idioms) |
Yes I've noted this in some tests in fasmi. I think it's in dotnet 8.0 |
Sure, I just wanted to note that it's better when some sub-optimal codegen is filed against JIT rather than silently (for JIT team) fixed 🙂 |
This is an implementation of #13098
It implements a branchless compare:
cgt x y - clt x y
When x > y: 1 - 0 = 1
When x < y: 0 - 1 = -1
When x = y: 0 - 0 = 0
Benchmarks show that it is very slightly slower (10%) when branch predictions are always correct, but it is 3x faster on random values, where predictions fail more often.
The code emitted by the JIT is the same size.
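The `cgt x y - clt x y` identity can be modelled in ordinary F# as a minimal sketch. `branchlessCompare` is a hypothetical helper, not the code from this PR. It reproduces only the arithmetic; whether the JIT emits branch-free machine code for it is not guaranteed.

```fsharp
// Hypothetical helper: models `cgt x y - clt x y` in ordinary F#.
// Each comparison yields a bool, converted to 1 (true) or 0 (false).
let branchlessCompare (x: int) (y: int) : int =
    System.Convert.ToInt32(x > y) - System.Convert.ToInt32(x < y)

// The three cases from the description, plus the extremes of the int range.
let cases =
    [ (5, 3); (3, 5); (4, 4); (System.Int32.MinValue, System.Int32.MaxValue) ]

for (x, y) in cases do
    // Compare against the sign of the built-in `compare` as a sanity check.
    let expected = sign (compare x y)
    let actual = branchlessCompare x y
    printfn "compare %d %d -> %d (built-in sign: %d)" x y actual expected
```

Because each term is either 0 or 1, the difference can only be -1, 0 or 1. Unlike a subtraction of the operands themselves, it cannot overflow.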