Configurable floating-point behavior (Fast Math) #12753
Comments
hm... so since dotnet/coreclr#9369 was not merged
Also, not sure it's related to /fp:fast but:
Console.WriteLine(MathF.Asinh(0.48549962f));
Console.WriteLine(MathF.Acos (0.57316583f));
Console.WriteLine(MathF.Log2 (1 / 3.0f));
Output on Windows:
Output on macOS:
Both OSs have |
Also CC. @CarolEidt who I've talked with about this before. |
I think, in general, it would be good to expose something like this long term. Developers have varying needs and sometimes precision is desired (and should be the default) and sometimes speed is desired instead. I think in general, it should be easy enough to allow optimizations at a per-method level and that for a method users should be able to impact how |
I had thought we already had an issue along these lines, but I can't find it, so it probably doesn't yet exist.
This is the tricky design issue, and not just for inlining but whether and how these decisions are made across methods, classes, assemblies, etc. I think a reasonable position can be taken and supported, but it will certainly require some design and discussion. |
As far as I understand, currently Math/MathF native code is compiled with /fp:fast (only on Windows), whereas pure C# code behaves more like /fp:precise.
This could be a problem for, e.g., developers of games (or backends for them) who want all floating-point operations to be 100% repeatable on all clients/hardware, or across different software compiled in /fp:precise mode. One example is online games with client-side physics.
However, most users don't care about this and could gain additional performance from /fp:fast for both native and managed code.
A /fp:fast mode for C# could allow us to apply the following optimizations in the JIT (inspired by LLVM):
1) a * b + c to fmadd
This could be done in a single fmadd instruction (see https://github.com/dotnet/coreclr/issues/17541) instead of mul + add. There are lots of places in the BCL where it can be inserted, especially around System.Numerics.*
Benchmark: (Coffee Lake i7 8700K)
2) a / c to a * (1 / c)
See dotnet/coreclr#24584, which currently works only for power-of-two constants but could handle any constant in the fast math mode.
Benchmark:
3) Comparisons and ternary operations
These could generate a single vmaxss instruction and in general be less strict around +0.0/-0.0/NaN.
Related: dotnet/coreclr#22965 and dotnet/coreclr#16306
4) a - b - a to -b
5) (a * b) + (a * c) to a * (b + c)
Benchmark:
6) a * a * a * a to two vmulss
This could be done in two vmulss instructions instead of three.
7) Combinations of Math calls
This one doesn't really look useful, but there can be more; need to check.
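To make it concrete why rewrites 1), 2) and 5) need an opt-in mode, here is a minimal C# sketch that writes each "before" and "after" form by hand, so the two results can be compared on today's runtime. The class and method names are made up for illustration. It uses double rather than float only to make the rounding easy to reason about, and it assumes .NET Core 3.0 or later for Math.FusedMultiplyAdd.

```csharp
using System;

// Hypothetical helper class: each pair is the expression as written in the
// list above and the form a fast-math JIT might rewrite it into.
public static class FastMathRewrites
{
    // Item 1: separate multiply and add round twice; a fused multiply-add rounds once.
    public static double MulThenAdd(double a, double b, double c) => a * b + c;
    public static double FusedMulAdd(double a, double b, double c) => Math.FusedMultiplyAdd(a, b, c);

    // Item 2: division versus multiplication by the reciprocal.
    public static double Divide(double a, double c) => a / c;
    public static double MulByReciprocal(double a, double c) => a * (1.0 / c);

    // Item 5: the expanded form versus the factored form.
    public static double Expanded(double a, double b, double c) => (a * b) + (a * c);
    public static double Factored(double a, double b, double c) => a * (b + c);
}
```

Some inputs where the pairs disagree under IEEE 754 double arithmetic: with a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, MulThenAdd returns 0 while FusedMulAdd returns -2^-54; Divide(49.0, 49.0) is exactly 1 while MulByReciprocal(49.0, 49.0) is 1 - 2^-53; and Expanded(3.0, 9007199254740992.0, 1.0) is larger than Factored with the same arguments by 4. The differences are tiny, but they are exactly what breaks bit-for-bit repeatability.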
See godbolt and sharplab playgrounds to compare all of these cases between .NET Core and clang/LLVM.
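Rewrites 3) and 4) are less about rounding and more about special values. The sketch below (hypothetical names, only standard System APIs) shows what "less strict around +0.0/-0.0/NaN" means in practice: a ternary max depends on operand order, and a - b - a is not -b once large magnitudes or infinities are involved.

```csharp
using System;

// Hypothetical helper class for illustration only.
public static class FastMathSpecialValues
{
    // Item 3: a ternary "max". Every ordered comparison with NaN is false and
    // +0.0 compares equal to -0.0, so the result depends on operand order.
    public static float TernaryMax(float a, float b) => a > b ? a : b;

    // Item 4: the expression as written versus its algebraic simplification.
    public static double SubSubLiteral(double a, double b) => a - b - a;
    public static double SubSubFolded(double a, double b) => -b;

    // True when x is a zero with the sign bit set (1 / -0.0 is -Infinity).
    public static bool IsNegativeZero(float x) => x == 0.0f && float.IsNegativeInfinity(1.0f / x);
}
```

For example, TernaryMax(float.NaN, 1f) returns 1 but TernaryMax(1f, float.NaN) returns NaN, whereas MathF.Max returns NaN in both orders. Likewise SubSubLiteral(1e20, 1.0) is 0 because the 1.0 is absorbed by the first subtraction, while SubSubFolded gives -1; with a = double.PositiveInfinity the literal form is NaN.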
So we could have 3 modes: precise, mixed (current) and fast, set via some env variable, e.g. COMPlus_FpMode=fast, or via an attribute [FloatingPointMode(FloatingPointModeOptions.Fast)] to be able to set the mode per method. Also, a runtime constant e.g.
The only problem: we would need a second version of the Math internal calls compiled in /fp:precise mode. If that's a problem, then two modes: mixed and fast.
PS: Since Mono has an LLVM back-end, all of these optimizations can easily be turned on there (but only globally):
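Coming back to the per-method attribute and the env variable proposed above: neither exists in the runtime today, so the following is only a user-level sketch of what the declaration and the mode lookup might look like. The type names and COMPlus_FpMode come from this proposal. Everything else is an assumption made for illustration, namely the Resolve helper, the rule that the attribute wins over the env variable, and the fallback to mixed. The JIT does not look at any of this.

```csharp
using System;
using System.Reflection;

// Hypothetical: the three modes suggested in this issue.
public enum FloatingPointModeOptions { Precise, Mixed, Fast }

// Hypothetical attribute; declared here as an ordinary user type.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public sealed class FloatingPointModeAttribute : Attribute
{
    public FloatingPointModeAttribute(FloatingPointModeOptions mode) => Mode = mode;
    public FloatingPointModeOptions Mode { get; }
}

public static class FpModeSketch
{
    [FloatingPointMode(FloatingPointModeOptions.Fast)]
    public static float Dot(float ax, float ay, float bx, float by) => ax * bx + ay * by;

    public static float PlainDot(float ax, float ay, float bx, float by) => ax * bx + ay * by;

    // Assumed precedence: per-method attribute, then the env value, then Mixed.
    public static FloatingPointModeOptions Resolve(MethodInfo method, string envValue)
    {
        var attr = method.GetCustomAttribute<FloatingPointModeAttribute>();
        if (attr != null) return attr.Mode;

        if (Enum.TryParse(envValue, true, out FloatingPointModeOptions parsed)
            && Enum.IsDefined(typeof(FloatingPointModeOptions), parsed))
        {
            return parsed;
        }
        return FloatingPointModeOptions.Mixed; // current behavior
    }
}
```

A caller would pass Environment.GetEnvironmentVariable("COMPlus_FpMode") as envValue. An unset or unrecognized value falls back to mixed in this sketch.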
/cc: @tannergooding @mikedn
category:proposal
theme:floating-point
skill-level:expert
cost:extra-large