System.Math and System.MathF should be implemented in managed code, rather than as FCALLs to the C runtime #9001
As per the title, both
This will ensure:
Some of the functions (such as
Other functions (such as
I do not think that this is necessarily a good idea. It makes the runtime less portable. The C runtime implementations of these functions are a fine default implementation.
If we want to make the implementation better on some platforms, that's ok - but it should not be the requirement for all platforms.
@jkotas, how does it make the runtime "less portable"?
Currently, we are tied to a particular implementation of the C runtime for each platform. This leads to:
I would think providing a managed implementation makes it more portable since it means:
For all of these we should obviously ensure that codegen and perf remain on-par with what we have today.
Some of the functions, such as
Others (such as
The remaining (such as
It does not, strictly speaking, require this. It only likely requires this for producing the most performant code.
It is also, strictly speaking, entirely possible to set the "intrinsic" for these functions on new hardware architectures to be the CRT implementation by default (provided they are IEEE compliant), if the software fallback's performance is considered too poor.
That being said, when using the CRT implementation we already see today that the majority of the functions are considerably slower on some platforms (330% slower in the worst case): https://github.com/dotnet/coreclr/issues/9373.
This proposal gives us a standard baseline of "correctness" in the software implementation against which any hardware-specific improvements can readily be checked. It also allows us to verify that the underlying CRT implementation (if it were to be used as the intrinsic) is compatible with our expectations.
A few places where this might unlock some perf:
As I have said, I am fine with using C# implementation that depends on intrinsics on platforms where we have deeper codegen investments.
The best implementation of Abs for x64 is actually:
For bring-up of new platforms, or platforms with less codegen investment, the C runtime implementation is the default. It may not be as good, and it may behave differently in corner cases - but it is good enough. For example, I do not ever want to have a software fallback for
Wouldn't having a managed version make porting easier? For example, suppose you want to port .NET Core to MIPS: you would not have to worry about porting Math / MathF initially, leaving time to debug other issues. Optimizing Math via hardware intrinsics or calls into the C runtime could be done afterwards.
Or am I missing something?
For porting, you do not have to worry about the C runtime either (porting CoreCLR to a platform without a C runtime is a non-scenario).
And the C runtime functions will be better debugged and have better performance than a software-based fallback written in C# (on platforms without deeper codegen investment).
There are several problems with this proposal, but to me the major one is that it moves the maintenance and bug-fixing burden from the underlying platform to us, for what is a very well-established, well-maintained, and universally understood API.
The stated goal of "consistency across operating systems and platforms" has been troublesome in the past for things like Unicode string collation. Mono emulated the Windows behavior, while .NET Core took the native approach - and I think the approach .NET Core took, using libICU instead of trying to emulate the Windows behavior, was the right one.
Porting to a new platform already requires an advanced libc to be available, or an equivalent. Not only for the managed bridges, but the runtime itself (CoreCLR and Mono) both consume
There is the performance issue as well, which Jan just touched on.
AMD has open sourced a snapshot of the libm implementation that is currently used by Windows x64: https://github.com/amd/win-libm (plus a few improvements that haven't been picked up yet).
This, combined with other open source libm implementations such as the one from ARM: https://github.com/ARM-software/optimized-routines, should allow us to provide a portable implementation that is both IEEE compliant and deterministic.
We could do that by contributing back to the existing libraries and picking an implementation as the "de-facto" standard to pull in, or we could do that by porting the code to C#.
I would lean towards the former (picking an implementation as the "de-facto" standard), but I think the latter has some interesting possibilities as well. Not only would it be able to play with .NET code better (such as being GC aware), but it can also take some optimizations that the standard C code doesn't have available (such as not needing to worry about exception handling or alternative rounding modes, since .NET doesn't currently support those).
Just a comment, after many weeks of frustration.
As someone interested in .NET being a first-class, IEEE-compliant high-performance computing platform, I agree with @tannergooding that we should support a proper libm implementation on x64 and ARM. The "C runtime implementations" @jkotas refers to are not suitable for anything but substandard floating-point performance. The C runtime is not good enough for today's machine learning and IEEE computing needs; it is little more than an all-else-fails fallback option. I have no idea why the .NET team spent all these years building an awesome high-performance cross-platform library only to drop the ball on IEEE and machine learning compute.
How many high-performance compute architectures beyond ARM and x64 do we need to support outside of the C runtime? Zero. Should we have a way for all vendors to add compute libraries? Yes! We only need high-performance compute on those two platforms. Processor makers should be able to add compute libraries as plugins to .NET, rather than us taking a narrow-minded approach to what is good enough.