Adding single-precision math functions. #5492

tannergooding · 2016-06-04T02:41:08Z

Summary

This PR implements dotnet/corefx#1151 by providing scalar single-precision floating-point support for many of the trigonometric, logarithmic, and other common mathematical functions.

Worklist

Provide single-precision math functions in the PAL layer
Provide single-precision math tests in the PAL layer
Provide single-precision math functions in COMSingle for the FCALL hookups
Provide single-precision math functions in mscorlib#System.MathF for the managed layer
Provide the appropriate definitions in the ecalllist for the VM layer
Provide the appropriate intrinsic implementations for specific single-precision math functions
Provide a set of unit tests over the new single-precision math APIs
Provide a set of performance tests over the new single-precision math APIs

New APIs

The new APIs provide feature parity with the existing double-precision math functions provided by the framework.

public static class BitConverter
{
    public static float Int32BitsToSingle(int value);
    public static int SingleToInt32Bits(float value);
}

public static partial class MathF
{
    public const float PI = 3.14159265f;
    public const float E = 2.71828183f;

    public static float Abs(float x);
    public static float Acos(float x);
    public static float Asin(float x);
    public static float Atan(float x);
    public static float Atan2(float y, float x);
    public static float Ceiling(float x);
    public static float Cos(float x);
    public static float Cosh(float x);
    public static float Exp(float x);
    public static float Floor(float x);
    public static float IEEERemainder(float x, float y);
    public static float Log(float x);
    public static float Log(float x, float y);
    public static float Log10(float x);
    public static float Max(float x, float y);
    public static float Min(float x, float y);
    public static float Pow(float x, float y);
    public static float Round(float x);
    public static float Round(float x, int digits);
    public static float Round(float x, int digits, System.MidpointRounding mode);
    public static float Round(float x, System.MidpointRounding mode);
    public static int Sign(float x);
    public static float Sin(float x);
    public static float Sinh(float x);
    public static float Sqrt(float x);
    public static float Tan(float x);
    public static float Tanh(float x);
    public static float Truncate(float x);
}
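The two BitConverter additions reinterpret the bits of a value rather than converting it numerically. A minimal sketch of the same idea in C follows; the function names are hypothetical analogues, not the runtime's implementation, and the sketch assumes `float` is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical C analogue of BitConverter.SingleToInt32Bits: copies the
   32 bits of a float into an int32_t without any numeric conversion. */
static int32_t single_to_int32_bits(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* avoids a pointer-cast aliasing violation */
    return bits;
}

/* Hypothetical C analogue of BitConverter.Int32BitsToSingle. */
static float int32_bits_to_single(int32_t value)
{
    float result;
    memcpy(&result, &value, sizeof result);
    return result;
}
```

Under that assumption, `single_to_int32_bits(1.0f)` yields `0x3F800000`, and passing the result back through `int32_bits_to_single` returns the original value.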

Perf Numbers

All performance tests are implemented as follows:

100,000 iterations are executed
The times of all iterations are summed to compute the Total Time
The times of all iterations are averaged to compute the Average Time
A single iteration executes some simple operation, using the function under test, 5000 times

The execution time below is the Total Time for all 100,000 iterations, measured in seconds.
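The methodology above can be sketched roughly as follows. This is not the PR's actual test harness: `measure_total_seconds`, `unary_op`, and the `halve` stand-in operation are hypothetical names, and `clock()` is used only because it is available in standard C.

```c
#include <stddef.h>
#include <time.h>

/* Signature of the single-precision operation being measured. */
typedef float (*unary_op)(float);

/* Trivial stand-in operation so the sketch needs no math library. */
static float halve(float x)
{
    return x * 0.5f;
}

/* Times `iterations` iterations, each of which calls `op`
   `calls_per_iteration` times. Returns the Total Time in seconds and,
   if requested, stores the Average Time per iteration. */
static double measure_total_seconds(unary_op op, size_t iterations,
                                    size_t calls_per_iteration,
                                    double *average_seconds)
{
    volatile float sink = 0.0f;   /* keeps the calls from being optimized away */
    double total = 0.0;

    for (size_t i = 0; i < iterations; i++) {
        clock_t start = clock();
        for (size_t j = 0; j < calls_per_iteration; j++) {
            sink = op((float)j);
        }
        total += (double)(clock() - start) / CLOCKS_PER_SEC;
    }
    (void)sink;

    if (average_seconds != NULL && iterations > 0) {
        *average_seconds = total / (double)iterations;
    }
    return total;
}
```

With the numbers quoted above, a run would correspond to `measure_total_seconds(op, 100000, 5000, &average)`.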

Hardware: Desktop w/ 3.7GHz Quad-Core A10-7850K (AMD) and 16GB RAM

Function	Improvement	Execution Time - Double	Execution Time - Single
Abs	0.199243555%	0.63752649s	0.63625626s
Acos	12.30220910%	11.5265412s	10.1085220s
Asin	18.66801808%	11.9472425s	9.71692911s
Atan	21.10350002%	10.9964683s	8.67582861s
Atan2	20.51327307%	24.3328097s	19.3413540s
Ceiling	12.91487191%	1.87116459s	1.62950608s
Cos	5.026665542%	7.19916547s	6.83728750s
Cosh	16.46166555%	13.5416170s	11.3124413s
Exp	33.67586387%	6.65578424s	4.41439140s
Floor	10.39208688%	1.74655247s	1.56504922s
Log	19.81117664%	6.42244806s	5.15008553s
Log10	18.40605725%	6.75118866s	5.50856101s
Pow	47.85595440%	31.8820155s	16.6245727s
Round	0.976398142%	4.22620632s	4.18494172s
Sin	15.49539339%	5.98022268s	5.05356365s
Sinh	17.96609899%	14.6242270s	11.9968239s
Sqrt	4.676516651%	2.51281945s	2.39530703s
Tan	30.33470555%	9.07290178s	6.32066374s
Tanh	0.108182099%	8.12724112s	8.11844890s
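The Improvement column appears to be the relative reduction in execution time, (double - single) / double, expressed as a percentage; the Abs, Exp, and Pow rows are consistent with that reading. A small sketch, with `improvement_percent` as a hypothetical helper name:

```c
/* Relative reduction in execution time of the single-precision run
   compared with the double-precision run, as a percentage. */
static double improvement_percent(double double_seconds, double single_seconds)
{
    return (double_seconds - single_seconds) / double_seconds * 100.0;
}
```

For the Pow row, `improvement_percent(31.8820155, 16.6245727)` comes out at roughly 47.86.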

I believe some extra perf will be squeezed out when the intrinsics (such as CORINFO_INTRINSIC_Sqrt) are properly implemented in the VM layer for single-precision values. Without such functionality, it falls back to the double-precision functionality (extra precision, reduced performance) for certain calls.

tannergooding · 2016-09-28T04:34:06Z

src/mscorlib/src/System/BitConverter.cs

@@ -446,6 +446,16 @@ unsafe public static double ToDouble (byte[] value, int startIndex)
        [SecuritySafeCritical]
        public static unsafe double Int64BitsToDouble(long value) {
            return *((double*)&value);
-        }                    


This line is the 'removed' one. Please note that it has not actually been removed 😄

tannergooding · 2016-09-28T05:18:34Z

Is the arm_emulator_cross_release_ubuntu_prtest known to be flaky? http://dotnet-ci.cloudapp.net/job/dotnet_coreclr/job/master/job/arm_emulator_cross_release_ubuntu_prtest/459 had JIT/Regression/CLR-x86-JIT/V1-M12-Beta2/b80045/b80045 fail, but the test code doesn't look at all related to the changes made here.

tannergooding · 2016-09-28T07:10:29Z

Pretty much all of the intrinsic hookup work, as I understand it, is in the valuenum.cpp file. However, there does not appear to be much (if any) documentation on the ValueNumStore.

It would be really great if I could get an overview of the functionality here (or be pointed to the documentation). My main concern here is ensuring any changes don't break back-compat.

As I understand it, the compiler will end up determining that some operation is intrinsic, at which point it will attempt to break apart and evaluate the function.

When evaluating the function, if the arguments are constant, there is folding that occurs (provided precision loss is not a concern).

For functions which do not have constant arguments, they do a lookup to see if the value has already been computed (by checking if it exists in the Value Number store). If the value exists, that is returned; otherwise, a new chunk is created which executes the function defined next to the intrinsic in the external call list.

It appears as though the ValueNumStore is keyed off the function id (so VNF_Acos for example, is the function id for CORINFO_INTRINSIC_Acos) and a value number which identifies the expression. So, while two implementations can share an intrinsic (such as Math.Acos and MathF.Acos both sharing CORINFO_INTRINSIC_Acos), it is important that the function ids be unique (so we should have VNF_Acos and VNF_AcosF, for example).

Is this roughly accurate?

tannergooding · 2016-09-28T07:13:34Z

@mellinoe, could you direct me towards the appropriate people to answer the above question (asking you since you were the last one assigned to the proposal)?

tannergooding · 2016-09-28T07:29:57Z

test Linux ARM Emulator Cross Debug Build please
Build timed out: http://dotnet-ci.cloudapp.net/job/dotnet_coreclr/job/master/job/arm_emulator_cross_debug_ubuntu_prtest/472

mellinoe · 2016-09-28T17:15:41Z

I believe that the ARM Emulator runs have been flaky in the past, but I don't know the current status of them. @jkotas , could you help with some of the other questions above?

jkotas · 2016-09-28T17:24:59Z

cc @dotnet/jit-contrib for valuenum.cpp questions
cc @janvorli for runtime and PAL part

It may be a good idea to add the methods in one PR, and do the JIT optimizations in a follow-up PR.

tannergooding · 2016-09-28T17:46:16Z

@jkotas, if that is possible that would be fine with me. It would also allow me to add the Perf tests in a separate PR, as they will be dependent on updating System.Runtime.Extensions with the new method contracts.

CarolEidt · 2016-09-28T19:02:57Z

We currently don't have documentation for value numbering. @briansull @JosephTremoulet @erozenfeld would be good candidates to look into the value numbering implications.

tannergooding · 2016-09-28T23:47:53Z

test Linux ARM Emulator Cross Debug Build please
segfault in JIT/Regression/CLR-x86-JIT/V1-M11-Beta1/b36332/b36332: http://dotnet-ci.cloudapp.net/job/dotnet_coreclr/job/master/job/arm_emulator_cross_debug_ubuntu_prtest/476/

JosephTremoulet · 2016-09-29T01:44:19Z

> When evaluating the function, if the arguments are constant, there is folding that occurs (provided precision loss is not a concern).
>
> For functions which do not have constant arguments, they do a lookup to see if the value has already been computed (by checking if it exists in the Value Number store). If the value exists, that is returned; otherwise, a new chunk is created which executes the function defined next to the intrinsic in the external call list.
>
> It appears as though the ValueNumStore is keyed off the function id (so VNF_Acos for example, is the function id for CORINFO_INTRINSIC_Acos) and a value number which identifies the expression.

That all sounds right to me.

> while two implementations can share an intrinsic (such as Math.Acos and MathF.Acos both sharing CORINFO_INTRINSIC_Acos), it is important that the function ids be unique (so we should have VNF_Acos and VNF_AcosF, for example)

I'm not sure that follows; I'd expect the float and double arguments to have different value numbers because of their different types. Have you written tests / looked at IR for cases where you'd be worried about this sort of collision?

tannergooding · 2016-09-29T01:56:18Z

> I'm not sure that follows; I'd expect the float and double arguments to have different value numbers because of their different types. Have you written tests / looked at IR for cases where you'd be worried about this sort of collision?

@JosephTremoulet, I had made this assumption based on the behavior of CORINFO_INTRINSIC_Round, which treats TYP_DOUBLE (VNF_RoundDouble), TYP_FLOAT (VNF_RoundFloat), and TYP_INT (VNF_RoundInt) differently.

If that is not the case, then it certainly makes some things simpler to implement 😄

JosephTremoulet · 2016-09-29T01:59:48Z

> CORINFO_INTRINSIC_Round, which treats TYP_DOUBLE (VNF_RoundDouble), TYP_FLOAT (VNF_RoundFloat), and TYP_INT (VNF_RoundInt) differently.

Ah. Yeah, if we're already doing that for a different intrinsic, I agree with your original assessment, it's best to follow suit 😦.

JosephTremoulet · 2016-09-30T15:55:24Z

@tannergooding , I just took a closer look, and I think it's ok to share VNF_ funcs for overloads. It looks like the argument to round is always a double, and that we're distinguishing its return type with the three different VNF_ enum values (which makes me wonder what source generates that intrinsic, since you can't overload on return type at source...). With the functions you're talking about, the different overloads have different argument types (and return types that differ from the other overloads [but agree with the argument]). So I think the (func ID, argument) pair is unambiguous for these in a way that it would be ambiguous in the round case without distinguishing the IDs. I also double-checked, and we do store each VN's type with it (or really encode it in it) -- so in the methods that need to parse this stuff (e.g. EvalMathFuncUnary), you can both extract the type from the argument valnum and you've explicitly been passed in the result type in the typ parameter. This last point makes me unsure why even the kind of ambiguity that round has would be problematic (since that typ parameter gets passed down into VNForFunc and factors into the hashing), but regardless, it looks to me like the code is expecting to support the type of overloading that you want here.
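The argument above, that a shared function ID stays unambiguous because the argument value number and the result type both take part in the lookup, can be illustrated with a toy model. Everything in this sketch is hypothetical (the `toy_*` names, the linear search, the fixed capacity); it is not the JIT's ValueNumStore, only a simplified picture of a lookup keyed on (type, function ID, argument value number).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the JIT's type and VNF_* enums. */
enum toy_type { TOY_FLOAT, TOY_DOUBLE };
enum toy_func { TOY_FUNC_ACOS, TOY_FUNC_COS };

struct toy_entry {
    enum toy_type typ;    /* result type of the call */
    enum toy_func func;   /* shared function ID */
    int arg_vn;           /* value number of the argument */
    int vn;               /* value number assigned to the call */
};

#define TOY_CAPACITY 64

struct toy_store {
    struct toy_entry entries[TOY_CAPACITY];
    size_t count;
    int next_vn;
};

static void toy_store_init(struct toy_store *store, int first_free_vn)
{
    store->count = 0;
    store->next_vn = first_free_vn;
}

/* Returns the existing value number for (typ, func, arg_vn) if one was
   already recorded, otherwise records and returns a fresh one.
   Returns -1 when the toy store is full. */
static int toy_vn_for_func(struct toy_store *store, enum toy_type typ,
                           enum toy_func func, int arg_vn)
{
    for (size_t i = 0; i < store->count; i++) {
        const struct toy_entry *e = &store->entries[i];
        if (e->typ == typ && e->func == func && e->arg_vn == arg_vn) {
            return e->vn;
        }
    }
    if (store->count == TOY_CAPACITY) {
        return -1;
    }
    struct toy_entry *fresh = &store->entries[store->count++];
    fresh->typ = typ;
    fresh->func = func;
    fresh->arg_vn = arg_vn;
    fresh->vn = store->next_vn++;
    return fresh->vn;
}
```

In this model a float operand and a double operand carry different argument value numbers, so looking up the same `TOY_FUNC_ACOS` for each produces two distinct entries, while repeating either lookup returns the value number recorded the first time.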

janvorli · 2016-09-30T16:59:40Z

src/pal/tests/palsuite/c_runtime/_isnanf/test1/test1.c

+
+    if (!_isnanf(snan))
+    {
+        Fail("_isnanf() failed to identify %I64x as NaN!\n", lsnan);


This is the wrong format; it should be %I32x. There are three other occurrences of this issue below.
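The `%I64x` and `%I32x` size prefixes are Microsoft-style; the point of the comment is that the specifier must match the 32-bit width of the value being printed. As a hedged illustration outside the PAL test framework, the same idea in portable C99 uses `PRIx32` from `<inttypes.h>`. The helper name `format_float_bits` is hypothetical, and the sketch assumes a 32-bit IEEE 754 `float`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes the raw bits of `value` into `buf` as eight lowercase hex digits.
   Returns the number of characters snprintf would have written. */
static int format_float_bits(float value, char *buf, size_t size)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    /* PRIx32 expands to the correct conversion for a 32-bit unsigned value. */
    return snprintf(buf, size, "%08" PRIx32, bits);
}
```

For `1.0f` this produces the string `3f800000`.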

tannergooding · 2016-10-18T05:59:08Z

All tests passing. Is there any other feedback here?

Additionally, should the remaining three work items be completed as part of this PR, or should bugs be logged to track them being completed in separate PRs?

The three remaining work items are:

Provide the appropriate intrinsic implementations for specific single-precision math functions
Provide a set of unit tests over the new single-precision math APIs
Provide a set of performance tests over the new single-precision math APIs

janvorli · 2016-10-18T09:57:34Z

@tannergooding thank you for all this work!

tannergooding · 2016-10-18T10:49:23Z

@janvorli. Thanks for the merge!

I have logged the following bugs to track the remaining three work items (and have self-assigned them for the time being):
https://github.com/dotnet/coreclr/issues/7689
https://github.com/dotnet/coreclr/issues/7690
https://github.com/dotnet/coreclr/issues/7691

mburbea · 2016-10-18T13:59:07Z

Isn't it somewhat problematic to compile with /fp:fast rather than /fp:strict? Does this mean that depending on your hardware you could get different results?

As far as I understand, RyuJIT uses XMM registers to avoid the chance of unpredictable results. I wouldn't mind if we went the Java route and added a StrictMath class, if the perf difference is really worth the change.

mikedn · 2016-10-18T15:43:31Z

@mburbea What exactly do you expect /fp:strict to achieve here? On x64 it shouldn't matter; on x86 you'll end up using the double-precision versions of functions such as sin and cos, which seems exactly the opposite of this change's intent.

janvorli · 2016-10-19T15:46:13Z

@tannergooding we have found that the change breaks NGEN. The issue is that the Abs, Min, Max, and Sign functions for float exist in the Math class too and their native implementation in the runtime is the same, so the linker ends up folding them, which violates the invariant that each method has to have a unique entry point.

So did you know that these methods already exist for float in Math, and add them just to make MathF "complete"?

It seems we can fix the problem in two ways. One is to remove these from the MathF and the other is to keep them, but implement them in the managed code as calls to their Math counterparts.

@KrzysztofCwalina, @jkotas do you have any opinion on those two options?

tannergooding · 2016-10-19T16:18:37Z

I added them to make it complete (and they were part of the proposed API change as such).

I believe we should keep the APIs and have them call their legacy counterpart. I think a user who is working with System.MathF would prefer to have all of their API calls coming from the same location, if possible (it gets confusing having to mix System.MathF and System.Math depending on whether you are calling a new or old API).

tannergooding · 2016-10-19T16:25:23Z

I implemented such a fix here: #7721

Let me know if you opt to go the other route.

* Adding single-precision math functions to floatsingle
* Adding single-precision math functions to the PAL layer.
* Adding single-precision math tests to the PAL layer.
* Adding single-precision math functions to mscorlib.
* Adding single-precision math function support to the vm.
* Updating floatsingle.cpp to define a _isnanf macro for Windows ARM.

ghost · 2018-01-30T17:06:34Z

Good, but why not rewrite the Math class methods to have overloaded versions with float parameters instead of creating a different class? I still don't know where that MathF class is!

tannergooding · 2018-01-30T17:18:50Z

@MohammadHamdyGhanem, because it would be a breaking change for recompiled code.

Math.Sqrt(4) today resolves to Math.Sqrt(double) (because an implicit conversion to double exists, and the only overload is for double). However, if you add a new Math.Sqrt(float) overload, overload resolution comes into play. int can be implicitly converted to either float or double, but float is preferred, so the recompiled code would call Math.Sqrt(float), which can cause an observable difference in results for certain inputs.
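C has no overloading, so the resolution step itself cannot be reproduced here, but the observable difference comes from the implicit conversion of the int argument, and that part can be sketched. The helper names below are hypothetical; they only model the value each candidate overload would receive, assuming a 32-bit IEEE 754 `float` without excess-precision evaluation.

```c
/* Value a hypothetical Sqrt(float) overload would receive for an int argument. */
static float int_as_float_argument(int value)
{
    return (float)value;    /* only 24 bits of precision are available */
}

/* Value the existing Sqrt(double) overload receives for an int argument. */
static double int_as_double_argument(int value)
{
    return (double)value;   /* every 32-bit int is exactly representable */
}
```

For a small input such as 4 both conversions agree, which is why most code would not notice. For 16777217 (2^24 + 1) the float conversion rounds to 16777216, so a float overload would compute the square root of a different number than the double overload does.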

ghost · 2018-01-30T22:02:01Z

So, what about Math.SqrtF instead of MathF.Sqrt?

tannergooding · 2018-01-30T22:36:30Z

See response in the other thread on CoreFX.

dnfclas added the cla-already-signed label Jun 4, 2016

tannergooding mentioned this pull request Sep 28, 2016

New API for Expanded Math Library #710

Closed

14 tasks

tannergooding commented Sep 28, 2016

tannergooding mentioned this pull request Sep 29, 2016

Updating the 'System.Runtime.Extensions' contracts to include the new single-precision Math APIs. dotnet/corefx#12183

Merged

janvorli suggested changes Sep 30, 2016

tannergooding added 7 commits October 17, 2016 20:34

Adding single-precision math functions to the PAL layer.

c0c915e

Adding single-precision math tests to the PAL layer.

041faa8

Adding single-precision math functions to floatsingle

bdfaa93

Adding single-precision math functions to mscorlib.

6274231

Adding single-precision math function support to the vm.

a4427b5

Change 'I64X' to 'I32X' in the new _isnanf tests.

7756fa6

Updating floatsingle.cpp to define a _isnanf macro for Windows ARM.

82f3004

janvorli approved these changes Oct 18, 2016

janvorli merged commit 6057b18 into dotnet:master Oct 18, 2016

tannergooding deleted the math branch October 18, 2016 14:10

rahku mentioned this pull request Oct 18, 2016

_isnanf is not defined for arm64 use isnan instead #7701

Merged

jkotas mentioned this pull request Oct 22, 2016

Issue #832; add BitConverter.SingleToInt32Bits and Int32BitsToSingle #833

Closed

karelz modified the milestone: 2.0.0 Aug 28, 2017

Daniel-Svensson mentioned this pull request Jan 31, 2020

Add generic Math methods for numeric types dotnet/runtime#18244

Closed

tannergooding mentioned this pull request Jan 31, 2020

The new System.MathF APIs should have wrappers created so they are available on .NETStandard dotnet/runtime#20113

Closed
