-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Adding single-precision math functions. #5492
Conversation
@@ -446,6 +446,16 @@ unsafe public static double ToDouble (byte[] value, int startIndex) | |||
[SecuritySafeCritical] | |||
public static unsafe double Int64BitsToDouble(long value) { | |||
return *((double*)&value); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is the 'removed' one. Please note that it has not actually been removed 😄
Is the |
Pretty much all of the intrinsic hookup work, as I understand it, is in the It would be really great if I could get an overview of the functionality here (or pointed to the documentation). My main concern here is ensuring any changes don't break back-compat. As I understand it, the compiler will end up determining that some operation is When evaluating the function, if the arguments are For functions which do not have It appears as though the ValueNumStore is keyed off the function id (so Is this roughly accurate? |
@mellinoe, could you direct me towards the appropriate people to answer the above question (asking you since you were the last one assigned to the proposal)? |
test Linux ARM Emulator Cross Debug Build please |
I believe that the ARM Emulator runs have been flaky in the past, but I don't know the current status of them. @jkotas , could you help with some of the other questions above? |
cc @dotnet/jit-contrib for valuenum.cpp questions I may be a good idea to add the methods in one PR, and do the JIT optimizations in follow up PR. |
@jkotas, if that is possible that would be fine with me. It would also allow me to add the Perf tests in a separate PR, as they will be dependent on updating |
We currently don't have documentation for value numbering. @briansull @JosephTremoulet @erozenfeld would be good candidates to look into the value numbering implications. |
test Linux ARM Emulator Cross Debug Build please |
That all sounds right to me.
I'm not sure that follows; I'd expect the float and double arguments to have different value numbers because of their different types. Have you written tests / looked at IR for cases where you'd be worried about this sort of collision? |
@JosephTremoulet, I had made this assumption based on the behavior of If that is not the case, then it certainly makes some things simpler to implement 😄 |
Ah. Yeah, if we're already doing that for a different intrinsic, I agree with your original assessment, it's best to follow suit 😦. |
@tannergooding , I just took a closer look, and I think it's ok to share |
|
||
if (!_isnanf(snan)) | ||
{ | ||
Fail("_isnanf() failed to identify %I64x as NaN!\n", lsnan); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a wrong format, it should be %I32. There are other three occurences of this issue below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
All tests passing. Is there any other feedback here? Additionally, should the remaining three work items be completed as part of this PR, or should bugs be logged to track them being completed in separate PRs? The three remaining work items are:
|
@tannergooding thank you for all this work! |
@janvorli. Thanks for the merge! I have logged the following bugs to track the remaining three work items (and have self-assigned them for the time being): |
Isn't it somewhat problematic to compile with /fp:fast rather than /fp:struct? Does this mean that depending on your hardware you could get different results? As far as I understand, the RyuJit uses xmm registers to avoid the chance of unpredictable results. I wouldn't mind if we added the java route of a StrictMath class, if the perf difference is really worth the change. |
@mburbea What exactly do you expect |
@tannergooding we have found that the change breaks NGEN. The issue is that Abs, Min, Max and Sign functions for float exists in the Math class too and their native implementation in the runtime is the same, so linker ends up folding those and violates the invariant that each method has to have unique entrypoint. So did you know that these methods already exist for float in the Math and added them just to make the MathF "complete"? It seems we can fix the problem in two ways. One is to remove these from the MathF and the other is to keep them, but implement them in the managed code as calls to their Math counterparts. @KrzysztofCwalina, @jkotas do you have any opinion on those two options? |
I added them to make it complete (and they were part of the proposed API change as such). I believe we should keep the APIs and have them call their legacy counterpart. I think a user who is working with |
I implemented such a fix here: #7721 Let me know if you opt to go the other route. |
* Adding single-precision math functions to floatsingle * Adding single-precision math functions to the PAL layer. * Adding single-precision math tests to the PAL layer. * Adding single-precision math functions to mscorlib. * Adding single-precision math function support to the vm. * Updating floatsingle.cpp to define a _isnanf macro for Windows ARM.
Good, but why not rewriting the Math class methods to have overloaded versions with float parameters instead of creating a different class? I don't know untill now where is that MathF class! |
@MohammadHamdyGhanem, because it would be a breaking change for recompiled code.
|
So, what about Math.SqrtF instead of MathF.Sqrt? |
See response in the other thread on CoreFX. |
Summary
This PR implements dotnet/corefx#1151, by providing scalar single-precision floating-point support for many of the trigonometric, logarithmic, and other common mathematical functions.
Worklist
COMSingle
for the FCALL hookupsmscorlib#System.MathF
for the managed layerecalllist
for the VM layerNew APIs
The new APIs provide feature-parity with the existing double-precision math functions provided by the framework.
Perf Numbers
All performance tests are implemented as follows:
Total Time
Average Time
The execution time below is the
Total Time
for all 100,000 iterations, measured in seconds.Hardware: Desktop w/ 3.7GHz Quad-Core A10-7850K (AMD) and 16GB RAM
I believe some extra perf will be squeezed out when the intrinsics (such as
CORINFO_INTRINSIC_Sqrt
) are properly implemented in the VM layer for single-precision values. Without such functionality, it falls back to the double-precision functionality (extra precision, reduced performance) for certain calls.