Fix Frecpe_S/V and Frsqrte_S/V (full FP emu.). Add Sse Opt. & SoftFloat Impl. for Fcmeq/ge/gt/le/lt_S/V (Reg & Zero), Faddp_S/V, Fmaxp_V, Fminp_V Inst.; add Sse Opt. for Shll_V, S/Ushll_V Inst.; improve Sse Opt. for Xtn_V Inst.. Add Tests. #543
InstEmitSimdArithmetic: Fixed Frecpe_S/V and Frsqrte_S/V (full FP emu.). Tests provided for all instructions involved. Nits.
Added Fast (Intrinsics) & Slow (SoftFloat) paths: InstEmitSimdArithmetic: Fabd_S;
Tests provided for all instructions involved.
Added Fast (Intrinsics) paths: InstEmitSimdArithmetic: Frecpe_S, Frsqrte_S;
Re-tested all instructions involved.
@@ -2249,6 +2629,11 @@ private static double FPNeg(this double value)
        return -value;
    }

    private static double ZerosOrOnes(bool zeros)
    {
        return BitConverter.Int64BitsToDouble(!zeros ? 0L : -1L);
It's a bit weird to me that a function called ZerosOrOnes, with a zeros argument, returns zero when the argument is false rather than when it is true.
Indeed; I will take this into account in a later PR.
Thanks.
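To make the naming point concrete, here is a minimal sketch of the mask convention involved. AArch64 floating-point compares such as FCMEQ set every bit of a result element when the comparison holds and clear every bit otherwise. `MaskSketch`, `MaskFor` and its `condition` parameter are hypothetical names used only for illustration; they are not part of this PR, and this is not necessarily how a later change would rename the helper.

```csharp
using System;

static class MaskSketch
{
    // Hypothetical alternative: the parameter name states what "true" means.
    // Returns the all-ones bit pattern when the comparison holds,
    // and the all-zeros bit pattern (+0.0) otherwise.
    public static double MaskFor(bool condition)
    {
        return BitConverter.Int64BitsToDouble(condition ? -1L : 0L);
    }

    // Same behaviour as the helper shown in the diff, kept for comparison:
    // zeros == false yields all zeros, zeros == true yields all ones.
    public static double ZerosOrOnes(bool zeros)
    {
        return BitConverter.Int64BitsToDouble(!zeros ? 0L : -1L);
    }
}
```

Both methods return the same bits for the same input; only the parameter name differs, which is the source of the confusion raised in the review.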
Apart from this single comment, lgtm. Once again, thanks for the excellent work fixing and optimizing those cpu instructions.
Added Fast (Intrinsics) & Slow (SoftFloat) paths:
    InstEmitSimdCmp:
        Fcmeq, Fcmge, Fcmgt, Fcmle, Fcmlt (Scalar & Vector, Reg & Zero);
    InstEmitSimdArithmetic:
        Faddp_S;
        Faddp_V;
        Fmaxp_V, Fminp_V.
Added Fast (Intrinsics) paths:
    InstEmitSimdShift:
        Shll_V;
        Sshll_V, Ushll_V;
    InstEmitSimdMove:
        Xtn_V (improved).
Tests provided for all instructions involved.
Nits.
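For readers unfamiliar with the pairwise instructions listed above, the following is a rough reference model of what the vector form of FADDP computes, written as a sketch rather than as the code in this PR. It concatenates the elements of the first source after which come those of the second, then adds each adjacent pair. `PairwiseSketch` and `Faddp` are hypothetical names, and plain `float[]` arrays stand in for the SIMD registers.

```csharp
using System;

static class PairwiseSketch
{
    // Hypothetical reference model of a pairwise floating-point add.
    // n and m are the two source vectors; they must have the same even length.
    public static float[] Faddp(float[] n, float[] m)
    {
        if (n.Length != m.Length || n.Length % 2 != 0)
        {
            throw new ArgumentException("Sources must have the same even length.");
        }

        int count = n.Length;

        // Elements of n occupy the low half of the concatenation, m the high half.
        float[] concat = new float[count * 2];
        Array.Copy(n, 0, concat, 0, count);
        Array.Copy(m, 0, concat, count, count);

        // Each result element is the sum of one adjacent pair.
        float[] result = new float[count];
        for (int i = 0; i < count; i++)
        {
            result[i] = concat[2 * i] + concat[2 * i + 1];
        }

        return result;
    }
}
```

With four single-precision elements per source, the result is laid out as n0+n1, n2+n3, m0+m1, m2+m3, which is the same lane order the SSE3 HADDPS instruction produces.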