Fix Frecpe_S/V and Frsqrte_S/V (full FP emu.). Add Sse Opt. & SoftFloat Impl. for Fcmeq/ge/gt/le/lt_S/V (Reg & Zero), Faddp_S/V, Fmaxp_V, Fminp_V Inst.; add Sse Opt. for Shll_V, S/Ushll_V Inst.; improve Sse Opt. for Xtn_V Inst.. Add Tests. #543

Merged
merged 21 commits into Ryujinx:master on Dec 26, 2018

Contributor

LDj3SNuD commented Dec 14, 2018

Added Fast (Intrinsics) & Slow (SoftFloat) paths:

InstEmitSimdCmp:

Fcmeq, Fcmge, Fcmgt, Fcmle, Fcmlt (Scalar & Vector, Reg & Zero):

  • Max Sse Req.: Sse2
  • Types Covered: All

InstEmitSimdArithmetic:

Faddp_S:

  • Max Sse Req.: Sse3
  • Types Covered: All

Faddp_V;
Fmaxp_V, Fminp_V:

  • Max Sse Req.: Sse2
  • Types Covered: All
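
For reference, the pairwise semantics that both the fast and slow paths have to reproduce can be sketched in scalar C#. This is an illustrative model only, not the code added in this PR; `PairwiseSketch` and `Pairwise` are hypothetical names. The two source vectors are treated as one concatenated sequence, and each result element combines one adjacent pair.

```csharp
using System;

static class PairwiseSketch
{
    // Hypothetical reference model of a pairwise vector op (Faddp_V, Fmaxp_V, Fminp_V):
    // the low half of the result comes from adjacent pairs of n, the high half from m.
    public static float[] Pairwise(float[] n, float[] m, Func<float, float, float> op)
    {
        if (n.Length != m.Length || n.Length % 2 != 0)
        {
            throw new ArgumentException("Vectors must have the same, even length.");
        }

        int half = n.Length / 2;

        float[] result = new float[n.Length];

        for (int i = 0; i < half; i++)
        {
            result[i]        = op(n[2 * i], n[2 * i + 1]);
            result[half + i] = op(m[2 * i], m[2 * i + 1]);
        }

        return result;
    }
}
```

Calling `Pairwise(n, m, (a, b) => a + b)` models Faddp_V. NaN and signed-zero handling for the max/min variants is deliberately not modelled here.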

Added Fast (Intrinsics) paths:

InstEmitSimdShift:

Shll_V;
Sshll_V, Ushll_V:

  • Max Sse Req.: Sse41
  • Types Covered: All
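
As a rough illustration of what these three instructions compute, here is a scalar C# model for 8-bit source elements. It is a sketch under assumed names (`WideningShiftSketch` and its methods are hypothetical), not the intrinsics path from this PR. The Sse41 requirement is consistent with the SSE4.1 widening conversions (PMOVZX/PMOVSX), though whether the PR uses exactly those is an assumption.

```csharp
using System;

static class WideningShiftSketch
{
    // Hypothetical model of Ushll_V on 8-bit elements: zero-extend to 16 bits, then shift left.
    public static ushort[] Ushll(byte[] src, int shift)
    {
        ushort[] result = new ushort[src.Length];

        for (int i = 0; i < src.Length; i++)
        {
            result[i] = (ushort)(src[i] << shift);
        }

        return result;
    }

    // Hypothetical model of Sshll_V on 8-bit elements: sign-extend to 16 bits, then shift left.
    public static short[] Sshll(sbyte[] src, int shift)
    {
        short[] result = new short[src.Length];

        for (int i = 0; i < src.Length; i++)
        {
            result[i] = (short)(src[i] << shift);
        }

        return result;
    }

    // Shll_V shifts by the source element size (8 here), so the extension bits never matter.
    public static ushort[] Shll(byte[] src)
    {
        return Ushll(src, 8);
    }
}
```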

InstEmitSimdMove:

Xtn_V (improved):

  • Max Sse Req.: Ssse3
  • Types Covered: All

Tests provided for all instructions involved.

Nits.

@AcK77 added the cpu label (Related to ARMeilleure) on Dec 14, 2018

LDj3SNuD commented Dec 18, 2018

InstEmitSimdArithmetic:

Fixed Frecpe_S/V and Frsqrte_S/V (full FP emu.).

Tests provided for all instructions involved.

Nits.

@LDj3SNuD changed the title from "Add Sse Opt. & SoftFloat Impl. for Fcmeq/ge/gt/le/lt_S/V (Reg & Zero), Faddp_S/V, Fmaxp_V, Fminp_V Inst.; add Sse Opt. for Shll_V, S/Ushll_V Inst.; improve Sse Opt. for Xtn_V Inst.. Add Tests." to "Fix Frecpe_S/V and Frsqrte_S/V (full FP emu.). Add Sse Opt. & SoftFloat Impl. for Fcmeq/ge/gt/le/lt_S/V (Reg & Zero), Faddp_S/V, Fmaxp_V, Fminp_V Inst.; add Sse Opt. for Shll_V, S/Ushll_V Inst.; improve Sse Opt. for Xtn_V Inst.. Add Tests." on Dec 18, 2018

LDj3SNuD commented Dec 19, 2018

Added Fast (Intrinsics) & Slow (SoftFloat) paths:

InstEmitSimdArithmetic:

Fabd_S;
Fabd_V:

  • Max Sse Req.: Sse2
  • Types Covered: All

Tests provided for all instructions involved.

LDj3SNuD commented Dec 20, 2018

Added Fast (Intrinsics) paths:

InstEmitSimdArithmetic:

Frecpe_S, Frsqrte_S;
Frecpe_V, Frsqrte_V:

  • Max Sse Req.: Sse
  • Types Covered: S

Re-tested all instructions involved.

@@ -2249,6 +2629,11 @@ private static double FPNeg(this double value)
    return -value;
}

private static double ZerosOrOnes(bool zeros)
{
    return BitConverter.Int64BitsToDouble(!zeros ? 0L : -1L);
Member

It's a bit weird to me that a function called ZerosOrOnes, with a zeros argument, returns zero when the argument is false rather than true.

Contributor Author

Indeed; I will take this into account in a later PR.
Thanks.
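
One possible shape for that follow-up, sketched with hypothetical names (`MaskSketch`, `AllOnesIf`, `CompareEqual`) that are not taken from the PR: name the flag after the value it selects, so a true argument yields the all-ones bit pattern a passing compare writes and a false argument yields zero.

```csharp
using System;

static class MaskSketch
{
    // Hypothetical rename: true selects the all-ones bit pattern, false selects all zeros.
    public static double AllOnesIf(bool condition)
    {
        return BitConverter.Int64BitsToDouble(condition ? -1L : 0L);
    }

    // Fcmeq-style scalar compare on doubles: all ones when equal, zero otherwise.
    // A NaN operand never compares equal, so the result is zero in that case.
    public static double CompareEqual(double a, double b)
    {
        return AllOnesIf(a == b);
    }
}
```

The all-ones pattern is itself a NaN when read as a double, so callers should inspect it with `BitConverter.DoubleToInt64Bits` rather than compare it as a floating-point value.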

Member

@gdkchan gdkchan left a comment

Apart from this single comment, lgtm. Once again, thanks for the excellent work fixing and optimizing those cpu instructions.

@gdkchan gdkchan merged commit 0f5b6df into Ryujinx:master Dec 26, 2018