Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JitArm64: Avoid double rounding in fctiwzx #9016

Merged
merged 1 commit into from Aug 8, 2020

Conversation

JosJuice
Copy link
Member

@JosJuice JosJuice commented Aug 7, 2020

FCVT doesn't necessarily round to zero, so the result might be inaccurate if we use it. To ensure correct rounding, we use FCVTS from double FPR to 32-bit GPR. Unfortunately, FCVTS can't do double FPR to single FPR.

This is my first time making a JitArm64 PR, so apologies if the change doesn't make sense :)

FCVT doesn't necessarily round to zero, so the result
might be inaccurate if we use it. To ensure correct
rounding, we use FCVTS from double FPR to 32-bit GPR.
Unfortunately, FCVTS can't do double FPR to single FPR.
@Tilka
Copy link
Member

Tilka commented Aug 8, 2020

(The reason we cannot use FCVT is not that it doesn't always round-towards-zero. The problem is it only converts between different floating-point formats but fctiwzx converts to integer format.)

LGTM. Please test though.

@JosJuice
Copy link
Member Author

JosJuice commented Aug 8, 2020

That is the reason why we can't use just FCVT, forgoing FCVTS. What I am arguing for is why we shouldn't use FVCT in combination with FCVTS.

I have tested two games that break when intentionally messing up the emitted code for this instruction, and they seem to work fine.

@Tilka Tilka merged commit 3201944 into dolphin-emu:master Aug 8, 2020
10 checks passed
@JosJuice JosJuice deleted the jitarm64-fctiwzx-fcvt branch August 8, 2020 16:56
ARM64Reg V1 = gpr.GetReg();

m_float_emit.FCVTS(V1, EncodeRegToDouble(VB), ROUND_Z);
m_float_emit.FMOV(EncodeRegToSingle(VD), V1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a temp register here?

    m_float_emit.FCVTS(EncodeRegToSingle(VD), EncodeRegToDouble(VB), ROUND_Z);
    m_float_emit.FMOV(EncodeRegToSingle(VD), EncodeRegToSingle(VD));

Copy link
Member

@Tilka Tilka Aug 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that the FCVT*S instruction family supports:

  • f64-to-s32 with the result in a general purpose register
  • f64-to-s64 with the result in a SIMD&FP register

but we need f64-to-s32 and the result in a SIMD&FP register.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I have not noticed that this temporary register was a GPR one, my bad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants