Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arm64Emitter: Simplify LogicalImm logic #10807

Merged
merged 4 commits into from Jul 10, 2022
Merged

Conversation

merryhime
Copy link
Contributor

@merryhime merryhime commented Jul 3, 2022

Simplify logical immediate encoding logic.

This is based on the observation that if a valid repeating element exists, it repeats through value. Thus it does not matter which one you analyse. Thus we skip over the least significant element if LSB = 1 by masking it out with inverse_mask_from_trailing_ones, to avoid the degenerate case of a stretch of 1 bits going 'round the end' of the word.

// it.
n = out_n;
const u64 inverse_mask_from_trailing_ones = ~value | (value + 1);
const size_t rotation = Common::LeastSignificantSetBit(value & inverse_mask_from_trailing_ones);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LeastSignificantSetBit(x) can just return __builtin_ctzll(x) - the result of which is undefined if x is 0, which it can be in this case. Maybe add a Common::CountTrailingZeros matching the zero-handling of Common::CountLeadingZeros?

I'd usually clear the trailing ones with just value & (value + 1), but otherwise this looks good.

(In case it's of interest, I wrote a blog post on optimising logical immediate encoding a while ago, which also removes the initial loop - it'd probably look something like this.)

Copy link
Contributor Author

@merryhime merryhime Jul 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! Why am I not surprised this isn't a novel insight! Neat!

I intuitively knew you could mask trailing ones again after rotation to determine z+o, but the observation that verifying rotation also ensured esize was a power of two wasn't obvious to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yeah, I don't know if that one's obvious to anyone. (I'd consider adding an explicit "is power of two" check to spare future people who stumble upon this code from having to understand that, but it feels silly to add code that I know is redundant.)

Heavily simplify logical immediate encoding.

This is based on the observation that if a valid repeating element
exists, it repeats through `value`. Thus it does not matter which
one you analyse. Thus we skip over the least significent element
if LSB = 1 by masking it out with `inverse_mask_from_trailing_ones`,
to avoid the degenerate case of a stretch of 1 bits going 'round
the end' of the word.
Copy link
Member

@JosJuice JosJuice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the intermediate version of the algorithm (the one in the first commit), but the final version looks good. Just some style nits.

Source/UnitTests/Core/PowerPC/JitArm64/MovI2R.cpp Outdated Show resolved Hide resolved
Source/UnitTests/Core/PowerPC/JitArm64/MovI2R.cpp Outdated Show resolved Hide resolved
Source/UnitTests/Core/PowerPC/JitArm64/MovI2R.cpp Outdated Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.h Outdated Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.h Outdated Show resolved Hide resolved
@JosJuice
Copy link
Member

LGTM after squashing the changes in the last commit into the appropriate commits.

@JMC47 JMC47 merged commit 38cb76d into dolphin-emu:master Jul 10, 2022
@merryhime merryhime deleted the LogicalImm branch July 10, 2022 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
4 participants