improved shifting by 48-63 in llshr #784

Merged
mateoconlechuga merged 1 commit into master from opt_llshr
Apr 18, 2026
@ZERICO2005 ZERICO2005 commented Apr 15, 2026

The compiler often emits __llshru(63) to get the sign bit of a (u)int64_t, which takes about 1550F clock cycles instead of the 2F of a single bit 7, b instruction. This contributes to the very slow performance of long double operations, which often need to shift by 52 or 63 bits. Similarly, __llshrs(63) appears when doing a "branchless" long long llabs(long long).
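For illustration, here is a minimal C sketch of the two source-level patterns that tend to produce these shifts. The helper names `sign_bit_u64` and `branchless_llabs` are hypothetical and not part of the toolchain, and the signed version assumes that `>>` on a negative `int64_t` is an arithmetic shift (implementation-defined in C, but the usual behavior).

```c
#include <stdint.h>

/* Sign bit of a 64-bit value: a logical shift right by 63 leaves 0 or 1.
 * This is the kind of expression that can lower to __llshru(63). */
static inline unsigned sign_bit_u64(uint64_t x) {
    return (unsigned)(x >> 63);
}

/* "Branchless" absolute value: an arithmetic shift right by 63 yields a
 * mask of 0 (non-negative) or -1 (negative), the kind of expression that
 * can lower to __llshrs(63). The subtraction is done in unsigned
 * arithmetic to avoid signed overflow. */
static inline int64_t branchless_llabs(int64_t x) {
    int64_t mask = x >> 63;                      /* assumed arithmetic shift */
    uint64_t ux = (uint64_t)x ^ (uint64_t)mask;  /* flip all bits if negative */
    return (int64_t)(ux - (uint64_t)mask);       /* add 1 if negative */
}
```

In both helpers the only 64-bit shift is by 63, so its cost dominates the whole function when the generic shift routine is slow.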

I have added an optimized path for __llshr(u/s) that handles shift amounts of 48-63 and takes no more than 100F to complete. The new worst case is a shift by 47 (1165F + 110R + 108W + 141).
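The reason a dedicated path for 48-63 can be cheap is that only the top 16 bits of the operand can survive such a shift. The C sketch below models that idea only; it is not the routine added in this PR, which is written in assembly, and the function names are hypothetical. It assumes `48 <= n <= 63`, an arithmetic `>>` on negative values, and the usual wrapping conversion to `int16_t`.

```c
#include <stdint.h>

/* Unsigned shift right by n, valid only for 48 <= n <= 63.
 * Bits 0..47 are shifted out entirely, so work on the top 16 bits. */
static inline uint64_t shr_u64_48_to_63(uint64_t x, unsigned n) {
    uint16_t top = (uint16_t)(x >> 48);   /* in assembly: just the top two bytes */
    return (uint64_t)(top >> (n - 48));   /* remaining shift is 0..15 bits */
}

/* Signed shift right by n, valid only for 48 <= n <= 63.
 * Same idea, but the 16-bit value is sign-extended back to 64 bits. */
static inline int64_t shr_s64_48_to_63(int64_t x, unsigned n) {
    int16_t top = (int16_t)((uint64_t)x >> 48);  /* keeps the sign bit in bit 15 */
    return (int64_t)(top >> (n - 48));           /* assumed arithmetic shift */
}
```

For example, `shr_u64_48_to_63(x, 63)` reduces to testing bit 15 of the top half, which matches the observation that the sign bit should be nearly free to extract.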

@ZERICO2005 ZERICO2005 added the crt label Apr 15, 2026
@mateoconlechuga mateoconlechuga merged commit 27b4ca6 into master Apr 18, 2026
9 checks passed
@mateoconlechuga mateoconlechuga deleted the opt_llshr branch April 18, 2026 02:24