cmd/compile: regalloc fails to reserve key register (CX) for future code #44671

Open
josharian opened this issue Feb 27, 2021 · 0 comments

josharian commented Feb 27, 2021

After CLs 297049 and 297050, generate code for shrVU_g in math/big.

The tight inner loop compiles to:

v94     00020 (+182) MOVQ -8(DX)(CX*8), R8
v106    00021 (182) MOVQ (DX)(CX*8), R9
v133    00022 (181) MOVQ CX, R10
v117    00023 (182) MOVQ SI, CX
v109    00024 (182) SHRQ CX, R9, R8
v119    00025 (182) MOVQ R8, -8(DI)(R10*8)
v121    00026 (+181) LEAQ 1(R10), R8
v107    00027 (181) MOVQ R8, CX

This contains unnecessary copies. One is #44670.

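The others are due to CX contention. Regalloc decided to put i in CX. Unfortunately, CX is the required register for the shift amount input of the SHRDQ instruction (printed as the three-operand SHRQ above). So we shuffle i in and out of CX on every iteration: i moves out to R10, the shift amount moves in from SI, and the incremented i moves back into CX.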
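For reference, here is a minimal sketch of the kind of loop involved. It is not the math/big source: the name shrVU, the use of uint64 instead of Word, and the main function are assumptions made for illustration. The point is that i is live across the whole loop while s is a variable shift amount, so both compete for CX on amd64.

```go
package main

import "fmt"

// shrVU is a hypothetical stand-in for math/big's shrVU_g, written with
// uint64 words. It shifts the multi-word value x right by s bits into z
// and returns the bits shifted out of the low word.
func shrVU(z, x []uint64, s uint) (c uint64) {
	const w = 64
	if s == 0 {
		copy(z, x)
		return 0
	}
	if len(z) == 0 {
		return 0
	}
	if len(x) != len(z) {
		panic("len(x) != len(z)")
	}
	s &= w - 1 // lets the compiler drop shift-amount guard code
	sHat := (w - s) & (w - 1)
	c = x[0] << sHat
	// Inner loop: each output word combines two adjacent input words.
	// On amd64 this shift pair is a candidate for a double-shift
	// instruction, which takes a variable shift amount in CX.
	for i := 1; i < len(z); i++ {
		z[i-1] = x[i-1]>>s | x[i]<<sHat
	}
	z[len(z)-1] = x[len(z)-1] >> s
	return c
}

func main() {
	x := []uint64{0x8000000000000001, 0x3, 0x4}
	z := make([]uint64, len(x))
	c := shrVU(z, x, 1)
	fmt.Printf("z=%#x c=%#x\n", z, c)
}
```

Since s never changes inside the loop, the ideal allocation would keep it in CX for the whole loop and place i in any other register.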

I thought regalloc had some amount of lookahead to avoid using highly restricted registers in upcoming code, to avoid this situation. Either I was wrong, or it isn't working as desired here.

I believe that fixing this issue would make the generated code as fast (or nearly so) as the hand-written assembly, allowing us to delete the assembly.

cc @randall77 @cherrymui

@dmitshur dmitshur added this to the Backlog milestone Mar 5, 2021