this optimization doesn't look good on SIMD #2812

Open
MkazemAkhgary opened this issue Mar 26, 2024 · 1 comment
Labels: Bugs, Performance (all issues related to performance/code generation)

MkazemAkhgary commented Mar 26, 2024

Compiling this function for AVX2:

unmasked varying int32 RoundIndex(varying int32 Index)
{
    return (Index >> 2) << 2;
}

compiles to

.LCPI1_0:
        .long   4294967292                      # 0xfffffffc
vbroadcastss    ymm1, dword ptr [rip + .LCPI1_0] # ymm1 = [4294967292,4294967292,4294967292,4294967292,4294967292,4294967292,4294967292,4294967292]
vandps  ymm0, ymm0, ymm1
ret

Wouldn't it be better to just do this?

vpsrad  ymm0, ymm0, 2
vpslld  ymm0, ymm0, 2
ret

Using shifts avoids the load from memory and reduces register pressure, since no register is needed to hold the constant. The two shifts should complete in about 2 cycles, whereas the broadcast load alone takes about 5 to 8 cycles (if I'm not mistaken). Am I missing something?


P.S. There is also a form of vpand that takes a memory location as its third operand. I'm not sure why ispc doesn't use it here (it does use vandps zmm, zmm, m512 for AVX-512).

If there are other shift operations like this one, or if the value 0xfffffffc is already in a register, then AND would be the more efficient choice.
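For reference, here is a minimal sketch in plain C (not ispc, and not compiler output) of why the two forms can be swapped for each other: shifting right by 2 and then left by 2 clears the two low bits, which is the same as ANDing with 0xfffffffc. The function names are hypothetical. The sketch works on unsigned bit patterns because C does not fully define shifts of negative signed values. With an arithmetic right shift, as vpsrad performs, the result is the same, because the sign bits shifted in at the top are discarded again by the left shift.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift form: drop the two low bits, then shift back into place. */
uint32_t round_with_shifts(uint32_t index) {
    return (index >> 2) << 2;
}

/* Mask form: AND with 0xfffffffc, as in the generated AVX2 code. */
uint32_t round_with_mask(uint32_t index) {
    return index & UINT32_C(0xfffffffc);
}

/* Counts how many sample bit patterns give different results
   for the two forms (hypothetical helper for illustration). */
int count_mismatches(void) {
    static const uint32_t samples[] = {
        0u, 1u, 3u, 4u, 7u,
        0x7fffffffu, 0x80000000u, 0xfffffffdu, 0xffffffffu
    };
    int mismatches = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (round_with_shifts(samples[i]) != round_with_mask(samples[i])) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

This only shows that the results match. Which instruction sequence is faster depends on the target and the surrounding code, which is the actual question here.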

MkazemAkhgary (Author) commented

This issue could be generalized to the use of broadcast constants. I think it is sometimes better to do a few simple computations than to broadcast a constant from memory.

pbrubaker added the Bugs and Performance labels on Mar 27, 2024