Possible suboptimal code generation for SIMD `any` function
#72413
Labels
bug
A deviation from expected or documented behavior. Also: expected but undesirable behavior.
SILOptimizer
Area → compiler: SIL optimization passes
simd
Description
Test code:
Building this with `-O` produces:
Which is nice 👍
Unfortunately, when I widen the vector to 16+ elements, the `any` function becomes a massive, outlined glob of code:
The SIMD mask is 16 bytes, and the `any` function basically amounts to `mask != 0`, so... even though I'm not an expert at SIMD instruction sets, it feels like this is probably not optimal.
Even if I enable all the advanced modern instruction sets I can think of (`-O -Xcc -msse -Xcc -msse2 -Xcc -mavx -Xcc -mavx2`), the code generated for the `any` function still feels suboptimal:
Reproduction
See above.
Also Godbolt
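For anyone who wants to try this locally, here is a minimal sketch of the kind of function being discussed. It is not the original test code: the function names and the `.== 0` comparison are assumptions chosen only to produce an 8-byte and a 16-byte mask. The scalar version is included as a reference for what `any` should return.

```swift
// Hypothetical stand-ins, not the original test code from this report.

// 8 lanes of UInt8: the comparison yields an 8-byte mask.
func anyZero8(_ v: SIMD8<UInt8>) -> Bool {
    any(v .== 0)
}

// 16 lanes of UInt8: the comparison yields a 16-byte mask.
func anyZero16(_ v: SIMD16<UInt8>) -> Bool {
    any(v .== 0)
}

// Scalar reference: true if at least one lane is zero.
func anyZeroScalar(_ v: SIMD16<UInt8>) -> Bool {
    for i in 0..<v.scalarCount where v[i] == 0 {
        return true
    }
    return false
}
```

Compiling the first two functions with `-O` and comparing the output should show whether the difference between the narrow and wide cases still occurs on a given toolchain.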
Expected behavior
Intuitively, I would expect `any(SIMDMask<SIMD16<Int16>>)` to compile down to far fewer instructions than it does. At the very least, it seems it could be implemented using two 64-bit comparisons to zero, which I have to believe is more efficient than the code we're generating today.
Environment
Swift version 6.0-dev (LLVM d1625da873daa4c, Swift bae6450)
Target: x86_64-unknown-linux-gnu
Additional information
No response
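To make the "two 64-bit comparisons" idea from the expected behavior concrete, here is a rough sketch. It assumes a 16-byte mask (`SIMDMask<SIMD16<Int8>>`, matching the size mentioned in the description) and assumes that false lanes are stored as all-zero bits. `anyViaWords` is a hypothetical name, not a standard library API, and this is not a claim about how the compiler or standard library should implement `any`.

```swift
// Hypothetical helper illustrating the idea; not a standard library API.
// Assumes a 16-byte mask whose false lanes are all-zero bits.
func anyViaWords(_ mask: SIMDMask<SIMD16<Int8>>) -> Bool {
    // Reinterpret the 16 mask bytes as two 64-bit words.
    // unsafeBitCast requires both types to have the same size (16 bytes here).
    let words = unsafeBitCast(mask, to: SIMD2<UInt64>.self)
    // The mask has a true lane exactly when either word is non-zero.
    return words[0] != 0 || words[1] != 0
}

// Usage: compare against the standard `any` on the same masks.
let allOnes = SIMD16<UInt8>(repeating: 1)
var oneZero = allOnes
oneZero[5] = 0
print(anyViaWords(allOnes .== 0) == any(allOnes .== 0))
print(anyViaWords(oneZero .== 0) == any(oneZero .== 0))
```

Whether this actually produces better code than the current `any` would need to be checked in the generated assembly.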