Closed
Description
Source:
#include <immintrin.h>

void zip_float(const double *src, double *dst) {
    __m256d s0 = _mm256_broadcast_pd((__m128d*)src);
    __m256d s1 = _mm256_broadcast_pd((__m128d*)src + 2);
    __m256d s = _mm256_shuffle_pd(s0, s1, 0xc);
    s = _mm256_mul_pd(s, s);
    _mm256_store_pd(dst, s);
}

LLVM:
zip_float:
vmovupd xmm0, xmmword ptr [rdi]
vmovupd xmm1, xmmword ptr [rdi + 32]
vunpcklpd xmm2, xmm0, xmm1
vunpckhpd xmm0, xmm0, xmm1
vinsertf128 ymm0, ymm2, xmm0, 1
vmulpd ymm0, ymm0, ymm0
vmovapd ymmword ptr [rsi], ymm0
vzeroupper
ret
GCC:
zip_float:
vbroadcastf128 ymm0, XMMWORD PTR [rdi]
vbroadcastf128 ymm1, XMMWORD PTR [rdi+32]
vshufpd ymm0, ymm0, ymm1, 12
vmulpd ymm0, ymm0, ymm0
vmovapd YMMWORD PTR [rsi], ymm0
vzeroupper
ret
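
For readers who do not have the `vshufpd` immediate encoding memorized, here is a rough scalar sketch of what both listings compute. It is only an illustration: `zip_ref` is a hypothetical name that does not appear in the report, and the lane selection is modeled on the documented behavior of `_mm256_shuffle_pd` (even result elements come from the first operand, odd ones from the second, and bit `i` of the immediate picks the low or high double within that element's 128-bit lane).

```c
/* Hypothetical scalar reference for zip_float; not part of the original report. */
void zip_ref(const double *src, double *dst) {
    /* _mm256_broadcast_pd repeats one 128-bit pair in both halves.
       (__m128d*)src + 2 points 32 bytes past src, i.e. at src[4]. */
    const double s0[4] = { src[0], src[1], src[0], src[1] };
    const double s1[4] = { src[4], src[5], src[4], src[5] };
    const int imm = 0xc; /* 0b1100 */
    double s[4];

    for (int i = 0; i < 4; i++) {
        const double *from = (i & 1) ? s1 : s0; /* even: first operand, odd: second */
        int lane_base = i & ~1;                 /* first element of the 128-bit lane */
        int high = (imm >> i) & 1;              /* immediate bit i: low or high double */
        s[i] = from[lane_base + high];
    }

    /* _mm256_mul_pd(s, s) squares every element. */
    for (int i = 0; i < 4; i++) {
        dst[i] = s[i] * s[i];
    }
}
```

Under that model the result is `{src[0]², src[4]², src[1]², src[5]²}`, which matches the unpack-low/unpack-high pair in the LLVM output. Unlike the intrinsic version, this sketch does not require `dst` to be 32-byte aligned.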
Godbolt: https://godbolt.org/z/ffz1YEhPE
Tweeted by FFmpeg: https://x.com/FFmpeg/status/1853326818008514900