Add fmaddsub / fmsubadd / faddsub / fsubadd #4

heltonmc · 2023-04-03T17:44:25Z

This adds some fancier LLVM instructions that can help with complex arithmetic (when using an interleaved format), but in general they are instruction-set dependent.
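As a rough sketch of why an add/sub lane pattern helps with interleaved complex numbers, here is a scalar reference written on plain tuples. The names `ref_faddsub` and `ref_cmul` are hypothetical and exist only for illustration; this is not the SIMDMath implementation, and it assumes the x86 `vaddsubpd` convention where even lanes (counting from 0) subtract and odd lanes add.

```julia
# Hypothetical scalar reference, not the SIMDMath implementation.
# Assumption: even lanes (0-based) subtract and odd lanes add, as in x86 `vaddsubpd`.
# Julia tuples are 1-based, so the subtracting lanes are the odd indices here.
ref_faddsub(a::NTuple{N,T}, b::NTuple{N,T}) where {N,T} =
    ntuple(i -> isodd(i) ? a[i] - b[i] : a[i] + b[i], Val(N))

# Multiply two complex numbers stored interleaved as (re, im):
# (a + b*im) * (c + d*im) = (a*c - b*d) + (b*c + a*d)*im
function ref_cmul(x::NTuple{2,T}, y::NTuple{2,T}) where {T}
    t1 = (x[1] * y[1], x[2] * y[1])   # (a*c, b*c): x times broadcast real part of y
    t2 = (x[2] * y[2], x[1] * y[2])   # (b*d, a*d): swapped x times broadcast imag part of y
    return ref_faddsub(t1, t2)        # (a*c - b*d, b*c + a*d)
end

z = ref_cmul((1.0, 2.0), (3.0, 4.0))  # (-5.0, 10.0)
```

The final add/sub step is the part that a single `vaddsubpd` can cover on x86, which is why the generated code depends so much on the target.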

For example, on my x86_64-linux-gnu machine, which has the instruction available:

julia> @code_native SIMDMath.faddsub(a, b)

; │┌ @ none within `macro expansion`
	vaddsubpd	%ymm1, %ymm0, %ymm0
	retq

Compare that with my ARM computer, which doesn't have this specific instruction:

julia> @code_native SIMDMath.faddsub(a, b)

; %bb.0:                                ; %top
; │┌ @ none within `macro expansion`
fsub v2.2d, v0.2d, v1.2d
fadd v0.2d, v0.2d, v1.2d
mov v2.d[1], v0.d[1]
mov v0.16b, v2.16b
ret

So LLVM does an OK job in this case. Unfortunately, the code generated for fmaddsub could be better:

julia> @code_native SIMDMath.fmaddsub(a, b, c)
julia_fmaddsub_1809:                    # @julia_fmaddsub_1809
; ┌ @ none within `fmaddsub`
	.cfi_startproc
# %bb.0:                                # %top
; │┌ @ none within `macro expansion`
	vmulpd	%ymm1, %ymm0, %ymm0
	vaddsubpd	%ymm2, %ymm0, %ymm0
	retq

LLVM should be able to generate a single fused instruction there (the one exposed in C as _mm_fmaddsub_pd, or its 256-bit counterpart for these ymm registers). This looks like either an LLVM issue or something specific to my computer; I don't have a different CPU to test it on.
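To make the difference between the two code shapes concrete, here is a minimal scalar sketch on tuples. Both helper names (`ref_fmaddsub`, `ref_mul_addsub`) are hypothetical and are not part of SIMDMath; the lane convention (even lanes, counting from 0, subtract `c`; odd lanes add `c`) is assumed from the x86 fmaddsub family. The fused form rounds once per lane, while the multiply-then-addsub form rounds the product first, so the two can differ in the low bits.

```julia
# Hypothetical scalar references, not the SIMDMath implementation.
# Assumption: even lanes (0-based) compute a*b - c, odd lanes compute a*b + c.

# Fused form: one rounding per lane via `fma`.
ref_fmaddsub(a::NTuple{N,T}, b::NTuple{N,T}, c::NTuple{N,T}) where {N,T<:AbstractFloat} =
    ntuple(i -> isodd(i) ? fma(a[i], b[i], -c[i]) : fma(a[i], b[i], c[i]), Val(N))

# Unfused form, mirroring a multiply followed by an add/sub: the product is
# rounded before c is subtracted or added.
ref_mul_addsub(a::NTuple{N,T}, b::NTuple{N,T}, c::NTuple{N,T}) where {N,T<:AbstractFloat} =
    ntuple(i -> isodd(i) ? a[i] * b[i] - c[i] : a[i] * b[i] + c[i], Val(N))

# Inputs chosen so that the exact product 1 - 2^-60 rounds to 1.0 in Float64.
e = 2.0^(-30)
a = (1.0 + e, 1.0 + e)
b = (1.0 - e, 1.0 - e)
c = (1.0, -1.0)

fused   = ref_fmaddsub(a, b, c)    # keeps the -2^-60 term in both lanes
unfused = ref_mul_addsub(a, b, c)  # loses it: (0.0, 0.0)
```

For most inputs the two agree, so a comparison against the unfused form in tests would likely need a tolerance rather than exact equality.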

There is an open issue in SIMD.jl for this that I'll comment on.

codecov-commenter · 2023-04-03T17:50:02Z

Codecov Report

Merging #4 (bbc007b) into main (04ea4a7) will increase coverage by 4.27%.
The diff coverage is 83.33%.


@@            Coverage Diff             @@
##             main       #4      +/-   ##
==========================================
+ Coverage   85.22%   89.50%   +4.27%     
==========================================
  Files           6        6              
  Lines         176      200      +24     
==========================================
+ Hits          150      179      +29     
+ Misses         26       21       -5

Impacted Files	Coverage Δ
src/SIMDMath.jl	`100.00% <ø> (ø)`
src/intrinsics.jl	`75.47% <80.00%> (+26.98%)`	⬆️
src/complex.jl	`100.00% <100.00%> (ø)`

... and 1 file with indirect coverage changes


heltonmc added 3 commits April 3, 2023 12:54

add fmaddsub llvm

3ffcaea

add error message

d7d6ae8

add addsub / subadd

63ed560

heltonmc mentioned this pull request Apr 3, 2023

Requesting Support for fmaddsub eschnett/SIMD.jl#88

Open

heltonmc added 3 commits April 3, 2023 14:00

del

3a83825

add complex fnmadd

deabb2f

improve test coverage

bbc007b

heltonmc merged commit 8e0236a into main Apr 5, 2023

heltonmc deleted the fmaddsub branch April 5, 2023 14:09
