
sdpa fma f16 #3422


Open · syurkevi wants to merge 4 commits into main from syurkevi/sdpa_fma_f16

Conversation

@syurkevi (Contributor) commented Jun 13, 2025

This PR adds FMA support to SDPA for the f16 data type, which allows SDPA to work on older platforms that don't have systolic hardware support.
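For context, here is a conceptual sketch (not code from this PR) of the technique: f16 inputs, products formed and summed with plain FMA instead of systolic matrix instructions, and an f32 accumulator. The function name is hypothetical, and it assumes a compiler with `_Float16` support (recent gcc/clang).

```cpp
// Conceptual sketch only -- not the kernel code from this PR.
// f16 q/k values are multiplied and accumulated with scalar FMA into an
// f32 accumulator, which is what an FMA-only (non-systolic) GPU falls back to.
#include <cmath>
#include <cstddef>

float dot_f16_fma_f32acc(const _Float16 *q, const _Float16 *k, std::size_t d) {
    float acc = 0.0f; // f32 accumulation, matching the systolic kernel's behavior
    for (std::size_t i = 0; i < d; ++i) {
        // one fused multiply-add per element; on GPU this maps to mad/fma
        // instructions rather than DPAS/systolic matrix instructions
        acc = std::fma(static_cast<float>(q[i]), static_cast<float>(k[i]), acc);
    }
    return acc;
}
```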

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Performance improvements

Performance was measured and tuned relative to a primitives-only (matmul-based) implementation:
[performance comparison chart]

syurkevi requested a review from a team as a code owner on June 13, 2025 16:39
github-actions bot added the platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel) label on Jun 13, 2025
syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch from 8574d21 to 66937bb on June 13, 2025 16:55
syurkevi requested a review from a team as a code owner on June 13, 2025 16:55
github-actions bot added the component:tests (Codeowner: @oneapi-src/onednn-arch) label on Jun 13, 2025
syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch 2 times, most recently from 5c9dc1f to 90c7bfb on June 13, 2025 17:13
syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch from 90c7bfb to 51d3588 on June 13, 2025 18:05
github-actions bot removed the component:tests (Codeowner: @oneapi-src/onednn-arch) label on Jun 13, 2025
syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch from 51d3588 to 4285c75 on June 13, 2025 18:13
@TaoLv (Contributor) commented Jun 14, 2025

Hi @syurkevi, what does this mean from a user perspective? Is it f16 acc automatically, or will users need to turn on a flag to enable it? With this, do we expect the same numerical behavior between the micro-kernel implementation and the matmul-primitive-based implementation? Thanks!

@syurkevi (Contributor, Author) replied:

> Hi @syurkevi, what does this mean from a user perspective? Is it f16 acc automatically, or will users need to turn on a flag to enable it? With this, do we expect the same numerical behavior between the micro-kernel implementation and the matmul-primitive-based implementation? Thanks!

F16 support is now automatically available on MTL, where it was previously unsupported. The accumulation mode matches the behavior of the systolic SDPA kernel (f32 acc), so results should match our current implementation.
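As a standalone illustration (not oneDNN code) of why keeping the f32 accumulator matters for matching results: summing many small f16 terms in an f16 accumulator stalls once the running sum grows large, while an f32 accumulator keeps the low-order bits. Both the systolic and the FMA SDPA paths accumulate in f32, so they are expected to round the same way. This toy program assumes a compiler with `_Float16` support.

```cpp
#include <cstdio>

int main() {
    _Float16 acc_f16 = (_Float16)0.0f;
    float acc_f32 = 0.0f;
    for (int i = 0; i < 4096; ++i) {
        _Float16 p = (_Float16)0.25f;      // a representative per-element product
        acc_f16 = (_Float16)(acc_f16 + p); // rounds to f16 after every add
        acc_f32 += 0.25f;                  // full f32 accumulation
    }
    // Typical output: the f16 accumulator stalls near 512, while f32 reaches 1024.
    std::printf("f16 acc = %f, f32 acc = %f\n",
            static_cast<double>(static_cast<float>(acc_f16)),
            static_cast<double>(acc_f32));
    return 0;
}
```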

@syurkevi (Contributor, Author) commented:

make test
disable benchdnn_all
enable benchdnn_graph
enable test_device_gpu
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg

syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch from 4285c75 to 6b68be7 on June 19, 2025 04:01
syurkevi force-pushed the syurkevi/sdpa_fma_f16 branch from 6b68be7 to de3cdf9 on June 19, 2025 04:04
@syurkevi (Contributor, Author) commented:

make test
disable benchdnn_all
enable benchdnn_graph
enable test_device_gpu
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
