sdpa fma f16 #3422
Hi @syurkevi, what does this mean from the user's perspective? Is it f16 acc automatically, or will users need to turn on a flag to enable it? With this change, do we expect the same numerical behavior between the micro-kernel implementation and the matmul-primitive-based implementation? Thanks!
F16 support is now automatically available on MTL, where it was previously unsupported. The accumulation mode matches that of the systolic SDPA kernel (f32 acc), so results should match our current implementation.
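To illustrate why the accumulation mode matters for numerical agreement, here is a minimal sketch in plain Python, not the oneDNN kernel. It emulates a dot product over f16 inputs (as in one element of the Q·K^T step of SDPA) with the accumulator rounded to either f32 or f16 after every multiply-add. The helper names `to_f16`, `to_f32`, and `dot_f16_inputs` are hypothetical, and the rounding is emulated with the standard `struct` module rather than real hardware FMA.

```python
import struct


def to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_f16_inputs(a, b, round_acc):
    """Dot product of f16 inputs.

    round_acc is applied to the accumulator after every multiply-add,
    which emulates the precision of the accumulator register.
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc = round_acc(acc + to_f16(x) * to_f16(y))
    return acc


ones = [1.0] * 4096

# f32 accumulator: every partial sum up to 4096 is exactly representable.
print(dot_f16_inputs(ones, ones, to_f32))  # 4096.0

# f16 accumulator: above 2048 the spacing between f16 values is 2,
# so 2048 + 1 rounds back to 2048 and the sum stops growing.
print(dot_f16_inputs(ones, ones, to_f16))  # 2048.0
```

With an f32 accumulator the long reduction stays exact in this example, while an f16 accumulator silently loses contributions once the running sum grows large. This is the kind of divergence that keeping f32 accumulation in both code paths avoids.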
make test
make test
This PR adds FMA support to SDPA for the fp16 data type. This allows SDPA to work on older platforms that don't have systolic support.
Do the tests (make test and make test_benchdnn_*) pass locally for each commit?
Performance improvements
Performance was measured and tuned relative to a primitives-only implementation:
