Bad speed for f32:s8:f32 matmul #1893
Hi @WilliamTambellini,
Thanks @igorsafo.
Does the fpmath mode really matter?
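For context, this is the attribute in question; a minimal sketch against the v3.x C++ API (the second argument of `set_fpmath_mode` is what opts integer weights into decompression):

```cpp
#include "dnnl.hpp"

// Build a primitive_attr requesting bf16 math for f32 computation; with
// apply_to_int set to true, the down-conversion also applies to integer
// weights (on-the-fly weight decompression). Per oneDNN v3.x dnnl.hpp.
dnnl::primitive_attr make_decompression_attr() {
    dnnl::primitive_attr attr;
    attr.set_fpmath_mode(dnnl::fpmath_mode::bf16, /*apply_to_int=*/true);
    return attr;
}
```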
Thanks for the verbose log.
Thanks. We have followed these examples, but at the moment the speed is still bad even with bf16 src/dst. Roughly the following setup (a minimal sketch with placeholder shapes, not our exact code):
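```cpp
#include "dnnl.hpp"
using namespace dnnl;

// bf16 x s8 -> bf16 matmul with weight decompression via the fpmath
// attribute. Shapes and engine index are placeholders.
int main() {
    engine eng(engine::kind::cpu, 0);

    const memory::dim M = 128, K = 512, N = 256;
    memory::desc src_md({M, K}, memory::data_type::bf16, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::s8, memory::format_tag::any);
    memory::desc dst_md({M, N}, memory::data_type::bf16, memory::format_tag::ab);

    primitive_attr attr;
    attr.set_fpmath_mode(fpmath_mode::bf16, /*apply_to_int=*/true);

    matmul::primitive_desc pd(eng, src_md, wei_md, dst_md, attr);
    matmul mm(pd);
    // Memory allocation, weight reorder to pd.weights_desc(), and
    // execution with a stream are omitted here.
    return 0;
}
```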
Seen in your source code:
If possible, I think it would be helpful for the example page (or the docs on the matmul primitive page) to state that bf16:s8 into bf16 or f32 is the only combination supported by a non-reference implementation. Otherwise it is hard to see what is going wrong when using these datatypes, given that brgemm_matmul.cpp appears to reject the datatype combination as invalid.
See
from
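One way to confirm which implementation was dispatched (a sketch; `pd` is a hypothetical `matmul::primitive_desc` built as above):

```cpp
#include <cstdio>
#include "dnnl.hpp"

// Print the name of the implementation oneDNN dispatched to. Reference
// implementations typically have "ref" in the name; the optimized
// brgemm-based x64 ones typically contain "brg".
void print_impl(const dnnl::matmul::primitive_desc &pd) {
    std::printf("impl: %s\n", pd.impl_info_str());
}
```

Running with the `ONEDNN_VERBOSE=1` environment variable prints the same implementation name at primitive creation and execution time.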
Hi! I saw you're using v3.4.0, while the optimized version is in v3.5. Could you please update the version and check again?
Aha, yes, that was certainly the main issue. Thanks @xuxinzen -- I now see the diff on brgemm_matmul.cpp not accepting these datatypes for weight decompression. We were copying some values for
Now that I see this working, I wanted to confirm the expected behavior for 3.4 and 3.5:
@WilliamTambellini, will do! The release is planned for May 30. You can find the oneDNN release schedule here.
Hello oneDNN team,
Just wondering if we are missing something:
Ran on a recent CPU. The setup in question, sketched with placeholder shapes (f32 source and destination, s8 weights, fpmath attribute as above; not the exact benchmark code):
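```cpp
#include "dnnl.hpp"

// f32 x s8 -> f32 matmul, the combination from the title. Weights are
// decompressed on the fly via the fpmath attribute. Shapes are placeholders.
dnnl::matmul::primitive_desc make_f32_s8_f32_matmul(const dnnl::engine &eng) {
    using namespace dnnl;
    const memory::dim M = 128, K = 512, N = 256;
    memory::desc src_md({M, K}, memory::data_type::f32, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::s8, memory::format_tag::any);
    memory::desc dst_md({M, N}, memory::data_type::f32, memory::format_tag::ab);
    primitive_attr attr;
    attr.set_fpmath_mode(fpmath_mode::bf16, /*apply_to_int=*/true);
    return matmul::primitive_desc(eng, src_md, wei_md, dst_md, attr);
}
```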