-
Notifications
You must be signed in to change notification settings - Fork 56
Support fp32 accumulation for bf16 gemm and grouped gemm #482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
4736337
a8aefe0
8b36b71
0d733e8
04e3da0
a1a6121
3998cb9
28cb9fb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -87,7 +87,7 @@ struct DefaultGemmGroupConfiguration< | |
|
||
using TiledMma = typename CollectiveMainloop::TiledMma; | ||
|
||
using EpilogueOp = epilogue::fusion::LinearCombination<float, float>; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Currently, while the BF16 Currently, the unwritten/implicit contract for this code in the current main branch seems to be:
It seems that when this PR would be ready, it will rectify the API usage of Thanks! |
||
using EpilogueOp = epilogue::fusion::LinearCombination<ElementOutput, float>; | ||
|
||
using FusionCallBacks = epilogue::fusion::FusionCallbacks< | ||
epilogue::IntelXeXMX16Group, | ||
|
@@ -101,7 +101,7 @@ struct DefaultGemmGroupConfiguration< | |
TileShape, Shape<_1, _1, _1>, | ||
epilogue::collective::EpilogueTileAuto, | ||
float, float, | ||
float, LayoutC, 1, | ||
ElementOutput, LayoutC, 1, | ||
ElementOutput, LayoutC, 1, | ||
epilogue::IntelXeXMX16Group, | ||
EpilogueOp | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is wrong. Why change this? Did you check this is correct if dtype of C and output is different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this issue can be detected by pre-ci checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It assums C and D has same dtype here. I also think we need to support more dtype combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change it if you assume C and D is the same? We should add a static_assert if it only works under that assumption.