Example of FP32 -> BF16 conversion in epilogue of GEMM #506
Counterpart of `examples/00_bmg_gemm/00_bmg_gemm.cpp` with BF16 output instead of FP32. `A` is BF16, `B` is BF16, `C` is FP32, and `D` is BF16.

Barring a few lines (some of which have been adapted or copy-pasted from https://github.com/intel/cutlass-sycl/blob/e83f147263dd8ca3589b34d76ce6fbec58bac048/test/unit/gemm/device/default_gemm_group_configuration.hpp), the code is almost the same as `examples/00_bmg_gemm/00_bmg_gemm.cpp`, so ideally the two files should be combined. Please use a diff tool such as BeyondCompare to see the differences between them.

This code doesn't add any new functionality; it is merely an example.
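For readers unfamiliar with the conversion itself, here is a minimal host-side sketch of what an FP32 -> BF16 epilogue does numerically. It is not the code in this PR and does not use any CUTLASS API: the helper names `fp32_to_bf16_rne`, `bf16_to_fp32`, and `epilogue_to_bf16` are hypothetical, BF16 is represented as a raw `uint16_t`, and round-to-nearest-even is an assumption about the rounding mode.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a CUTLASS API): convert one FP32 value to the
// 16-bit BF16 bit pattern, assuming round-to-nearest-even.
inline std::uint16_t fp32_to_bf16_rne(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));  // well-defined type punning

    // NaN: keep the sign and top payload bits, and force the quiet bit so
    // that truncation cannot turn the NaN into an infinity.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }

    // BF16 keeps the upper 16 bits of FP32. Adding 0x7FFF plus the lowest
    // kept bit rounds to nearest, with ties going to the even mantissa.
    const std::uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a BF16 bit pattern back to FP32 (exact).
inline float bf16_to_fp32(std::uint16_t h) {
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}

// Hypothetical scalar model of a linear-combination epilogue:
// D = alpha * accumulator + beta * C, computed in FP32 and stored as BF16.
inline void epilogue_to_bf16(const float* acc, const float* c,
                             std::uint16_t* d, std::size_t n,
                             float alpha, float beta) {
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = fp32_to_bf16_rne(alpha * acc[i] + beta * c[i]);
    }
}
```

The point of the sketch is that the accumulation and the `alpha`/`beta` scaling stay in FP32, and precision is dropped only once, when `D` is written.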
Thanks!