Implement Gemm<bf16> for CudaBlas. #167

LaurentMazare · 2023-07-07T06:57:34Z

(First, thanks for this neat crate; it's very helpful.)
This PR implements the Gemm trait for bf16 so that cuBLAS can easily be used with this type. The code is very similar to the f16 variant, with the CUDA types adapted to bf16.
The f16 gemm test has also been adapted to cover this variant. All of this is behind the f16 feature flag.
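
For readers who want to see what a single test covering several element types can look like, here is a minimal CPU-only sketch. It is not the code from this PR and does not call cuBLAS. `Bf16`, `Element`, and `reference_gemm` are hypothetical names made up for illustration; real code would presumably use the `bf16` type from the `half` crate rather than the hand-rolled conversion below. The sketch assumes row-major matrices and f32 accumulation.

```rust
/// Hypothetical stand-in for a bfloat16 value: the top 16 bits of an f32.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Bf16(pub u16);

impl Bf16 {
    /// Convert from f32 using round-to-nearest-even on the dropped 16 bits.
    pub fn from_f32(x: f32) -> Self {
        let bits = x.to_bits();
        if x.is_nan() {
            // Keep the value a NaN by forcing a mantissa bit on.
            return Bf16(((bits >> 16) as u16) | 0x0040);
        }
        let rounding = 0x7FFF + ((bits >> 16) & 1);
        Bf16((bits.wrapping_add(rounding) >> 16) as u16)
    }

    /// Widen back to f32 by placing the 16 bits in the high half.
    pub fn to_f32(self) -> f32 {
        f32::from_bits((self.0 as u32) << 16)
    }
}

/// Hypothetical trait so one reference routine can serve several element types.
pub trait Element: Copy {
    fn from_f32(x: f32) -> Self;
    fn to_f32(self) -> f32;
}

impl Element for f32 {
    fn from_f32(x: f32) -> Self {
        x
    }
    fn to_f32(self) -> f32 {
        self
    }
}

impl Element for Bf16 {
    fn from_f32(x: f32) -> Self {
        Bf16::from_f32(x)
    }
    fn to_f32(self) -> f32 {
        Bf16::to_f32(self)
    }
}

/// Naive row-major GEMM: C (m x n) = A (m x k) * B (k x n).
/// Products are accumulated in f32 and rounded to T once per output element.
pub fn reference_gemm<T: Element>(m: usize, n: usize, k: usize, a: &[T], b: &[T]) -> Vec<T> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = Vec::with_capacity(m * n);
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p].to_f32() * b[p * n + j].to_f32();
            }
            c.push(T::from_f32(acc));
        }
    }
    c
}
```

A test would typically compute the same product on the GPU and compare it against a reference like this one, with a tolerance suited to bf16's 8 bits of significand precision.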

coreylowman · 2023-07-07T15:24:40Z

Awesome, thanks!

Implement Gemm<bf16> for CudaBlas.

c8c6aca

coreylowman merged commit 5434e2b into coreylowman:main Jul 7, 2023
3 checks passed
