Support for Mixed precision f32-f16 #6

Open
mert-kurttutan opened this issue May 15, 2023 · 1 comment

Comments

@mert-kurttutan

mert-kurttutan commented May 15, 2023

Hi @sarah-ek,

I have an operation where I need to multiply an f16 matrix by an f32 matrix to obtain an f32 matrix. The additions and multiplications should be done in f32.

So: Matmul(f16,f32) -> f32

To do this operation with your package, at the moment I am converting the f16 matrix into an f32 buffer matrix and then using gemm for the f32xf32 matrix multiplication.

More specifically, I took your f16 matmul code and modified part of it so that it is mixed precision.
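
For illustration, a minimal sketch of what such a mixed-precision kernel can look like: the f16 operand is widened to f32 element by element and the accumulation is done in f32. This is only an assumption about the approach (using the `half` crate and a naive triple loop), not the crate's actual blocked, vectorized kernels:

```rust
use half::f16;

/// Naive mixed-precision matmul sketch: dst (m x n) = lhs (m x k, f16) * rhs (k x n, f32),
/// all matrices row-major. The f16 elements are widened to f32 on the fly,
/// and products are accumulated in f32.
fn matmul_f16xf32_naive(
    m: usize,
    k: usize,
    n: usize,
    lhs: &[f16],
    rhs: &[f32],
    dst: &mut [f32],
) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                // widen the f16 operand to f32 before the multiply-add
                acc += lhs[i * k + p].to_f32() * rhs[p * n + j];
            }
            dst[i * n + j] = acc;
        }
    }
}
```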

I am not sure how well optimized my code is. If possible, I would like to know whether you plan to add support for this.

This type of operation is being adopted more and more in the context of large ML models.

@sarah-ek
Owner

is it an option to convert the f16 matrix to f32 outside the matrix multiplication? then multiply the two f32 matrices together? if you can spare the memory for the conversion then this shouldn't add much overhead since the conversion is O(n^2) while the multiplication is O(n^3)
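
For reference, a rough sketch of that convert-then-multiply approach (assuming the `half` crate for the f16 side; the naive loop below merely stands in for the crate's optimized f32 gemm path):

```rust
use half::f16;

/// O(n^2) step: widen the f16 matrix into an f32 buffer once, up front.
fn widen_to_f32(src: &[f16]) -> Vec<f32> {
    src.iter().map(|x| x.to_f32()).collect()
}

/// O(n^3) step: multiply two f32 matrices (row-major). In practice this call
/// would go through the crate's optimized f32 gemm rather than this naive loop.
fn matmul_f32(m: usize, k: usize, n: usize, lhs: &[f32], rhs: &[f32], dst: &mut [f32]) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += lhs[i * k + p] * rhs[p * n + j];
            }
            dst[i * n + j] = acc;
        }
    }
}

fn main() {
    // 2x2 f16 matrix [[1, 2], [3, 4]] times the 2x2 f32 identity
    let lhs_f16 = vec![
        f16::from_f32(1.0), f16::from_f32(2.0),
        f16::from_f32(3.0), f16::from_f32(4.0),
    ];
    let rhs_f32 = vec![1.0f32, 0.0, 0.0, 1.0];
    let lhs_f32 = widen_to_f32(&lhs_f16);
    let mut dst = vec![0.0f32; 4];
    matmul_f32(2, 2, 2, &lhs_f32, &rhs_f32, &mut dst);
    assert_eq!(dst, vec![1.0, 2.0, 3.0, 4.0]);
}
```

The extra f32 buffer costs memory proportional to the f16 matrix, but since the conversion is O(n^2) against an O(n^3) multiplication, its runtime overhead is small for large matrices.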

Narsil added a commit to Narsil/gemm that referenced this issue Aug 9, 2023