Standalone sparse-dense matrix multiplication benchmark #390

Closed
marsupialtail opened this issue Jul 9, 2020 · 7 comments

@marsupialtail

Hi, I am wondering whether FBGEMM supports standalone sparse-matrix dense-matrix multiplication (SpMM) using the unrolling approach to get register blocking, as mentioned in the new release notes. The test seems to fuse the operation with another matrix multiplication. Does FBGEMM have an API similar to, say, MKL's SpMM? Thank you!

@dskhudia
Contributor

dskhudia commented Jul 9, 2020

@marsupialtail Such an API doesn't exist in FBGEMM at the moment.

@marsupialtail
Author

So, at the moment, is the easiest way to test SpMM to fuse it with a quantized matmul?

@dskhudia
Contributor

@marsupialtail
Author

So FBGEMM currently supports only int8 SpMM? Does it support fp32?

@dskhudia
Contributor

Currently it's int8 only.

@CorbinFoucart

CorbinFoucart commented May 16, 2024

Regarding the SparseDenseMMInt8Benchmark.cc example, it seems that both the input matrix and the output matrix must be transposed in order to use the API. Namely, if I have an input matrix A and am interested in the output matrix C, I must first transpose A to use the API and must transpose the output C^T as well to get C.
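
To make the transpose relationship concrete, here is a minimal sketch of the underlying identity: if a kernel produces C^T = W · A^T, then recovering C = A · W^T costs one out-of-place transpose on the way in and another on the way out. The `Dense` struct and the `transpose` and `matmul` helpers are hypothetical names for illustration only; none of this is FBGEMM's API, and plain `int32_t` elements are used instead of quantized types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not an FBGEMM type): a dense row-major matrix.
struct Dense {
  std::size_t rows, cols;
  std::vector<std::int32_t> data;  // element (r, c) is data[r * cols + c]
};

// Out-of-place transpose. It allocates and fills a full copy of the matrix,
// which is the memory cost described in the comment above.
Dense transpose(const Dense& m) {
  Dense t{m.cols, m.rows, std::vector<std::int32_t>(m.data.size())};
  for (std::size_t r = 0; r < m.rows; ++r)
    for (std::size_t c = 0; c < m.cols; ++c)
      t.data[c * t.cols + r] = m.data[r * m.cols + c];
  return t;
}

// Naive reference product x * y, used only to check the identity.
Dense matmul(const Dense& x, const Dense& y) {
  assert(x.cols == y.rows);
  Dense out{x.rows, y.cols, std::vector<std::int32_t>(x.rows * y.cols, 0)};
  for (std::size_t i = 0; i < x.rows; ++i)
    for (std::size_t k = 0; k < x.cols; ++k)
      for (std::size_t j = 0; j < y.cols; ++j)
        out.data[i * y.cols + j] += x.data[i * x.cols + k] * y.data[k * y.cols + j];
  return out;
}
```

With these helpers, `transpose(matmul(W, transpose(A)))` holds the same values as `matmul(A, transpose(W))`; the two extra `transpose` calls are the copies in question.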

As these operations require memory copies which may be quite expensive, I've looked at using the CSC matrix format; for example the function doSpmdmOnInpBuffer. Is there a way to use this function, doSpmdmOnInpBuffer, standalone, followed by a requantization? Similar to the original poster, I've seen it only used in the context of an output pipeline.
