[Question] packed GEMM #537
Comments
Hi @ykim362, long time no see. At this point packed GEMM in Intel MKL-DNN is used in RNN cells and is not exposed directly. We are considering ways to expose packed GEMM and batched GEMM as part of the public API. The packed GEMM implementation in Intel MKL-DNN is very similar to the one in Intel MKL. It allows packing either A, or B, or both matrices. The matrices are packed independently, i.e. the packing of A depends on A only and the packing of B depends on B only.
Hi @ykim362, we need information about both A and B at packing time to decide which threading algorithm to use. We save the chosen threading algorithm as metadata together with the packed data for the A or B matrix. In the case of both matrices being packed, we check whether the threading is the same; if not, we return an error.
What happens if only A is packed and N changes between the pack and compute calls?
Thanks for the quick reply, @aaraujom! Is the threading for multi-core parallelism? If I only use a single core, could it be different? In most application scenarios I am dealing with, the size of one matrix is fixed but the other can vary. Also, in inference environments a single-core instance is used in most cases: there are multiple instances running, each with a single thread.
@vpirogov @ykim362 If A is packed and B is not packed, it should be okay to use different values of n on the compute calls.
Thanks for the clarification, @aaraujom! Is it the same when packing only the B matrix? I got an error (some randomly added garbage values) when I packed only B and used a different m on a single thread. At the time I linked against the single-threaded version of the MKL library, so I thought that was by design. But if this is not the case, I will have to debug my code.
Yes, it should work if you pack only B as well. If not, it could be a bug in the implementation. If you have a reproducer, we can take a look.
Thanks, @aaraujom!
Hi. I noticed that mkl-dnn added a packed GEMM implementation.
Is it available only via RNN cells?
Or, can I use packed GEMM outside of RNN cells?
And is the packing operation dependent on both the A and B matrices when I pack only the B matrix, or just on B? In the case of MKL's API, it was dependent on both.
Thank you!