
[Question] packed GEMM #537

Closed
ykim362 opened this issue Aug 17, 2019 · 10 comments

ykim362 commented Aug 17, 2019

Hi. I noticed that mkl-dnn added packed GEMM implementation.

Is it available only via RNN cells?
Or, can I use packed GEMM outside of RNN cells?

Also, when I pack only the B matrix, does the packing operation depend on both A and B, or just on B? In the case of MKL's API, it depended on both.

Thank you!

@vpirogov vpirogov self-assigned this Aug 19, 2019
vpirogov (Member) commented:

Hi @ykim362, long time no see.

At this point packed GEMM in Intel MKL-DNN is used in RNN cells and not exposed directly. We are considering ways to expose packed GEMM and batched GEMM as part of the public API.

The packed GEMM implementation in Intel MKL-DNN is very similar to the one in Intel MKL. It allows packing either A, B, or both matrices. The matrices are packed independently, i.e. packing A depends only on A and packing B depends only on B.
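The pack-once, compute-many pattern being discussed can be sketched as follows. This is a NumPy illustration with hypothetical helper names, not the actual MKL-DNN API (the corresponding Intel MKL entry points are `cblas_sgemm_pack` and `cblas_sgemm_compute`):

```python
import numpy as np

def pack_matrix(mat):
    """Hypothetical stand-in for a pack call: copy the matrix into a
    contiguous, cache-friendly buffer. Depends only on this matrix."""
    return np.ascontiguousarray(mat)

def gemm_compute(a_packed, b_packed):
    """Hypothetical stand-in for a compute call using pre-packed operands."""
    return a_packed @ b_packed

rng = np.random.default_rng(0)
b = rng.standard_normal((4, 3))
b_packed = pack_matrix(b)          # pack the weight matrix B once

for _ in range(3):                 # reuse it across many GEMM calls,
    a = rng.standard_normal((2, 4))  # amortizing the packing cost
    c = gemm_compute(pack_matrix(a), b_packed)
    assert np.allclose(c, a @ b)
```

The payoff comes when B (e.g. a weight matrix in inference) is multiplied many times: the data reorganization is paid once at pack time instead of on every GEMM call.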


ykim362 commented Aug 19, 2019

Thank you, @vpirogov!
Looking forward to seeing packed GEMM and batched GEMM in mkl-dnn.

And, just to make my question clear: Intel MKL's current pack operation requires m, n, and k regardless of which matrix is packed. It would be good if packing B required only n and k.

@ykim362 ykim362 closed this as completed Aug 19, 2019

vpirogov commented Aug 19, 2019

@rsdubtso pointed out that the pack functions take all three dimensions as input, so in theory packing of one matrix may depend on the shape of the other. It's not clear from the documentation whether that is actually the case, though.

@aaraujom, could you please comment?

aaraujom (Contributor) commented:

Hi @ykim362,

We need the information of both A and B at packing time to decide the threading algorithm to be used.

We save the chosen threading algorithm as metadata together with the packed data for the A or B matrix. If both matrices are packed, we check that the threading choices match; if not, we return an error.
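A minimal sketch of the consistency check described here. Every name and the selection heuristic are illustrative toys, not the actual Intel MKL-DNN internals; the point is only that each packed buffer carries the threading decision as metadata and the compute call rejects mismatched pairs:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PackedMatrix:
    data: np.ndarray
    threading: str  # threading algorithm chosen at pack time (illustrative)

def choose_threading(m, n, k):
    # Toy heuristic standing in for the real selection logic.
    return "parallel_n" if n >= m else "parallel_m"

def pack(mat, m, n, k):
    # Packing records the threading decision as metadata.
    return PackedMatrix(np.ascontiguousarray(mat), choose_threading(m, n, k))

def compute(a_packed, b_packed):
    # If both operands were packed, their threading metadata must agree.
    if a_packed.threading != b_packed.threading:
        raise ValueError("mismatched threading metadata")
    return a_packed.data @ b_packed.data

a = np.ones((2, 8))
b = np.ones((8, 4))
c = compute(pack(a, 2, 4, 8), pack(b, 2, 4, 8))  # same (m, n, k): accepted
```

Packing the two operands with different (m, n, k) triples can yield different threading metadata, which is the error case described above.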

vpirogov (Member) commented:

@aaraujom,

What happens if only A is packed and N changes between pack and compute calls?

@ykim362 ykim362 reopened this Aug 19, 2019

ykim362 commented Aug 19, 2019

Thanks for the quick reply.
I reopened this so we can continue the discussion.

@aaraujom Is the threading for multi-core parallelism? If I use only a single core, could it be different?

In most application scenarios I am dealing with, the size of one matrix is fixed, but the other can vary. Also, in the inference environment a single-core instance is used in most cases: there are multiple instances, each running with a single thread.

aaraujom (Contributor) commented:

@vpirogov If only A is packed and n changes between the pack and compute calls, it should still work, though the threading algorithm selection may not be optimal.

@ykim362 If A is packed and B is not, it should be fine to use different values of n in the compute calls. Since you are interested in single-threaded execution, there is no threading algorithm selection to worry about. Threading selection happens at packing time, based on the m, n, and k passed to the pack function.
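Assuming the behavior described above, here is a sketch of the usage pattern in question: pack B once, then vary m freely at compute time. NumPy stand-ins again, not the MKL-DNN API:

```python
import numpy as np

n, k = 4, 8
rng = np.random.default_rng(42)
b = rng.standard_normal((k, n))
b_packed = np.ascontiguousarray(b)  # packed once; depends only on B

# With a single thread there is no threading selection to invalidate,
# so m may differ between the pack call and each compute call.
for m in (1, 5, 16):
    a = rng.standard_normal((m, k))
    c = a @ b_packed
    assert c.shape == (m, n)
    assert np.allclose(c, a @ b)
```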


ykim362 commented Aug 20, 2019

Thanks for the clarification, @aaraujom !

Is it the same for packing only the B matrix? I got an error (some randomly appearing garbage values) when I packed only B and used a different m on a single thread. At that time, I linked against the single-threaded version of the MKL library. I assumed that was by design, but if it isn't, I'll have to debug my code.
Thank you.

aaraujom (Contributor) commented:

Yes, it should work when you pack only B as well. If not, it could be a bug in the implementation. If you have a reproducer, we can take a look.


ykim362 commented Aug 20, 2019

Thanks, @aaraujom !
Will try it again.
