[Question] elementwise-multiplication, dot product functionality #165
Hmm, I'm not sure if I understand your question. Are you trying to multiply MxM matrices with Mx1 vectors? If so, you don't need to pad the Mx1 vector with zeros. You can just move an Mx1 vector into the scratchpad, and then do a matmul with that Mx1 vector. Gemmini will pad it with zeros before feeding it into the systolic array. There are examples here of matmuls with matrices smaller than the systolic array.
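The padding behavior described above can be sketched as a plain-C software model (this is not Gemmini code, and `DIM` here is just a hypothetical stand-in for the systolic array dimension): the Mx1 operand is logically widened with zero columns to MxDIM, but only the M real elements ever need to be moved in.

```c
#include <assert.h>

#define DIM 4  /* systolic array dimension; value chosen for illustration */

/* Software model of multiplying an MxM matrix by an Mx1 vector:
 * the vector is logically zero-padded to MxDIM before entering the
 * array, so the caller never stores or moves the padding zeros. */
void matvec_via_padded_matmul(const int A[DIM][DIM], const int x[DIM],
                              int out[DIM][DIM]) {
    int B[DIM][DIM] = {0};      /* implicit zero padding */
    for (int i = 0; i < DIM; i++)
        B[i][0] = x[i];         /* column 0 carries the real data */

    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++) {
            out[i][j] = 0;
            for (int k = 0; k < DIM; k++)
                out[i][j] += A[i][k] * B[k][j];
        }
}
```

Column 0 of the result is the matrix-vector product; every other column is zero, which is exactly what the hardware padding produces.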
For example, say I have two 256-dimensional vectors, A and B. I'd like to first take their element-wise product to get an output C, which should also be a 256-dimensional vector, and then element-wise add this output C to another 256-dimensional vector D to get the final answer E.
Hi @hngenc , I tried using matmul to implement the dot product, but it's not efficient, which is understandable. As I explained above, I want to perform the dot product of two 256-dimensional vectors, so theoretically only 256 multiply operations are needed for this computation. I tried passing them both in as matrices, so one vector A has shape (256x1) and the other vector B has shape (1x256), and the output has shape (256x256), of which I only need the elements on the diagonal. This is understandably inefficient, since a lot of redundant operations are performed. So I'm wondering if you have any suggestions for deploying the dot product of two vectors on Gemmini. Thanks!
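The inefficiency described above can be made concrete with a short C sketch (N is a small stand-in for 256): the (Nx1)x(1xN) matmul computes the full outer product, and only its diagonal holds the element-wise products, so N*N MACs are spent for N useful results.

```c
#include <assert.h>

#define N 8  /* small stand-in for the 256-element vectors */

/* Outer product of a (Nx1) column and a (1xN) row: out[i][j] = a[i]*b[j].
 * The diagonal out[i][i] is exactly the element-wise product a[i]*b[i];
 * all off-diagonal work is redundant for this use case. */
void outer_product(const int a[N], const int b[N], int out[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            out[i][j] = a[i] * b[j];
}
```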
Well, the spatial array wasn't really built for element-wise operations. It was rather designed for matmuls. I'm not sure if Gemmini is the right accelerator for your use-case. Another option could be to generate a Hwacha accelerator on the same SoC. You can have both Gemmini and Hwacha on the same SoC. Hwacha is a vector processor, and will probably do a lot better on element-wise vector multiplications. If you really want to use Gemmini for this, then I think your diagonal solution might be the best way. Another option would be to create your own Gemmini datatype and define its arithmetic operations yourself. We describe how to create your own datatypes in our recent tutorial: https://sites.google.com/berkeley.edu/gemminitutorialiiswc2021/
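One clarifying note, sketched below in plain C: a true dot product (scalar output) does map efficiently onto a matmul if the shapes are flipped to (1xN)x(Nx1), costing exactly N multiply-accumulates. It is only the element-wise (Hadamard) product, which must keep all N outputs, that the spatial array has no efficient mapping for.

```c
#include <assert.h>

#define N 8  /* small stand-in for the 256-element vectors */

/* A (1xN)x(Nx1) matmul reduces to a single accumulator: N MACs,
 * one scalar out, with no redundant work. */
int dot(const int a[N], const int b[N]) {
    int acc = 0;
    for (int i = 0; i < N; i++)
        acc += a[i] * b[i];
    return acc;
}
```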
Got it. Thanks for the detailed explanation! |
Since the dot product is a very common operation in the ML community, I'm wondering what an efficient way is of deploying vector dot-product/element-wise multiplication (followed by element-wise addition) onto Gemmini. I tried zero-padding Mx1 vectors to MxM matrices, but padding all the zeros is time-consuming and makes the required memory unnecessarily large to accommodate the zeros. (Or is there a zero-insertion mechanism in Gemmini such that I only need to move in the Mx1 elements for the input and the Mx1 elements for the weights from DRAM to perform dot products on Gemmini?) Thanks!
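For reference, the full computation being asked about (element-wise multiply followed by element-wise add, E = A*B + D) can be stated as a short plain-C baseline; any accelerator mapping would need to match this result:

```c
#include <assert.h>

#define N 256

/* Reference for the question: C = A (element-wise *) B, then E = C + D,
 * all vectors 256-dimensional. Exactly N multiplies and N adds. */
void elemwise_mul_add(const int a[N], const int b[N], const int d[N],
                      int e[N]) {
    for (int i = 0; i < N; i++)
        e[i] = a[i] * b[i] + d[i];
}
```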