3D and 4D support for Linear Algebra functions #483
Comments
These functions error out in 3.0. Pushing the feature request for a later release. |
Pavan, can you clarify explicitly whether this means that matrix multiply is not available within gfor? I am currently getting incorrect results when I use it inside gfor for batch sizes larger than ~5 (small matrices on the order of [40,20] x [20,40]). Interestingly, it seems to work correctly in small batches, and I'm nowhere near out of memory. As an aside, is there a registry of functions that are and are not supported in gfor available somewhere? Maybe we could put something together for easy reference. Thanks! |
It is most definitely not supported in the open source version. We were not throwing errors before; I changed this behavior a couple of days ago. We will add support for it before the next release. |
Thanks for the quick reply! Glad to know the issue 👍 |
Will release 3.2 allow for solve() inside a gfor() loop? Thanks, |
Maybe adding a batched version of |
What would it take to get this implemented? I could really use this feature in my code. Right now I'm using a workaround that involves tiling out matrices (and transposing one) and doing: af::sum(A * B) followed by a moddims to get the proper shape. It does work, although memory usage is a concern and I know this probably isn't the most efficient method. |
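For readers following the workaround above, here is a minimal sketch of the arithmetic it relies on, written in plain C++ with no ArrayFire calls. The helper name `matmul_by_tile_and_sum` and the column-major layout are assumptions made for illustration; the loops only mimic what tiling, an element-wise product, a sum along one dimension, and a final reshape would do. It also shows where the memory concern comes from: each tiled operand holds k * m * n elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ArrayFire API): computes C = A * B for
// column-major A (m x k) and B (k x n) by materialising tiled copies,
// multiplying element-wise, and summing over the shared dimension k.
std::vector<float> matmul_by_tile_and_sum(const std::vector<float>& A,
                                          const std::vector<float>& B,
                                          std::size_t m, std::size_t k,
                                          std::size_t n) {
    assert(A.size() == m * k);
    assert(B.size() == k * n);

    // Tiled operands of shape [k, m, n], with k varying fastest.
    // Each one needs k * m * n elements, which is the memory cost.
    std::vector<float> At(k * m * n), Bt(k * m * n);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t p = 0; p < k; ++p) {
                const std::size_t idx = p + k * (i + m * j);
                At[idx] = A[i + p * m];  // transposed A, repeated for every column j
                Bt[idx] = B[p + j * k];  // B, repeated for every row i
            }
        }
    }

    // Element-wise product followed by a sum along the first dimension.
    // The result is already laid out as a column-major m x n matrix, so the
    // final "reshape" step is only a reinterpretation of the buffer.
    std::vector<float> C(m * n, 0.0f);
    for (std::size_t out = 0; out < m * n; ++out) {
        for (std::size_t p = 0; p < k; ++p) {
            C[out] += At[p + k * out] * Bt[p + k * out];
        }
    }
    return C;
}
```

A batched variant would add one more tiled dimension for the batch index, multiplying the temporary storage by the batch size.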
Could I also request this feature |
CPU matmul seems to be done as of 3.6.2, right? |
Yes, updated the list. |
Do this inside a for loop with offsets. This will be needed for GFOR.

CUDA
[ ] dot

OpenCL
[ ] dot

CPU
[ ] dot

The first draft can be done inside a for loop using multiple queues / streams.

Note: dot product function has been removed from the above list since a batch dot product can be easily done using the matmul function.
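The "for loop with offsets" idea from the issue description can be sketched as follows. This is a plain C++ illustration under stated assumptions, not ArrayFire's implementation: the helpers `batched_matmul` and `batched_dot` are hypothetical names, the buffers are assumed to be flat and column-major with the batch index as the slowest dimension, and everything runs serially rather than on multiple queues or streams. The second helper shows why a separate batched dot product is not strictly needed: it is the 1 x len by len x 1 case of the batched product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ArrayFire API): multiplies `batch` independent
// column-major matrix pairs. A holds `batch` blocks of m*k floats, B holds
// `batch` blocks of k*n floats, and the result holds `batch` blocks of m*n.
std::vector<float> batched_matmul(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t m, std::size_t k,
                                  std::size_t n, std::size_t batch) {
    assert(A.size() == m * k * batch);
    assert(B.size() == k * n * batch);

    std::vector<float> C(m * n * batch, 0.0f);
    for (std::size_t b = 0; b < batch; ++b) {
        // Offsets of the b-th slice inside each flat buffer.
        const float* a = A.data() + b * m * k;
        const float* bm = B.data() + b * k * n;
        float* c = C.data() + b * m * n;

        // Ordinary 2D matrix product on the selected slices.
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t p = 0; p < k; ++p) {
                for (std::size_t i = 0; i < m; ++i) {
                    c[i + j * m] += a[i + p * m] * bm[p + j * k];
                }
            }
        }
    }
    return C;
}

// A batched dot product expressed through the same routine: each x slice is
// treated as a 1 x len row vector and each y slice as a len x 1 column.
std::vector<float> batched_dot(const std::vector<float>& x,
                               const std::vector<float>& y,
                               std::size_t len, std::size_t batch) {
    return batched_matmul(x, y, 1, len, 1, batch);
}
```

In a GPU backend, each iteration of the outer loop could be dispatched to its own queue or stream, since the slices are independent of one another.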