Fix GEMM for A^T * A, A * A^T, A * A... operation #36
Comments
Such collisions are usually caught at a higher level (as for x = prod(A, x);). This needs to be checked for CUDA and the CPU backend as well, so it's reasonable to assume different arguments in the kernel. A runtime check using assert() should nevertheless be applied, just in case.
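For illustration, the assert()-based runtime check suggested above could look roughly like the following CPU-side sketch. The `Matrix` type and the `same_handle` and `gemm` functions are hypothetical stand-ins, not ViennaCL types or API, and object identity is used here in place of a real memory handle comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix (illustration only, not a ViennaCL type).
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double&       operator()(std::size_t i, std::size_t j)       { return data[i * cols + j]; }
    double const& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Assumption: object identity stands in for "refers to the same memory handle".
inline bool same_handle(Matrix const& x, Matrix const& y) {
    return &x == &y;
}

// Computes C = A * B. The operands A and B may be the same object (A * A),
// but the result C must not alias either operand.
inline void gemm(Matrix const& A, Matrix const& B, Matrix& C) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    // Runtime safety net in case the higher-level check was bypassed.
    assert(!same_handle(A, C) && !same_handle(B, C));

    for (std::size_t i = 0; i < C.rows; ++i) {
        for (std::size_t j = 0; j < C.cols; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < A.cols; ++k) {
                sum += A(i, k) * B(k, j);
            }
            C(i, j) = sum;
        }
    }
}
```

In this sketch, calling `gemm(A, A, C)` is allowed because both operands are only read, while `gemm(A, B, A)` trips the assertion in debug builds.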
Well, as for the backend, I plan to check it when creating a custom_operation().
I'm not talking about the backend, I'm talking about the front-end, i.e. the point at which the user specifies the operation.
I see ... Hmmm, the problem with the generator is that it will generate a 2013/6/4 Karl Rupp notifications@github.com
In this case, A and A^T have different semantics in the kernel, but refer to the same handle and are considered equal by the generator... I am really not sure how to handle this. Plus, I'm pretty sure A * A^T and A * A can be implemented using a better kernel... Should I just forbid the handles of the LHS and RHS from being the same in that case (and in a later version dispatch to different kernels)? I will try to find a way to handle this, but this problem seems to lie deep down in the generator's structure... I had really not anticipated that the same handle could refer to two different ways of accessing memory in the same kernel!
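One possible direction, sketched here purely as an assumption about how such a generator might be organized, is to identify kernel arguments by the pair (handle, access pattern) instead of by the handle alone, so that A and A^T are no longer considered equal. `Handle`, `Operand`, `ArgumentRegistry`, and `operands_alias` are hypothetical names for illustration and do not reflect the actual generator code.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical: an opaque identifier for a memory buffer.
using Handle = std::size_t;

// Hypothetical operand description: which buffer, and how it is accessed.
struct Operand {
    Handle handle;
    bool transposed;
};

// Simplest policy from the discussion: detect LHS/RHS sharing a handle
// so that the caller can reject the operation.
inline bool operands_alias(Operand const& lhs, Operand const& rhs) {
    return lhs.handle == rhs.handle;
}

// Assigns one kernel argument name per distinct (handle, transposed) pair,
// so A and A^T receive separate names even though they share a buffer.
class ArgumentRegistry {
public:
    std::string const& name_for(Operand const& op) {
        std::pair<Handle, bool> key(op.handle, op.transposed);
        auto it = names_.find(key);
        if (it == names_.end()) {
            // The name is built before insertion, so size() is the old count.
            it = names_.emplace(key, "arg" + std::to_string(names_.size())).first;
        }
        return it->second;
    }

    std::size_t size() const { return names_.size(); }

private:
    std::map<std::pair<Handle, bool>, std::string> names_;
};
```

With this keying, `operands_alias` supports the "forbid for now" option, while the registry shows how the two access patterns could be kept apart if dispatching to dedicated kernels is added later.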