CUDA problems in causal linear product #58
Comments
Same issue here. When the data are put on a device other than the default GPU, the kernel returns all zeros. To reproduce the error:

```python
import torch
from fast_transformers.causal_product import causal_dot_product

q = k = v = torch.randn(5, 10, 10, 10).to(0)
print(causal_dot_product(q, k, v))  # this produces the right result
q = k = v = torch.randn(5, 10, 10, 10).to(1)
print(causal_dot_product(q, k, v))  # the output is all zeros
```
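When debugging issues like this, it helps to have a CPU reference to compare the CUDA kernel against. Below is a minimal pure-PyTorch sketch of the causal dot product (not part of the library; the function name `causal_dot_product_reference` is made up here). It materializes an `(E, E)` running sum per position, so it is only suitable for small tensors used in cross-checks.

```python
import torch

def causal_dot_product_reference(q, k, v):
    """Pure-PyTorch reference for the causal linear attention product:

        out[n, h, i, :] = sum_{j <= i} (q[n, h, i] . k[n, h, j]) * v[n, h, j]

    Intended only for cross-checking a fast kernel on small inputs.
    """
    # Outer products k_j v_j^T, then a causal prefix sum over the length dim.
    kv = torch.einsum("nhle,nhlm->nhlem", k, v).cumsum(dim=2)
    # Contract each query with its prefix sum of key-value outer products.
    return torch.einsum("nhle,nhlem->nhlm", q, kv)
```

Running this on the CPU copies of `q`, `k`, `v` and comparing with `torch.allclose` against the GPU output makes it easy to see which device produces wrong results.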
Hi @angeloskath!
@katie-cathy-hunt I will push a fix today. Sorry this took so long. Cheers,
@angeloskath |
@angeloskath I just rebuilt my environment to try your patch, but I am running into a new issue.
I can import fast_transformers, but if I try to import fast_transformers.causal_product I get the same error. I verified I had pulled your fix and that it is in the environment.
There are no errors in the build/install log.
Hmm, that is weird. What did you do to rebuild? Could I bother you to do a (Next step should be to provide prebuilt binaries for common setups to avoid all these issues.)
I thought I may have induced the error myself: I am using a conda environment with CUDA installed via conda, which only installs the shared libraries, not nvcc. Looking through your setup.py, it doesn't produce an error or message if it doesn't find nvcc. I then loaded the module to add CUDA 11 (the same version PyTorch is compiled against) into my path and verified it. I removed the build and dist directories, and still no luck.
This is on RHEL 8.2, Python 3.7.9, PyTorch 1.7.1
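Since a missing or mismatched nvcc silently produces a CPU-only build, a quick sanity check comparing the toolkit on `PATH` against the CUDA version PyTorch was built with can save a rebuild cycle. This is a small sketch (the helper names `parse_nvcc_release` and `toolkit_matches_torch` are made up for illustration):

```python
import re
import shutil
import subprocess

def parse_nvcc_release(output):
    """Extract the toolkit release (e.g. '11.0') from `nvcc --version` output."""
    m = re.search(r"release (\d+\.\d+)", output)
    return m.group(1) if m else None

def toolkit_matches_torch():
    """Return (nvcc_release, torch_cuda); either is None if unavailable."""
    nvcc = shutil.which("nvcc")
    release = None
    if nvcc:
        out = subprocess.run([nvcc, "--version"],
                             capture_output=True, text=True).stdout
        release = parse_nvcc_release(out)
    try:
        import torch
        torch_cuda = torch.version.cuda
    except ImportError:
        torch_cuda = None
    return release, torch_cuda
```

If the two versions differ (or `nvcc` is `None`), the extension will either fail to build or build without CUDA support.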
@angeloskath I apologize, everything is working correctly. I had started a Python REPL in the fast-transformers source directory after the install, so Python was picking up the local fast_transformers subdirectory first instead of the installed package. My mistake!
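This kind of shadowing is easy to detect by checking where an import would actually resolve from before debugging further. A minimal sketch (the helper name `describe_import` is made up here):

```python
import importlib.util
import os

def describe_import(name):
    """Report where `import name` would resolve from, to catch a local
    source tree shadowing the installed package. Returns (path, shadowed)
    or None if the module cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    origin = os.path.abspath(spec.origin)
    # If the origin lives under the current directory, a local copy wins.
    shadowed = origin.startswith(os.getcwd() + os.sep)
    return origin, shadowed
```

Calling `describe_import("fast_transformers")` from the source checkout would have shown the local path rather than the site-packages install.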
Hi,
My machine has 4 GPUs, but when I use GPU 1 (the default GPU being 0), I find the CUDA code is still computed on GPU 0. Also, the code cannot run when I use multiple GPUs at once; there is an out-of-memory error.