TypeError exception in AxialPositionalEncoding when using DataParallel #17
Comments
Nice! I'll fix this by tomorrow! (Both of my GPUs are in use at the moment) What are you using it for?
@kl0211 oh no pytorch/pytorch#36035 No worries, I'll think of something
@lucidrains I tried your latest commit. That let the example run to completion. However, I then tried running it on the enwik8_simple script and I got a different error:
Seems like AxialPositionalEncoding is not properly replicating across different GPUs.
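The failure mode described here can be illustrated with a minimal sketch (the classes below are hypothetical stand-ins, not the actual AxialPositionalEncoding code): DataParallel's replicate step copies registered parameters and buffers to each device, so a weight that is created lazily and stored as a plain attribute is still missing on the replicas when their forward passes run.

```python
import torch
from torch import nn

class BrokenPosEmb(nn.Module):
    # Hypothetical stand-in for the failure mode: the weight is created
    # lazily on the first forward pass and stored as a plain attribute,
    # so DataParallel's replicate() never copies it to the other devices.
    def __init__(self, dim):
        super().__init__()
        self.dim = dim
        self.weights = None  # not registered as an nn.Parameter

    def forward(self, x):
        if self.weights is None:
            self.weights = torch.randn(x.shape[1], self.dim)
        return x + self.weights

class FixedPosEmb(nn.Module):
    # Registering the weight eagerly in __init__ lets DataParallel
    # replicate it like any other parameter.
    def __init__(self, seq_len, dim):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(seq_len, dim))

    def forward(self, x):
        return x + self.weights

broken = BrokenPosEmb(dim=8)
fixed = FixedPosEmb(seq_len=4, dim=8)

# The broken module exposes no parameters before its first forward pass,
# so each replica starts with self.weights still set to None.
print(len(list(broken.parameters())))  # 0
print(len(list(fixed.parameters())))   # 1
```

If the module needs a fixed maximum sequence length anyway, registering the embedding eagerly (as in `FixedPosEmb`) sidesteps the replication problem entirely.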
I'm looking into using Sinkhorn (or Reformer) to create document embeddings. I want to see how it compares with embedding sentences (with sentence-transformers, for example) and merging them.
@kl0211 oh ok, I put in a temporary fix, should work now! very cool! I'd like to know how that turns out!
@kl0211 you should try DeepSpeed. DataParallel actually doesn't give you a very big speed-up
@lucidrains, looks like your fix got it to work! Thanks a bunch!
Cool! I'll see if I can try it out. Thanks for the tip!
@kl0211 do share your results! this repository is still in the exploratory phase! |
Hello,
I want to run SinkhornTransformerLM using multiple GPUs, so I'm wrapping the model into torch.nn.DataParallel. However, when I do this, I get an exception:
Looking at the code, it would seem that `self.weights` does not get populated. To reproduce this error, I took the first example in README.md and changed it to: