pytorch was blocked at loss.backward #16654
Comments
Almost the same issue, waiting for response...
You both are probably running into a distributed deadlock bug.
Hi,
I seem to run into this problem with PyTorch 1.2.0 (8 August 2019). I'm sharing a PyTorch neural network model between a main thread which My main thread gets stuck (deadlocked) on More details: This happens every time, regardless of the seed. Ctrl+C only stops the 3 worker threads but not the main thread (strange!), so I cannot get a stack trace. This also seems to happen only with 3 or more workers. See also my question on StackOverflow: https://stackoverflow.com/questions/57940151/pytorch-sharing-a-model-between-threads-do-i-need-to-lock-it-myself
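For anyone else who cannot get a stack trace out of a hung process with Ctrl+C, a minimal sketch using Python's standard `faulthandler` module may help. It is not specific to PyTorch and does not fix the deadlock; it only shows where each thread is stuck at the Python level (C++ frames are not included). The helper names `dump_all_thread_stacks`, `arm_watchdog`, `disarm_watchdog`, and `stuck_worker` are made up for this illustration, and the 60-second timeout in the usage note is an arbitrary assumption.

```python
import faulthandler
import sys
import tempfile
import threading


def dump_all_thread_stacks(file=None):
    """Write the current Python stack of every thread to `file` (default: stderr)."""
    if file is None:
        file = sys.stderr
    faulthandler.dump_traceback(file=file, all_threads=True)


def arm_watchdog(timeout_s, file=None):
    """Dump all thread stacks if the watchdog is not cancelled within timeout_s seconds."""
    if file is None:
        file = sys.stderr
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)


def disarm_watchdog():
    """Cancel a pending watchdog dump, e.g. after the suspect call returned."""
    faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # Hypothetical stand-in for a worker thread that never makes progress.
    ready = threading.Event()
    stop = threading.Event()

    def stuck_worker():
        ready.set()
        stop.wait()  # blocks until the main thread releases it

    worker = threading.Thread(target=stuck_worker, daemon=True)
    worker.start()
    ready.wait()

    # faulthandler writes to the file descriptor, so a real file is required.
    with tempfile.TemporaryFile(mode="w+") as f:
        dump_all_thread_stacks(f)
        f.seek(0)
        print(f.read())

    stop.set()
    worker.join()
```

In a training script, one could call `arm_watchdog(60)` just before the call that hangs (here, `loss.backward()`) and `disarm_watchdog()` right after it returns. If the call blocks for longer than the timeout, the stacks of all threads are written to stderr without needing Ctrl+C.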
I got the issue to disappear by removing a
The PyTorch version is 0.4.0.
The Python call stack is as follows:
The C++ call stack is as follows:
Can anybody solve this problem?