New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DDP training question #1096
Labels
Comments
and my code is here
|
Hi, do you have a problem with the application getting stuck after starting multiple nodes? On my side, too, running the official multi-node example would get stuck |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi, I'm using the tutorial https://github.com/pytorch/tutorials/blob/master/intermediate_source/ddp_tutorial.rst for DDP train,using 4 gpus in myself code, reference Basic Use Case. But when I finished the modification, it was stuck during run the demo,meanwhile,video memory has been occupied.Could you help me?
The text was updated successfully, but these errors were encountered: