This issue was moved to a discussion.

Training with PyTorch DDP: after an OOM occurs in one trial, the next trial does not proceed #5190

Closed
Qianshaowei opened this issue Jan 17, 2024 · 0 comments

Comments


Qianshaowei commented Jan 17, 2024

Thanks for your work.
I am training a model using PyTorch's DDP mode. When an out-of-memory (OOM) error occurs during a trial, the next trial does not proceed.
How can I solve this problem?

@nzw0301 converted this issue into discussion #5191 on Jan 17, 2024
