[Ray Train] Explain how to set timeout when using PyTorch Lightning Trainer #36315
Comments
Yeah, that's a problem, and I think this is one of the action items for unifying configurations between Lightning and AIR. Currently we have two sets of configurations for Lightning and Ray AIR (checkpoint configs, backend configs, and scaling configs), which makes it hard for users to figure out the right place to provide them. The ideal state, in my mind, is that the user only needs to provide the Lightning config, and we'll create a corresponding AIR config for them. But there are still many details to consider. I can draft a proposal on this topic later.
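To make the "two sets of configuration" concrete, here is a minimal sketch of the two places a collective-op timeout can be supplied in the AIR-era API. The parameter names (`TorchConfig(timeout_s=...)` taking seconds, and the `timeout=` keyword forwarded to Lightning's `DDPStrategy` taking a `timedelta`) follow the docs as I understand them at the time of this thread; treat them as assumptions, not a definitive recipe.

```python
from datetime import timedelta

# 1) Ray-side knob: TorchConfig(timeout_s=...) is (assumed to be) what Ray
#    passes to torch.distributed.init_process_group().
TORCH_TIMEOUT_S = 3600

# 2) Lightning-side knob: DDPStrategy(timeout=timedelta(...)) is what
#    Lightning would use for its own process-group setup. Note the unit
#    mismatch: seconds here, a timedelta there.
LIGHTNING_TIMEOUT = timedelta(seconds=TORCH_TIMEOUT_S)


def build_trainer():
    # Imports kept inside the function so the sketch can be read (and the
    # constants above checked) without Ray or Lightning installed.
    from ray.air.config import ScalingConfig
    from ray.train.lightning import LightningConfigBuilder, LightningTrainer
    from ray.train.torch import TorchConfig

    lightning_config = (
        LightningConfigBuilder()
        # .module(...) / .trainer(...) elided for brevity
        .strategy(name="ddp", timeout=LIGHTNING_TIMEOUT)
        .build()
    )
    return LightningTrainer(
        lightning_config=lightning_config,
        scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
        # Assumption per this thread: this is the timeout Ray actually honors.
        torch_config=TorchConfig(timeout_s=TORCH_TIMEOUT_S),
    )
```

The duplication above is exactly the confusion the comment describes: two knobs, two units, and only one of them effective.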
Hi, I'm a bot from the Ray team :) To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months. If there is no further activity in the next 14 days, the issue will be closed!
You can always ask for help on our discussion forum or Ray's public Slack channel.
This should be straightforward in 2.7? @woshiyyya
Actually not. For the new API, they still need to specify timeouts separately. I think we still need to update the docstring for `RayDDPStrategy` to clarify this.
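A sketch of what this looks like with the Ray 2.7+ API, as I understand it: the process group is created by Ray before Lightning runs, so the timeout belongs on `TorchConfig(timeout_s=...)`, not on `RayDDPStrategy`. All names below (`RayDDPStrategy`, `RayLightningEnvironment`, `prepare_trainer`, `timeout_s`) are my reading of the 2.7-era docs and should be double-checked; the model/data wiring is elided.

```python
from datetime import timedelta  # noqa: F401  (kept for comparison with DDPStrategy's timedelta timeout)

# Hypothetical desired timeout, in seconds.
TIMEOUT_S = 3600


def train_func():
    # Per-worker Lightning code. Any `timeout=` passed to RayDDPStrategy
    # here would likely be ignored, because Ray has already initialized the
    # process group by the time Lightning runs -- the point of this thread.
    import lightning.pytorch as pl
    from ray.train.lightning import (
        RayDDPStrategy,
        RayLightningEnvironment,
        prepare_trainer,
    )

    trainer = pl.Trainer(
        strategy=RayDDPStrategy(),
        plugins=[RayLightningEnvironment()],
        accelerator="auto",
        devices="auto",
    )
    trainer = prepare_trainer(trainer)
    # trainer.fit(model, datamodule=...) elided


def build_trainer():
    from ray.train import ScalingConfig
    from ray.train.torch import TorchConfig, TorchTrainer

    return TorchTrainer(
        train_func,
        scaling_config=ScalingConfig(num_workers=2),
        # Assumption: this is the timeout applied to init_process_group().
        torch_config=TorchConfig(timeout_s=TIMEOUT_S),
    )
```

If this reading is right, the `RayDDPStrategy` docstring fix suggested above would simply point users at `TorchConfig` for timeouts.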
Description
It seems that users need to set `timeout` using `TorchConfig`. It's not clear whether I should set it in `ray.train.lightning.LightningConfigBuilder.strategy` or in `TorchConfig`.
Link
No response