Add env support for the training script argument #128
Comments
Thanks for the feedback, this should be simple enough to add. If we follow a convention of making the env var setting as
LGTM, thanks for the support.
…name PET_ARG
Summary: See: pytorch#128
Allows users to specify program args via env var as such:
```
PET_NNODES="1:2" python -m torchelastic.distributed.launch \
    --rdzv_id 123 \
    my_script.py script_args
```
Differential Revision: D24098270
fbshipit-source-id: e6cccaecc852840a44fc941017bf361a4769b698
I had to use
Positionals (the script and its args) cannot be set via env vars, so you can't do something like this:
Flags can also be set via env vars:
…name PET_ARG (#129)
Summary: Pull Request resolved: #129
See: #128
Allows users to specify program args via env var as such:
```
PET_NNODES="1:2" python -m torchelastic.distributed.launch \
    --rdzv_id 123 \
    my_script.py script_args
```
Reviewed By: yifuwang
Differential Revision: D24098270
fbshipit-source-id: 5501c4331939df468fba5811f7b7e3b74e100da3
The PR was merged and released as part of torchelastic 0.2.1.
Description
A common use of the elastic tooling is as below:
Would it be possible to support reading args such as nnodes, rdzv_id, rdzv_backend, rdzv_endpoint ... from env vars like $NUM_NODES, $JOB_ID,
$RDZV_BACKEND, $RDZV_ENDPOINT, so that they do not need to be passed as command-line args?
Motivation/Background
This would make the elastic tooling fit more smoothly into k8s, since the controller's reconcile logic would no longer need to build the args itself.
Detailed Proposal
One possible approach is to add env var support in torchelastic/distributed/launch.py: if an arg is not present on the command line, the launcher looks it up in the environment.
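A rough sketch of that fallback, written with plain `argparse` and not taken from the actual launch.py code. It assumes the `PET_<ARG>` naming used in the commit referenced in this thread (e.g. `PET_NNODES`); the helper names `env_default` and `build_parser` are hypothetical, and only the four options named in this issue are wired up:

```python
import argparse
import os

ENV_PREFIX = "PET_"  # assumption: mirrors PET_NNODES from the commit message


def env_default(option, fallback=None):
    """Return the value of PET_<OPTION> if it is set, else the fallback."""
    return os.environ.get(ENV_PREFIX + option.upper(), fallback)


def build_parser():
    """Build a parser whose option defaults come from the environment.

    Precedence: command-line value, then env var, then None.
    The environment is read once, when the parser is built.
    """
    parser = argparse.ArgumentParser(description="env-aware launcher sketch")
    for option in ("nnodes", "rdzv_id", "rdzv_backend", "rdzv_endpoint"):
        parser.add_argument("--" + option, default=env_default(option))
    # Positionals stay command-line only; they get no env fallback.
    parser.add_argument("training_script")
    parser.add_argument("training_script_args", nargs=argparse.REMAINDER)
    return parser
```

With `PET_NNODES="1:2"` exported, `build_parser().parse_args(["--rdzv_id", "123", "my_script.py", "script_args"])` picks up `nnodes` from the environment and `rdzv_id` from the command line. An explicit `--nnodes` still overrides the env var, which keeps existing invocations unchanged.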
Alternatives
Additional context/links