If the seed is not set in hparams, it is randomly selected in `__init__`. As a result, each DDP process gets a different random seed when it starts up.
Only the seed from the rank 0 process is saved in checkpoints.
When resuming from a checkpoint, the rank 0 seed is restored on all DDP processes.
This leads to inconsistent behavior, since the non-rank-0 processes now resume with a different seed than the one they first trained with.
To fix: add the seed to the RNG state, and sync it across all DDP processes.
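
A minimal sketch of one way to read the proposed fix, not the project's actual code: it assumes the checkpoint's RNG state is a list with one entry per rank, so each rank's seed can be stored next to its RNG state and restored per rank. The names `make_rank_seed`, `save_checkpoint`, and `restore_seed`, and the checkpoint layout, are hypothetical. The ranks are simulated in a single process here; in a real DDP job the per-rank seeds would have to be gathered first, for example with `torch.distributed.all_gather_object`.

```python
import random

def make_rank_seed(hparams_seed=None):
    """Return the configured seed, or pick a random one as described above."""
    if hparams_seed is not None:
        return hparams_seed
    return random.SystemRandom().randrange(2**32)

def save_checkpoint(per_rank_seeds):
    """Store every rank's seed in the RNG state (hypothetical layout).

    Entry i of the "rng" list belongs to rank i, so no rank's seed is lost.
    """
    return {"rng": [{"seed": seed} for seed in per_rank_seeds]}

def restore_seed(checkpoint, rank):
    """On resume, each rank reads back its own seed rather than rank 0's."""
    return checkpoint["rng"][rank]["seed"]

# Simulate a 4-process job where no seed was set in hparams.
world_size = 4
seeds = [make_rank_seed() for _ in range(world_size)]
checkpoint = save_checkpoint(seeds)
restored = [restore_seed(checkpoint, rank) for rank in range(world_size)]
```

With this layout every rank resumes with the seed it first trained with. An alternative reading of "sync" is to broadcast the rank 0 seed to all processes at startup, so that there is only ever one seed to save and restore.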