How can I use multi-GPU to train UNMT? #23
Another question: in the UNMT model, is there only one encoder and one decoder? Thanks.
You should not handle the multi-GPU setup yourself; just start the training through `python -m torch.distributed.launch --nproc_per_node=$NGPU train.py` as described in the README, and the script takes care of the rest.

And no, there are 2 separate models for UNMT: one encoder and one decoder. They are initialized with the same weights (apart from the parameters of the source attention in the decoder, which remain randomly initialized).
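To make that initialization concrete, here is a minimal sketch of the weight sharing; the helper name and the match-by-name rule over plain `nn.Module` state dicts are illustrative assumptions, not XLM's actual loading code:

```python
import torch

# Sketch: initialize the decoder from the pretrained encoder weights.
# Parameters with an encoder counterpart of the same shape are copied;
# decoder-only parameters (e.g. the source attention) keep their random
# initialization, as described above.
@torch.no_grad()
def init_decoder_from_encoder(encoder, decoder):
    enc_state = encoder.state_dict()
    dec_state = decoder.state_dict()
    for name, tensor in dec_state.items():
        if name in enc_state and enc_state[name].shape == tensor.shape:
            tensor.copy_(enc_state[name])
    decoder.load_state_dict(dec_state)
```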
I am using multi-GPU to pre-train the model as described, but the training time is the same as with a single GPU.
What do you mean by "the training time is the same"? Is the perplexity the same at the end of a few epochs, or are you looking at the number of words per second? The number of words per second in the log is given per GPU, so that will be the same; but the loss / perplexity should decrease much faster.
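In other words, the per-GPU number in the log stays flat while the aggregate throughput scales with the number of workers. A toy illustration (all numbers are made up):

```python
# The log reports words/sec per worker, so the effective training
# throughput with N GPUs is roughly N times the logged value.
n_gpus = 8           # assumed number of workers (illustrative)
logged_wps = 5000    # words/sec printed in the log, per GPU (made up)
effective_wps = n_gpus * logged_wps  # ~40000 words/sec overall
print(effective_wps)
```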
Looks good :)
I added `--local_rank`, but it raises an error:
```
SLURM job: False
Traceback (most recent call last):
  File "train.py", line 322, in <module>
    main(params)
  File "train.py", line 198, in main
    init_distributed_mode(params)
  File "XLM/src/slurm.py", line 110, in init_distributed_mode
    params.global_rank = int(os.environ['RANK'])
  File "/usr/lib/python3.5/os.py", line 725, in __getitem__
    raise KeyError(key) from None
KeyError: 'RANK'
```
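The `KeyError` happens because `RANK` (like `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT`) is an environment variable exported by `python -m torch.distributed.launch` for each worker; passing `--local_rank` by hand does not set it. A minimal sketch of the same check with a friendlier failure (the helper name is made up; only the `os.environ` keys match the traceback):

```python
import os
import sys

def read_distributed_env():
    # These variables are exported by `python -m torch.distributed.launch`;
    # they are absent when train.py is started directly with a hand-written
    # --local_rank, which is exactly what the KeyError above means.
    try:
        rank = int(os.environ['RANK'])
        world_size = int(os.environ['WORLD_SIZE'])
    except KeyError as e:
        sys.exit("missing environment variable %s: start the script with "
                 "`python -m torch.distributed.launch`" % e)
    return rank, world_size
```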