torch==1.10.2+cu113
transformers==4.18.0
Python 3.6.9
Ubuntu 18.04.6 LTS (Bionic Beaver)
I am training a T5 transformer (T5ForConditionalGeneration.from_pretrained(model_params["MODEL"])) to generate text. The model trains fine on a single GPU, but when I parallelize training across several GPUs with model = nn.DataParallel(model), I can no longer save the model.
The error is:
File "run.py", line 288, in T5Trainer
model.save_pretrained(path)
File "/home/USER_NAME/venv/pt_110/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1178, in __getattr__
type(self).__name__, name))
AttributeError: 'DataParallel' object has no attribute 'save_pretrained'
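The error follows from how attribute lookup works on nn.Module: nn.DataParallel is itself an nn.Module that stores the original model as its .module submodule, and nn.Module.__getattr__ only falls back to registered parameters, buffers, and submodules, so methods defined on the wrapped class (like save_pretrained) are not forwarded. A minimal sketch, using a hypothetical Toy module as a stand-in for T5ForConditionalGeneration:

```python
import torch.nn as nn

class Toy(nn.Module):
    """Hypothetical stand-in for T5ForConditionalGeneration."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 2)

    def save_pretrained(self, path):
        """The method that DataParallel will not forward."""
        pass

wrapped = nn.DataParallel(Toy())
print(hasattr(wrapped, "save_pretrained"))         # False: not forwarded by the wrapper
print(hasattr(wrapped.module, "save_pretrained"))  # True: lives on the inner model
```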
Reproduction
Wrap the model with model = nn.DataParallel(model).
Expected behavior
The model should be saved without any issues.
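The usual workaround (not shown in the report) is to save the underlying model via the wrapper's .module attribute rather than the DataParallel object itself. A sketch, with a hypothetical unwrap_model helper:

```python
import torch.nn as nn

def unwrap_model(model):
    """Return the underlying model, whether or not it is wrapped in DataParallel."""
    return model.module if isinstance(model, nn.DataParallel) else model

# In T5Trainer, the failing call would then become:
# unwrap_model(model).save_pretrained(path)
```

This keeps the saving code working both for single-GPU runs (no wrapper) and multi-GPU runs (DataParallel wrapper).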