Describe the bug
The model runs on a single GPU but fails when launched on multiple GPUs.
Basic environments:
Python version: 3.8.5
Is CUDA available: Yes
CUDA runtime version: 11.1
Nvidia driver version: 455.23.04
To work around the problem with
`x_masks = make_non_pad_mask(ilens).to(next(self.parameters()).device)`
in FastSpeech2, I changed the code to
`x_masks = make_non_pad_mask(ilens).to(xs.device)`
However, I still hit a problem similar to the one described in https://discuss.pytorch.org/t/dimension-problem-by-multiple-gpus/76075.
![image](https://user-images.githubusercontent.com/45890541/161193467-4585c693-5733-49f9-bc7c-342d339c394f.png)
The error log is:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/bme2/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/bme2/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data/data-lhy/emg2speech/05_Mandarin_dataset/fastspeech2/train.py", line 249, in <module>
    output = model(xs=xs, ilens=ilens, ys=ys, olens=olens, ds=ds, ps=ps, es=es,spembs=spembs)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 162, in forward
    return self.gather(outputs, self.output_device)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 174, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/_functions.py", line 71, in forward
    return comm.gather(inputs, ctx.dim, ctx.target_device)
  File "/home/bme2/miniconda3/lib/python3.8/site-packages/torch/nn/parallel/comm.py", line 230, in gather
    return torch._C._gather(tensors, dim, destination)
RuntimeError: Input tensor at index 1 has invalid shape [4, 170, 80], but expected [4, 303, 80]
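
One plausible reading of the `RuntimeError` above, offered as a sketch and not a confirmed diagnosis: `DataParallel` splits the batch along dim 0, and if each replica trims or pads its output to the longest sequence in its own chunk, the per-replica outputs differ in the time dimension and cannot be concatenated. The pure-Python model below tracks shapes only (no tensors). The batch size of 8, the two replicas, the `olens` values, and all function names are hypothetical and chosen only to reproduce the shapes in the error message.

```python
# Shape-only model of DataParallel scatter/gather; no real tensors are used.

def scatter_lengths(olens, num_replicas):
    """Split per-utterance lengths into contiguous chunks, one per replica."""
    chunk = -(-len(olens) // num_replicas)  # ceiling division
    return [olens[i:i + chunk] for i in range(0, len(olens), chunk)]

def replica_output_shape(chunk_olens, n_mels, pad_to=None):
    """Output shape of one replica: trimmed to its own longest sequence
    (pad_to=None) or padded to a shared fixed length (pad_to=int)."""
    t = max(chunk_olens) if pad_to is None else pad_to
    return (len(chunk_olens), t, n_mels)

def can_gather(shapes):
    """Concatenating along dim 0 requires all other dims to match."""
    return len({s[1:] for s in shapes}) == 1

# Hypothetical batch: 8 utterances, 2 GPUs, 80 mel bins.
olens = [303, 250, 212, 190, 170, 160, 150, 120]
chunks = scatter_lengths(olens, num_replicas=2)

# Each replica trims to its own chunk maximum: time dims differ (303 vs 170).
trimmed = [replica_output_shape(c, 80) for c in chunks]
# Each replica pads to the same global maximum: time dims agree.
padded = [replica_output_shape(c, 80, pad_to=max(olens)) for c in chunks]

print(trimmed, can_gather(trimmed))
print(padded, can_gather(padded))
```

If this reading applies, two options are to pad every replica's output to one fixed length before `forward` returns, or to compute the loss inside `forward` so that only values of matching shape are gathered. Either would need to be checked against the actual `train.py`.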