Is DDP enabled for speaker identification using wav2vec2? #479

Sreeni1204 · 2023-05-02T10:56:41Z

Hello,

I am trying to fine-tune the wav2vec2 model for speaker identification on a custom dataset based on VoxCeleb. I tried to enable DDP for this task and ran into the following issue.

I am trying to use 4 GPUs and below is the error:

run_downstream.py: error: unrecognized arguments: --local-rank=2
run_downstream.py: error: unrecognized arguments: --local-rank=3
run_downstream.py: error: unrecognized arguments: --local-rank=1
run_downstream.py: error: unrecognized arguments: --local-rank=0
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 22843) of binary: /home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/bin/python
Traceback (most recent call last):
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/launch.py", line 196, in <module>
    main()
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/launch.py", line 192, in main
    launch(args)
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/launch.py", line 177, in launch
    run(args)
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/shikkalven/jupyter-conda-base-environment/conda/envs/W2v2/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

Please support.

jingru-lin · 2023-06-23T05:14:13Z

Hi, I have the same problem. Has this been solved?

hank0316 · 2023-07-13T08:05:30Z

Hi, I solved it by downgrading my torch version from 2.0.1 to 1.13.0. You can refer to this page for instructions on installing older versions of torch/torchaudio.

https://pytorch.org/get-started/previous-versions/

raotnameh · 2023-11-27T11:20:41Z

Check how the local rank argument is defined in run_downstream.py. The launcher may be passing --local-rank while the script expects --local_rank, or the other way around.

kaen2891 · 2024-02-16T01:51:52Z

I solved this problem by changing --local_rank to --local-rank in run_downstream.py.
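
For anyone who needs the script to work with both older and newer torch versions, one option is to register both spellings on the same argparse option. The snippet below is only a minimal sketch of that idea, not the actual parser code from run_downstream.py: the function name `build_parser` and the `default=None` choice are assumptions made for illustration. It relies on torch.distributed.launch passing the dashed `--local-rank` from torch 2.0 onwards and the underscored `--local_rank` before that.

```python
import argparse


def build_parser():
    # Hypothetical helper for illustration; run_downstream.py defines its
    # own parser with many more options.
    parser = argparse.ArgumentParser()
    # Register both spellings as aliases of one option. Either form
    # ends up in args.local_rank because dest is set explicitly.
    parser.add_argument(
        "--local_rank",
        "--local-rank",
        dest="local_rank",
        type=int,
        default=None,  # assumption: None means "not launched with DDP"
    )
    return parser


if __name__ == "__main__":
    # Simulate what the launcher passes under torch >= 2.0 and torch < 2.0.
    new_style = build_parser().parse_args(["--local-rank=2"])
    old_style = build_parser().parse_args(["--local_rank=3"])
    print(new_style.local_rank, old_style.local_rank)
```

With this kind of definition, neither downgrading torch nor renaming the flag by hand should be necessary, though it has only been reasoned through here and not tested against this repository.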
