
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with exit code 3221225477 #24

Closed
1635200412 opened this issue Jun 19, 2024 · 1 comment

Comments

@1635200412

"I want to try training with zluda, but I always encounter the error 'torch.multiprocessing.spawn.ProcessRaisedException:'. I found that this is when using torch.nn.parallel.DistributedDataParallel. The program will exit automatically and there will be no logs about DistributedDataParallel. I have tried increasing virtual memory and running as an administrator, but neither has solved the problem."

Traceback (most recent call last):
  File "D:\GPT-SoVITS-beta0217\GPT_SoVITS\s2_train.py", line 600, in <module>
    main()
  File "D:\GPT-SoVITS-beta0217\GPT_SoVITS\s2_train.py", line 56, in main
    mp.spawn(
  File "D:\GPT-SoVITS-beta0217\runtime\lib\site-packages\torch\multiprocessing\spawn.py", line 239, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "D:\GPT-SoVITS-beta0217\runtime\lib\site-packages\torch\multiprocessing\spawn.py", line 197, in start_processes
    while not context.join():
  File "D:\GPT-SoVITS-beta0217\runtime\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "D:\GPT-SoVITS-beta0217\runtime\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "D:\GPT-SoVITS-beta0217\GPT_SoVITS\s2_train.py", line 85, in run
    train_dataset = TextAudioSpeakerLoader(hps.data) ########
  File "D:\GPT-SoVITS-beta0217\GPT_SoVITS\module\data_utils.py", line 35, in __init__
    assert os.path.exists(self.path2)
AssertionError

Traceback (most recent call last):
  File "train.py", line 329, in <module>
    main()
  File "train.py", line 44, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "D:\so-vits-svc_2\so-vits-svc\workenv\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "D:\so-vits-svc_2\so-vits-svc\workenv\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
    while not context.join():
  File "D:\so-vits-svc_2\so-vits-svc\workenv\lib\site-packages\torch\multiprocessing\spawn.py", line 149, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with exit code 3221225477
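
A note on the second traceback: the exit code 3221225477 is easier to read in hexadecimal, where it is the Windows status value 0xC0000005 (STATUS_ACCESS_VIOLATION). That suggests the child process crashed in native code rather than raising a Python exception, so it is probably a different failure from the AssertionError in the first traceback. A minimal sketch for decoding such codes follows; the helper name describe_windows_exit_code and the small lookup table are illustrative assumptions, not part of PyTorch or ZLUDA.

```python
# Hypothetical helper: translate a Windows process exit code into a readable
# NTSTATUS name. The table lists only a few well-known values.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_windows_exit_code(code: int) -> str:
    # Some tools report the code as a signed 32-bit integer, so normalize it
    # to the unsigned form before looking it up.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "unknown status")
    return f"0x{unsigned:08X} ({name})"


print(describe_windows_exit_code(3221225477))
```

An access violation gives no Python stack, so the traceback above only shows where the parent process noticed that the child had died.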

@lshqqytiger added the bug label Jun 19, 2024
@lshqqytiger removed the bug label Jul 14, 2024
@lshqqytiger
Owner

This doesn't seem to be a ZLUDA issue.
Please check whether self.path2 exists.

assert os.path.exists(self.path2)
AssertionError
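
The bare assert fails without saying which path was missing. One way to find out is to run the check with an explicit error message before training starts. This is only a sketch: require_path and its label argument are hypothetical names, and what self.path2 actually points to depends on the GPT-SoVITS configuration, which this thread does not show.

```python
import os


def require_path(path: str, label: str) -> str:
    """Return the absolute form of `path`, or raise an error that names it."""
    resolved = os.path.abspath(path)
    if not os.path.exists(resolved):
        # Include the working directory, since relative paths are resolved
        # against it and it may differ inside a spawned worker process.
        raise FileNotFoundError(
            f"{label} not found: {resolved} (cwd: {os.getcwd()})"
        )
    return resolved


# Example with a path that always exists; substitute the value that
# self.path2 would hold in data_utils.py.
print(require_path(".", "working directory"))
```

If the reported path is relative, checking it against the printed working directory usually shows whether the training script was started from the wrong folder or an earlier preprocessing step did not produce its output.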

@lshqqytiger closed this as not planned Jul 14, 2024