RuntimeError: context has already been set(multiprocessing) #3492

Closed
zhoumingyi opened this issue Nov 5, 2017 · 10 comments

Comments

@zhoumingyi

I use the spawn start method to share CUDA tensors between processes:

import torch
torch.multiprocessing.set_start_method("spawn")
import torch.multiprocessing as mp


def sub_processes(A, B, D, i, j, size):

    D[(j * size):((j + 1) * size), i] = torch.mul(B[:, i], A[j, i])


def task(A, B):
    size1 = A.shape
    size2 = B.shape
    D = torch.zeros([size1[0] * size2[0], size1[1]]).cuda()
    D.share_memory_()

    for i in range(1):
        processes = []
        for j in range(size1[0]):
            p = mp.Process(target=sub_processes, args=(A, B, D, i, j, size2[0]))
            p.start()
            processes.append(p)
        for p in processes:
            p.join()

    return D

A = torch.rand(3, 3).cuda()
B = torch.rand(3, 3).cuda()
C = task(A,B)
print(C)

It returns a wrong result and raises an error:

/usr/bin/python3.5 /home/mingyi/桌面/test/test.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set

    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
[torch.cuda.FloatTensor of size 9x3 (GPU 0)]


Process finished with exit code 0
@apaszke
Contributor

apaszke commented Nov 5, 2017

When using spawn, you should guard the part that launches the job with if __name__ == '__main__':. The set_start_method call should also go there, and then everything will run fine.
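The traceback above shows why the guard matters: with spawn, each child re-imports the main script, so a top-level set_start_method call runs again in every child. A minimal sketch of the guarded layout follows. It uses the standard-library multiprocessing module (torch.multiprocessing is documented as a drop-in wrapper around it) and CPU-only work so that it runs without a GPU. The names scale, worker and main are hypothetical and not taken from the original script.

```python
import multiprocessing as mp


def scale(values, factor):
    # Pure helper standing in for the per-row tensor work in the original script.
    return [factor * v for v in values]


def worker(j, factor):
    # Runs in a child process. With spawn, the child re-imports this module
    # first, so nothing at module top level may start processes or set the
    # start method.
    print(f"row {j}: {scale([0, 1, 2], factor)}")


def main():
    # Executed once, in the parent only, because of the guard below.
    mp.set_start_method("spawn")
    processes = [mp.Process(target=worker, args=(j, 2)) for j in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()


if __name__ == "__main__":
    main()
```

When the children re-import the file, its name is __mp_main__ rather than __main__, so main() is skipped there and set_start_method is never called a second time.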

@apaszke apaszke closed this as completed Nov 5, 2017
@franciscorubin

I have the same problem, and the solution provided by @apaszke doesn't work for me.

@naba89

naba89 commented Apr 19, 2018

Hi @pancho111203,

You might have other files in your project that also have an if __name__ == '__main__': block. One workaround is to call set_start_method with the force argument: set_start_method('forkserver', force=True). This solved the issue for me.

Regards
Nabarun

@chrischute

chrischute commented May 29, 2018

Even inside the if __name__ == '__main__' block I saw the runtime error described above. Using force=True caused me to leak semaphores. This was the right solution for me:

try:
    mp.set_start_method('spawn')
except RuntimeError:
    pass
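One caveat with swallowing the exception: the call may have failed because a different start method was fixed earlier, and the program then keeps running with that one. The sketch below reports whether the requested method is actually in effect. ensure_start_method is a hypothetical helper name, and the code uses the standard-library module that torch.multiprocessing wraps.

```python
import multiprocessing as mp


def ensure_start_method(method="spawn"):
    """Try to set the start method; return True if `method` is in effect."""
    try:
        mp.set_start_method(method)
    except RuntimeError:
        # A start method was already fixed, possibly a different one.
        pass
    # Safe to query here: the start method is fixed either way by now.
    return mp.get_start_method() == method


if __name__ == "__main__":
    if not ensure_start_method("spawn"):
        print("warning: running with", mp.get_start_method())
```

Calling the helper repeatedly is harmless, which is the property the try/except pattern is meant to provide.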

@KiddoZhu

I found a solution, which is to use a context object in multiprocessing.

ctx = multiprocessing.get_context("spawn")

And then replace multiprocessing.xxx with ctx.xxx.
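A short sketch of the context-object approach, assuming the standard-library multiprocessing module (torch.multiprocessing exposes the same get_context function). The worker and launch names are hypothetical. The context is local to the caller, so no global start method is set and the RuntimeError cannot occur.

```python
import multiprocessing as mp


def worker(j):
    # Runs in a child process created with the spawn start method.
    print(f"worker {j} running")


def launch(num_workers=2):
    # get_context returns a context object bound to "spawn";
    # the global default start method is left untouched.
    ctx = mp.get_context("spawn")
    processes = [ctx.Process(target=worker, args=(j,)) for j in range(num_workers)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return [p.exitcode for p in processes]


if __name__ == "__main__":
    print(launch())
```

Queues, locks and pools should be created from the same ctx object (ctx.Queue(), ctx.Lock(), ctx.Pool()) so that they match the processes that use them.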

kan-bayashi added a commit to kan-bayashi/espnet that referenced this issue Oct 13, 2019
Setting 'spawn' multiple times in the same process causes a RuntimeError. To
avoid this we could use force=True, but according to the following issue,
that leaks semaphores. Therefore I decided to use try-except
to avoid the RuntimeError.

See also: pytorch/pytorch#3492
unixpickle added a commit to unixpickle/reptile-gen that referenced this issue Oct 14, 2019
@TimZaman

For me the problem was that I called

print(torch.multiprocessing.get_start_method())

which fixes the start method as a side effect. After calling it, you can no longer change the start method. Minor bug, if you could even call it that.
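The standard-library documentation describes an allow_none flag for this case: get_start_method(allow_none=True) returns None when nothing has been fixed yet, instead of fixing the default. A minimal sketch, with hypothetical helper names peek_start_method and set_if_unset, using the standard-library module that torch.multiprocessing wraps:

```python
import multiprocessing as mp


def peek_start_method():
    """Return the fixed start method, or None if none has been fixed yet.

    Unlike a plain get_start_method() call, this does not fix the default.
    """
    return mp.get_start_method(allow_none=True)


def set_if_unset(method="spawn"):
    """Set the start method only if nothing has fixed one yet."""
    if peek_start_method() is None:
        mp.set_start_method(method)
    return mp.get_start_method()


if __name__ == "__main__":
    print("before:", peek_start_method())
    print("after:", set_if_unset("spawn"))
```

Logging with peek_start_method() keeps a later set_start_method call valid, which a plain get_start_method() call does not.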

@JiJunhua

yep!!!

@poedator

poedator commented Oct 30, 2022

I still see this issue in 2022. Apparently some other modules, such as Hugging Face's datasets, can trigger this error.
Here is a simple snippet that reproduces it:

import datasets
wnut = datasets.load_dataset("wnut_17")
import torch 
torch.multiprocessing.set_start_method('spawn')

result:
RuntimeError: context has already been set

Torch version == 1.12.1, Python 3.9.13, Ubuntu+WSL, conda

A quick workaround is to import torch and set start_method first, but you may consider reopening and fixing this issue.

@ad8e
Contributor

ad8e commented Apr 2, 2024

My solution: when using torchrun, use: torchrun --start-method=forkserver

@anthonyweidai

I solved both the "Cannot re-initialize CUDA in forked subprocess" and "context has already been set" errors with:

torch.multiprocessing.set_start_method("spawn", force=True)
