RuntimeError: context has already been set(multiprocessing) #3492

zhoumingyi · 2017-11-05T06:51:48Z

I use the spawn start method to share CUDA tensors between processes:

import torch
torch.multiprocessing.set_start_method("spawn")
import torch.multiprocessing as mp


def sub_processes(A, B, D, i, j, size):

    D[(j * size):((j + 1) * size), i] = torch.mul(B[:, i], A[j, i])


def task(A, B):
    size1 = A.shape
    size2 = B.shape
    D = torch.zeros([size1[0] * size2[0], size1[1]]).cuda()
    D.share_memory_()

    for i in range(1):
        processes = []
        for j in range(size1[0]):
            p = mp.Process(target=sub_processes, args=(A, B, D, i, j, size2[0]))
            p.start()
            processes.append(p)
        for p in processes:
            p.join()

    return D

A = torch.rand(3, 3).cuda()
B = torch.rand(3, 3).cuda()
C = task(A,B)
print(C)

It returns a wrong result (all zeros) and shows an error:

/usr/bin/python3.5 /home/mingyi/桌面/test/test.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 115, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 226, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.5/multiprocessing/spawn.py", line 278, in _fixup_main_from_path
    run_name="__mp_main__")
  File "/usr/lib/python3.5/runpy.py", line 254, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.5/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mingyi/桌面/test/test.py", line 2, in <module>
    torch.multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.5/multiprocessing/context.py", line 231, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set

    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
    0     0     0
[torch.cuda.FloatTensor of size 9x3 (GPU 0)]


Process finished with exit code 0


apaszke · 2017-11-05T11:12:20Z

When using spawn, you should guard the part that launches the job with if __name__ == '__main__':. The set_start_method call should also go inside that block, and everything will run fine.
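A minimal sketch of that layout, assuming the standard library multiprocessing module (torch.multiprocessing re-exports the same functions). The worker and main names are hypothetical and only for illustration. With spawn, each child re-imports the script under the name __mp_main__, so anything outside the guard, including an unguarded set_start_method call, runs again in every child.

```python
import multiprocessing as mp


def worker(index):
    # Hypothetical worker: runs in a freshly spawned interpreter.
    print(f"worker {index} running in {mp.current_process().name}")


def main():
    # Fix the start method once, in the parent process only.
    mp.set_start_method("spawn")
    processes = [mp.Process(target=worker, args=(i,)) for i in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # An exit code of 0 means the child finished without raising.
    print([p.exitcode for p in processes])


if __name__ == "__main__":
    # Children import this file as "__mp_main__", so this block is skipped there.
    main()
```

Function definitions stay at module level so the children can still import them; only the code that sets the start method and launches processes goes under the guard.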

franciscorubin · 2018-03-22T20:39:27Z

I have the same problem, and the solution provided by @apaszke doesn't work for me.

naba89 · 2018-04-19T08:52:57Z

Hi @pancho111203,

You might have other files in your project that also have an if __name__ == '__main__': block. One workaround is to call set_start_method with the force argument: set_start_method('forkserver', force=True). This solved the issue for me.

Regards
Nabarun

chrischute · 2018-05-29T23:09:56Z

Even inside the if __name__ == '__main__' block I saw the runtime error described above. Using force=True caused me to leak semaphores. This was the right solution for me:

try:
    mp.set_start_method('spawn')
except RuntimeError:
    pass

KiddoZhu · 2019-08-19T02:59:52Z

I found a solution, which is to use a context object in multiprocessing.

ctx = multiprocessing.get_context("spawn")

And then replace multiprocessing.xxx with ctx.xxx.
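A rough sketch of the context-object approach, again assuming the standard library multiprocessing module; worker and run_workers are hypothetical names. Because get_context never calls set_start_method, it cannot raise "context has already been set", whatever other modules have done before.

```python
import multiprocessing as mp


def worker(index):
    # Hypothetical worker used only for illustration.
    print(f"worker {index} done")


def run_workers(count=2):
    # Ask for a spawn context instead of calling set_start_method.
    ctx = mp.get_context("spawn")
    # ctx.Process (like ctx.Queue, ctx.Pool, ...) uses the context's start method.
    processes = [ctx.Process(target=worker, args=(i,)) for i in range(count)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return [p.exitcode for p in processes]


if __name__ == "__main__":
    print(run_workers())
```

This tends to suit library code, since the library does not have to change a process-wide setting that the application may want to control.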


TimZaman · 2019-11-10T08:18:59Z

For me the problem was that I called

print(torch.multiprocessing.get_start_method())

which itself fixes the start method to the default. After calling this, you will be unable to change the start method. Minor bug, if you could even call it that.
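In the standard library, get_start_method accepts allow_none=True, which returns None instead of fixing the default when nothing has been set yet. A small sketch that relies on this; ensure_start_method is a hypothetical helper, and I am assuming torch.multiprocessing behaves the same way since it re-exports these functions.

```python
import multiprocessing as mp


def ensure_start_method(method="spawn"):
    """Hypothetical helper: set the start method only if it is not fixed yet."""
    # allow_none=True inspects the current state without fixing the default.
    current = mp.get_start_method(allow_none=True)
    if current is None:
        mp.set_start_method(method)
        return method
    # Something already fixed the start method; report it instead of raising.
    return current


if __name__ == "__main__":
    print(ensure_start_method("spawn"))
```

The return value tells the caller which method is actually in effect, so a mismatch (for example fork when spawn was wanted) can be handled explicitly rather than surfacing as a RuntimeError.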

JiJunhua · 2020-10-23T09:24:24Z

For me the problem was that i did a
print(torch.multiprocessing.get_start_method())
which will explicitly set the start method. After calling this - you will be unable to set the change method. Minor bug, if you could even call it that.

yep!!!


poedator · 2022-10-30T09:34:43Z

I still see this issue in 2022. Apparently some other modules, namely datasets from Hugging Face, may trigger this error.
Here is a simple snippet to reproduce it:

import datasets
wnut = datasets.load_dataset("wnut_17")
import torch 
torch.multiprocessing.set_start_method('spawn')

result:
RuntimeError: context has already been set

Torch version == 1.12.1, Python 3.9.13, Ubuntu+WSL, conda

A quick workaround is to import torch and set start_method first, but you may consider reopening and fixing this issue.


ad8e · 2024-04-02T05:57:54Z

My solution: when using torchrun, use: torchrun --start-method=forkserver

anthonyweidai · 2024-04-16T02:38:15Z

For me the problem was that i did a
print(torch.multiprocessing.get_start_method())
which will explicitly set the start method. After calling this - you will be unable to set the change method. Minor bug, if you could even call it that.

I solved both "Cannot re-initialize CUDA in forked subprocess" and "context has already been set(multiprocessing)" with:

torch.multiprocessing.set_start_method("spawn", force=True)

apaszke closed this as completed Nov 5, 2017

Jongchan mentioned this issue Jul 2, 2018

Error when torch.utils.data.DataLoader processes h5 files in parallel; single-threaded works, parallel raises an error. #3415

Closed

mohamad-hasan-sohan-ajini mentioned this issue Jul 27, 2018

When using no-shared = False, the process is blocked ikostrikov/pytorch-a3c#37

Open

lcswillems mentioned this issue Nov 13, 2018

Support for MiniWorld (3D indoor environment)? lcswillems/rl-starter-files#13

Open

rusty1s mentioned this issue Mar 14, 2019

RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCGeneral.cpp:51 pyg-team/pytorch_geometric#131

Closed

ezyang mentioned this issue May 10, 2019

Better documentation / molly-guards around use of multiprocessing with spawn in Jupyter/ipython notebooks #20375

Open

kan-bayashi mentioned this issue Oct 13, 2019

Fix RuntimeError in setting spawn multiple times espnet/espnet#1267

Merged

unixpickle added a commit to unixpickle/reptile-gen that referenced this issue Oct 14, 2019

prevent semaphore leak

da1c091

Thanks to pytorch/pytorch#3492 (comment)

danieltomasz mentioned this issue Oct 24, 2019

RuntimeError('context has already been set') cbrnr/mnelab#96

Closed

sneaxiy mentioned this issue Nov 26, 2019

Training in multiprocessing.Process falied to run startup program when using GPU PaddlePaddle/Paddle#20169

Closed

refraction-ray mentioned this issue Aug 9, 2020

Possible deadlock with multiprocessing tensorflow/quantum#335

Open

cpene1 mentioned this issue Jun 9, 2021

Single GPU Multiprocessing with PaddlePaddleOCR and pytorch PaddlePaddle/PaddleOCR#3070

Closed

fijipants mentioned this issue Aug 24, 2021

Update trainer.py coqui-ai/TTS#762

Closed

edwinnglabs mentioned this issue Dec 22, 2021

PyStan Compiling Issue with Python 3.9.x uber/orbit#653

Closed

sebastientourbier added a commit to sebastientourbier/connectomemapper3 that referenced this issue Jan 13, 2022

FIX: Set force=True in multiprocessing.set_start_method('forkserver')

22ad2c7

from pytorch/pytorch#3492 (comment)

joshuacwnewton added a commit to spinalcordtoolbox/spinalcordtoolbox that referenced this issue Jan 31, 2022

test_qc_parallel.py: Use a context object instead of setting globally

17afec6

See: pytorch/pytorch#3492 (comment)

unexge mentioned this issue Sep 27, 2022

Simplify Python Pokemon service smithy-lang/smithy-rs#1773

Merged

jackievaleri added a commit to jackievaleri/BioAutoMATED that referenced this issue Nov 8, 2022

switch from forkserver multiproc to spawn

021a4d1

context does not get messed up - seems to be a known bug pytorch/pytorch#3492

hgfernan mentioned this issue Jan 4, 2023

I found a solution, which is to use a context object in multiprocessing. #91707

Closed

andreped mentioned this issue Apr 14, 2023

pytest-cov with multiprocessing results in occational deadlock andreped/GradientAccumulator#57

Closed

hvasbath mentioned this issue Jul 10, 2023

Multiple GPU usage and init tbenthompson/cutde#33

Closed

Sean1572 mentioned this issue Jul 23, 2023

Moves the multi-processing stuff out of main functions UCSD-E4E/acoustic-multiclass-training#90

Merged

YibinLiu666 mentioned this issue Jul 24, 2023

FEAT: Support nccl in collective communication xorbitsai/xoscar#52

Closed


ly19965 mentioned this issue Oct 24, 2023

RuntimeError: context has already been set modelscope/facechain#297

Closed

JesseSenior mentioned this issue Dec 23, 2023

Fix RuntimeError: context has already been set guofei9987/scikit-opt#223

Merged

claesnl mentioned this issue Sep 17, 2024

Getting context already set error CAAI/rh-node#42

Open
