
multiprocessing.Process generates FileNotFoundError when argument isn't explicitly referenced #94765

Open
JZerf opened this issue Jul 11, 2022 · 14 comments
Labels
topic-multiprocessing type-bug An unexpected behavior, bug, or error

Comments

@JZerf

JZerf commented Jul 11, 2022

Bug report
This is a continuation of the possible bug reported in issue #82236, which was closed because DonnyBrown, the submitter, didn't provide enough information.

DonnyBrown was getting a FileNotFoundError when starting a process with multiprocessing.Process that uses an argument that doesn't have an explicit reference. I'm able to reproduce the same error using the test code DonnyBrown provided in that issue on Ubuntu Desktop LTS 22.04 x86-64 with CPython 3.10.4. @iritkatriel mentioned that they were unable to reproduce the error on Windows 10 with Python 3.10.

I can also reproduce the error using this slightly modified/simpler version of DonnyBrown's test code that I have been testing:

import multiprocessing

def demo(argument):
    print(argument)

if __name__=="__main__":
    multiprocessing.set_start_method("spawn") # Changing this to "fork" (on platforms where it is
                                              # available) can also cause the below code to work.


    process=multiprocessing.Process(target=demo, args=[multiprocessing.Value("i", 0)]) # FAILS

    #process=multiprocessing.Process(target=demo, args=[0])                            # WORKS

    #reference_To_Number=multiprocessing.Value("i", 0)                                 # WORKS
    #process=multiprocessing.Process(target=demo, args=[reference_To_Number])


    process.start()
    process.join()

The traceback I get with the above code is:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/usr/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

The above code can be made to work on my test system by making any of the following changes:

  • Change the process start method to "fork" instead.
  • Change the process argument to a simple integer instead of a multiprocessing.Value.
  • Assign the multiprocessing.Value to a variable and change the process argument to use the variable.

I'm not a Python expert, so maybe this is the expected behavior when passing a multiprocessing.Value directly to a spawned process. But it does seem odd that making any of the above changes causes the code to work, or that (based on @iritkatriel's success with DonnyBrown's test code) running it on Windows 10, which also defaults to the "spawn" start method, will probably cause the code to work.

Your environment

  • CPython versions tested on: 3.10.4
  • Operating system and architecture: Linux, Ubuntu Desktop LTS 22.04, x86-64
@JZerf JZerf added the type-bug An unexpected behavior, bug, or error label Jul 11, 2022
@akulakov
Contributor

Confirmed the issue on Python 3.9 and 3.12 on macOS 11.5.2.

@curonny

curonny commented Nov 22, 2022

Confirmed the issue on Python 3.8 on macOS 13.0.1.

@whitedemong

Confirmed the issue on Python 3.8.16 on macOS 13.1 (22C65).

@david-thrower

david-thrower commented Apr 22, 2023

Same here on Ubuntu 22.04 with Python 3.10.6. The file it is looking for has 777 permissions: specifically [working_directory]/lib/tom-select/tom-select.css, a file created by pyvis v0.3.1 during a past run of the same script. If I rm -rf lib and run it again, I get the error [Errno 39] Directory not empty: 'vis-9.0.4'. This code was previously stable without a lock context (it ran thousands of times without a problem). When I nest it in a function, it still works; when I call that function as the target of a Process, I get these errors.

These errors were also very opaque: they required a chain of try: ..., except Exception as err: ... print(err) clauses to get them to print to the console. I presume there is an issue with the piping of stderr in this context as well.

Additionally, running the process with the "fork" start method does not resolve the issue (same error). The workaround of using args=[multiprocessing.Value(...)] instead of args=(0) throws TypeError: this type has no size:

Traceback (most recent call last):
  File "/[redacted]/my-script.py", line 346, in <module>
    processes_list = [ctx.Process(target=objective,
  File "/[redacted]/my-script.py", line 347, in <listcomp>
    args=[Value("trial", 0)]
  File "/usr/lib/python3.10/multiprocessing/context.py", line 135, in Value
    return Value(typecode_or_type, *args, lock=lock,
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 74, in Value
    obj = RawValue(typecode_or_type, *args)
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 49, in RawValue
    obj = _new_value(type_)
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 40, in _new_value
    size = ctypes.sizeof(type_)
TypeError: this type has no size

As a (not ideal) workaround, I ultimately made the offending process operate under a subprocess.run() context as a parameterized script and used a Process() as a proxy between the main script and the actual process. That "worked for now".
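As an aside, the TypeError above looks like a separate problem from the original bug: the first argument to multiprocessing.Value must be a ctypes type or a one-character array typecode, so an arbitrary string such as "trial" has no ctypes size. A minimal sketch of valid usage:

```python
import multiprocessing

# "i" is the array typecode for a C int ("d" would be a C double).
# An arbitrary string such as "trial" is not a valid typecode and
# raises "TypeError: this type has no size".
counter = multiprocessing.Value("i", 0)
with counter.get_lock():
    counter.value += 1
print(counter.value)  # prints 1
```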

@RinatV

RinatV commented Jun 3, 2023

https://superfastpython.com/filenotfounderror-multiprocessing-python/

It's explained here.

A simple time.sleep(1) after p.start() helped me.

@Luferov

Luferov commented Aug 11, 2023

Confirmed the issue on Python 3.9.17 on macOS 14.

@Starbuck5

I was having a similar issue sharing concurrency primitives (a multiprocessing.Queue in my case) across processes when using the spawn start method.

I believe this is happening because of reference counts / garbage collection. If there's a possibility the object gets deleted by the main process before or while it is being shared, the backing file isn't around when the other process looks for it, and the result is a FileNotFoundError. This explains why putting the object in a variable (preventing it from being deallocated) works while passing it inline as a process argument does not.

The object could also get deleted if the main process ends too soon, as described in this article: https://superfastpython.com/filenotfounderror-multiprocessing-python/
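To illustrate the refcount theory, here is a minimal sketch (assuming a POSIX platform with the "spawn" start method): binding the Value to a name that stays alive until join() returns should keep the underlying semaphore from being unlinked while the child is still unpickling it.

```python
import multiprocessing

def demo(argument):
    # The child rebuilds the Value from its pickled state and reads it.
    print(argument.value)

if __name__ == "__main__":
    multiprocessing.set_start_method("spawn")

    # Binding the Value to a name keeps its refcount above zero until
    # join() returns, so the backing semaphore still exists when the
    # child process unpickles it.
    shared = multiprocessing.Value("i", 42)
    process = multiprocessing.Process(target=demo, args=[shared])
    process.start()
    process.join()
```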

@haimat

haimat commented Sep 15, 2023

We can reproduce this problem with the following piece of code using Python 3.8.10 on Ubuntu Linux 20.04:

import multiprocessing as mp

def demo(argument):
    print(argument)

def create_process():
    arg = mp.Value("i", 0)
    return mp.Process(target=demo, args=[arg])

if __name__ == "__main__":
    mp.set_start_method("spawn")  # fails
    # mp.set_start_method("fork")  # works
    # mp.set_start_method("forkserver")  # also fails

    process = create_process()
    process.start()
    process.join()

This leads to the same stack trace as in the OP. The issue does not seem to be related to the garbage collector, as disabling it before creating the process and re-enabling it after join() does not help either.

Is this a bug in CPython, or are we supposed to perform these steps in a different way?

@Starbuck5

Starbuck5 commented Sep 19, 2023

This leads to the same stack trace as in the OP. The issue does not seem to be related to the garbage collector, as disabling it before creating the process and re-enabling it after join() does not help either.

The object gets deallocated anyway because its reference count reaches 0. That isn't handled by the cyclic garbage collector, I think, which is why toggling gc makes no difference.

After your create_process function returns, arg gets deallocated. If you create arg in the if __name__ == "__main__": block and pass it into create_process, I think the issue will be solved.
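A minimal sketch of that suggestion, adapting @haimat's snippet (the changed detail is that create_process now takes arg as a parameter instead of creating it):

```python
import multiprocessing as mp

def demo(argument):
    print(argument.value)

def create_process(arg):
    # arg is owned by the caller, so it is not deallocated when this
    # function returns.
    return mp.Process(target=demo, args=[arg])

if __name__ == "__main__":
    mp.set_start_method("spawn")
    arg = mp.Value("i", 0)  # the reference lives in __main__ until join()
    process = create_process(arg)
    process.start()
    process.join()
```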

@ziegenbalg

Hitting this problem as well.

Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/usr/lib64/python3.10/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib64/python3.10/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory

@zuliani99

zuliani99 commented Mar 21, 2024

In my conda environment I'm using python=3.12.2; however, the warning below refers to the multiprocessing module of Python 3.8.

I've already double-checked the Python version using python --version, and I'm using the latest version.

File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory

To be specific, I'm using PyTorch multiprocessing to spawn multiple processes for multi-GPU training.
This is the issue I'm facing.

@Chidu2000

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/Downloads/Pensieve-DRL-Master-thesis/pensieve-pytorch/hyp_param_test.py", line 114, in central_agent
    s_batch, a_batch, r_batch, terminal, info, net_env = exp_queues[i].get()  # for all the 3 agents, so a vector of size 3 (i.e. s, a, r_batch)
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/multiprocessing/reductions.py", line 495, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 86, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory

Is this issue on all Python 3.x versions?

@yeonfish6040

Confirmed the issue on Python 3.10 on macOS 14.4.1.

@LSimon95

https://superfastpython.com/filenotfounderror-multiprocessing-python/

It's explained here.

A simple time.sleep(1) after p.start() helped me.

The time.sleep workaround works on Ubuntu 22.04.3 with Python 3.10, but this is still an issue.
