Possible deadlock on sys.stdout/stderr when combining multiprocessing with threads #72568

Closed
Hadhoke mannequin opened this issue Oct 6, 2016 · 4 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-multiprocessing type-bug An unexpected behavior, bug, or error

Comments

@Hadhoke
Mannequin

Hadhoke mannequin commented Oct 6, 2016

BPO 28382
Nosy @pitrou, @applio, @hadhoke

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-10-06.23:20:37.260>
labels = ['3.7', 'type-bug', 'library']
title = 'Possible deadlock on sys.stdout/stderr when combining multiprocessing with threads'
updated_at = <Date 2017-07-23.12:16:39.930>
user = 'https://github.com/Hadhoke'

bugs.python.org fields:

activity = <Date 2017-07-23.12:16:39.930>
actor = 'pitrou'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2016-10-06.23:20:37.260>
creator = 'Hadhoke'
dependencies = []
files = []
hgrepos = []
issue_num = 28382
keywords = []
message_count = 3.0
messages = ['278221', '298875', '298900']
nosy_count = 3.0
nosy_names = ['pitrou', 'davin', 'Hadhoke']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue28382'
versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

@Hadhoke
Mannequin Author

Hadhoke mannequin commented Oct 6, 2016

I am launching a process inside a pool worker, using the multiprocessing module.
After a while, a deadlock happens when I try to join the process.

Here is a simple version of the code:

import sys, time, multiprocessing
from multiprocessing.pool import ThreadPool

def main():
    # Launch 8 workers
    pool = ThreadPool(8)
    it = pool.imap(run, range(500))
    while True:
        try:
            it.next()
        except StopIteration:
            break

def run(value):
    # Each worker launches its own Process
    process = multiprocessing.Process(target=run_and_might_segfault, args=(value,))
    process.start()

    while process.is_alive():
        sys.stdout.write('.')
        sys.stdout.flush()
        time.sleep(0.1)

    # Will never join after a while, because of a mystery deadlock
    process.join()

def run_and_might_segfault(value):
    print(value)

if __name__ == '__main__':
    main()

And here is a possible output:

~ python m.py
..0
1
........8
.9
.......10
......11
........12
13
........14
........16
........................................................................................
As you can see, process.is_alive() is always true after a few iterations, so the process never joins.

If I press CTRL-C, I get this stack trace:

Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 680, in next
    item = self._items.popleft()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "m.py", line 30, in <module>
    main()
  File "m.py", line 9, in main
    it.next()
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 684, in next
    self._cond.wait(timeout)
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 293, in wait
    waiter.acquire()
KeyboardInterrupt

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/popen_fork.py", line 29, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt

I am using Python 3.5.1 on macOS; I also tried 3.5.2 and hit the same issue.
I get the same result on Debian.
With Python 2.7 it works fine. Maybe this is a Python 3.5-only issue?

Here is the link to the Stack Overflow question:
http://stackoverflow.com/questions/39884898/large-amount-of-multiprocessing-process-causing-deadlock

@Hadhoke Hadhoke mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 6, 2016
@pitrou
Member

pitrou commented Jul 22, 2017

Ok, after a bit of diagnosing, the issue is combining multi-threading with use of fork(). The problem is that file objects (such as sys.stdout) have locks, and those locks may be held by another thread at the exact point where fork() happens, in which case the child will block forever when trying to take the lock again.

This is mostly a duplicate of bpo-6721, but perhaps multiprocessing could at least improve things for sys.stdout and sys.stderr (though I'm not sure how).

This is also compounded by the fact that Process._bootstrap() flushes the standard streams at the end.
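
To illustrate the mechanism described above, here is a minimal POSIX-only sketch. It uses a plain `threading.Lock` as a stand-in for the internal lock of `sys.stdout` (that substitution is an assumption made for illustration), and the helper name `child_can_take_lock_after_fork` is hypothetical. One thread holds the lock while the main thread calls `os.fork()`; the child then tries to take its copy of the lock with a timeout.

```python
import os
import threading


def child_can_take_lock_after_fork(timeout=1.0):
    """Fork while another thread holds a lock; report whether the child could acquire it."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            held.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()  # the lock is now held by the other thread

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so the thread that
        # "owns" the copied lock is gone and can never release it.
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then clean up the holder thread.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print(child_can_take_lock_after_fork())
```

On a POSIX system this should print `False`: the child times out on a lock that nobody in the child can release. Without the timeout, that would be the hang seen in the report.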

@pitrou pitrou added the 3.7 (EOL) end of life label Jul 22, 2017
@pitrou pitrou changed the title from "Possible deadlock after many multiprocessing.Process are launch" to "Possible deadlock on sys.stdout/stderr when combining multiprocessing with threads" Jul 22, 2017
@pitrou
Member

pitrou commented Jul 23, 2017

Also, a reliable fix is to use the "forkserver" (or "spawn", but it is much slower) method:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
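
As a rough sketch of that suggestion, the `run` function from the report could take a multiprocessing context and create its child through it. The helper `pick_start_method`, the extra `ctx` parameter, and the shortened name `child` are assumptions introduced for this example and are not part of the original script.

```python
import multiprocessing
import sys
import time


def pick_start_method():
    """Prefer "forkserver" where the platform offers it, else fall back to "spawn"."""
    available = multiprocessing.get_all_start_methods()
    return "forkserver" if "forkserver" in available else "spawn"


def child(value):
    print(value)


def run(value, ctx):
    # ctx.Process does not fork the multi-threaded parent directly, so the
    # child does not inherit a stdout lock held by another worker thread.
    process = ctx.Process(target=child, args=(value,))
    process.start()
    while process.is_alive():
        sys.stdout.write('.')
        sys.stdout.flush()
        time.sleep(0.1)
    process.join()
    return process.exitcode


if __name__ == "__main__":
    from multiprocessing.pool import ThreadPool

    ctx = multiprocessing.get_context(pick_start_method())
    with ThreadPool(8) as pool:
        # Threads share memory, so a lambda is fine here (nothing is pickled).
        for _ in pool.imap(lambda v: run(v, ctx), range(20)):
            pass
```

Using a context object keeps the choice local to this code path instead of changing the start method for the whole program. The `if __name__ == "__main__":` guard matters with these start methods because the child imports the main module.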

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@gpshead
Member

gpshead commented Dec 13, 2022

Infeasible. Workaround: never use the fork start method (still the default on POSIX in 3.11) in a process that might have threads. Use spawn or forkserver.
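
One way to apply this workaround program-wide is to set the default start method once, early in the main module. This is only a sketch: the helper name `use_thread_safe_start_method` is hypothetical, and `force=True` is used on the assumption that a default may already have been fixed elsewhere.

```python
import multiprocessing


def use_thread_safe_start_method(method="spawn"):
    """Set the process-wide default start method and return the active one."""
    # force=True replaces a default that may already have been chosen.
    multiprocessing.set_start_method(method, force=True)
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    print(use_thread_safe_start_method())
```

Every later `multiprocessing.Process(...)` in the program then uses the chosen method, so no thread-bearing parent is ever forked by accident.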

@gpshead gpshead closed this as not planned Dec 13, 2022