Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threads within multiprocessing Process terminate early #63166

Closed
pietvo mannequin opened this issue Sep 8, 2013 · 17 comments
Closed

Threads within multiprocessing Process terminate early #63166

pietvo mannequin opened this issue Sep 8, 2013 · 17 comments
Labels
3.7 stdlib type-feature

Comments

@pietvo
Copy link
Mannequin

@pietvo pietvo mannequin commented Sep 8, 2013

BPO 18966
Nosy @tim-one, @pitrou, @bitdancer, @eryksun, @applio
PRs
  • #3111
  • Files
  • processthread.py: Test program
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-08-16.18:54:02.032>
    created_at = <Date 2013-09-08.01:20:27.821>
    labels = ['3.7', 'type-feature', 'library']
    title = 'Threads within multiprocessing Process terminate early'
    updated_at = <Date 2017-08-16.18:54:02.031>
    user = 'https://bugs.python.org/pietvo'

    bugs.python.org fields:

    activity = <Date 2017-08-16.18:54:02.031>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-08-16.18:54:02.032>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2013-09-08.01:20:27.821>
    creator = 'pietvo'
    dependencies = []
    files = ['31658']
    hgrepos = []
    issue_num = 18966
    keywords = []
    message_count = 17.0
    messages = ['197210', '197293', '270338', '270343', '270351', '270352', '270353', '270355', '270375', '270382', '270415', '270419', '270421', '270422', '270433', '300385', '300386']
    nosy_count = 8.0
    nosy_names = ['tim.peters', 'pitrou', 'r.david.murray', 'pietvo', 'neologix', 'sbt', 'eryksun', 'davin']
    pr_nums = ['3111']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue18966'
    versions = ['Python 3.7']

    @pietvo
    Copy link
    Mannequin Author

    @pietvo pietvo mannequin commented Sep 8, 2013

    When a process started as a multiprocessing Process spawns a thread, it doesn't wait until the thread terminates. It terminates the thread early when the main thread of the process terminates, as if the thread would be daemonic (it isn't).

    It may sound a bit weird to start a Thread within multiprocessing, but it isn't prohibited. Neither is this behavior documented.

    In the attached program the thread doesn't complete. However when the mythread.join() statement is uncommented it does run to completion.

    @pietvo pietvo mannequin added type-bug stdlib labels Sep 8, 2013
    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Sep 8, 2013

    That's because multiprocessing exits child processes with os._exit(), not sys.exit().

    The fix would be trivial (call threading._shutdown() before os._exit()), but I don't know if that's something we want to do. After all there are many things in the Python shutdown procedure that we may want to similarly replicate in multiprocessing children, such as calling atexit handlers. This screams for a more general solution, IMHO.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Jul 13, 2016

    This came up again today as bug 27508. In the absence of "fixing it", we should add docs to multiprocessing explaining the high-level consequences of skipping "normal" exit processing (BTW, I'm unclear on why it's skipped).

    I've certainly mixed threads with multiprocessing too, but I always explicitly .join() my threads so never noticed this. It's sure a puzzle when it happens ;-)

    @eryksun
    Copy link
    Contributor

    @eryksun eryksun commented Jul 13, 2016

    In 3.4+ it works correctly with the "spawn" start method. This uses multiprocessing.spawn.spawn_main, which exits the child via sys.exit(exitcode). "fork" and "forkserver" exit the child via os._exit(code), respectively in multiprocessing.popen_fork.Popen._launch and multiprocessing.forkserver.main.

    @applio
    Copy link
    Member

    @applio applio commented Jul 14, 2016

    It is a general rule that when a process terminates, it likewise terminates all its threads (unless a thread has been explicitly detached). How it goes about doing so is complicated.

    Remember that POSIX threads have no concept of parent/child among themselves and all threads are viewed as a single pool. The section "No parents, no children" at
    http://www.ibm.com/developerworks/library/l-posix1/ offers us motivation for why waiting on a pthread should be explicitly requested and not assumed as a default behavior.

    There are numerous differences between what a UNIX-style process and a win32 process does at termination. Though an older post from Microsoft, a strong sense of how complicated the process-termination-begets-thread-termination truly is can be had from reading https://blogs.msdn.microsoft.com/oldnewthing/20070503-00/?p=27003 which also helps reinforce the sentiment above (needs explicit instructions on what to do, no general solution can exist). Whereas the prior provided some sense of motivation, this link walks us through ugly complications and consequences that arise.

    The short of it is that the current use of os._exit() is most appropriate in multiprocessing. Threads should be signaled that the process is terminating but we are not generally expected to wait on those threads.

    These and many other reference-worthy links help support the call for atexit-like functionality to be exposed on multiprocessing.Process. There have been multiple issues opened on the bug tracker ultimately requesting this enhancement (do a search for multiprocessing and atexit). I think it's a very sensible enhancement (Antoine did too and perhaps still does) and worth taking the time to pursue. As an aside, I wonder if an equivalent to pthread_cleanup_push should also be exposed on threading.Thread.

    When it comes to documentation, I am of two minds. There seem to be an increasing number of people coming to Python without much prior exposure to the concepts of threads and processes and so it would be wrong for us to ignore this reality. On the flip side, attempting to convey all the core concepts of threads and processes and how they interact would result in a large documentation effort that ultimately few people would eagerly read to completion. Adding a one-sentence caveat hiding somewhere in the docs won't do much to help. Given this topic and a few other issues that have come up very recently, I suggest that a concise paragraph be added on the topic of using threads and processes together -- likely placed at the beginning of the docs on the Process class. I think I'm up for taking a crack at that but I'd very much appreciate critical eyes to review it with me.

    Per Eryk's point about the difference in multiprocessing's behavior when using spawn vs. fork, the explanation for why it's done that way is also described in the DeveloperWorks article I mentioned above.

    Finally, per the original post from pietvo for future readers, not only is it *not* weird to start a Thread within a Process, it's a popular and useful technique.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Jul 14, 2016

    Devin, a primary point of threading.py is to provide a sane alternative to the cross-platform thread mess. None of these reports are about making it easier for threads to go away "by magic" when the process ends. It's the contrary: they're asking for multiprocessing to respect threading.py's default behavior of making it the programmer's responsibility to shut down their threads cleanly. "Shut down your threads, or we refuse to let the process end."

    It doesn't matter that native OS threads may behave differently. threading.py very deliberately makes _its_ thread abstraction "non-daemonic" by default, and advertises that behavior for all platforms. So it's at best surprising that threading.Thread's default semantics get turned inside out when multiprocessing creates a process. I still see no reason to believe that's "a feature".

    As to docs, if it boils down to the difference between sys.exit() and os._exit(), then those are the places to put details, and multiprocessing docs just need to point out when a created process will use one or the other.

    As is, the docs don't contain the slightest clue anywhere that a threading.Thread may violate its own docs (with respect to process-exit behavior) when created by a process launched by multiprocessing (or also by a concurrent.futures.ProcessPoolExecutor? I didn't check).

    @eryksun
    Copy link
    Contributor

    @eryksun eryksun commented Jul 14, 2016

    Per Eryk's point about the difference in multiprocessing's behavior
    when using spawn vs. fork, the explanation for why it's done that
    way is also described in the DeveloperWorks article I mentioned above.

    Please spell this out for me. Why can't the "fork" and "forkserver" variations call sys.exit(), and thus Py_Finalize? I don't see why it's OK to call the _shutdown function in Lib/threading.py from a spawned child but not a forked child.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Jul 14, 2016

    About ""No parents, no children", that's fine so far as it goes. But Python isn't C, a threading.Thread is not a POSIX thread, and threading.py _does_ have a concept of "the main thread". There's no conceptual problem _in Python_ with saying "the main thread" waits to .join() other non-daemon threading.Threads at process exit. No parent/child relationships are implied by that either - it's just the concept that one thread is distinguished.

    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Jul 14, 2016

    I agree with Tim. Regardless of what OS threads do, Python tries to enforce predictable semantics of its own. There's no reason (apart from historical baggage) to not join Python threads (and only Python threads, of course, not other OS threads) at the shutdown of a child process.

    I don't exactly remember why using os._exit() rather than sys.exit() is required in child processes. Presumably it is because we don't want the child to clobber any resources shared with the parent (open files?). This doesn't have to be a binary thing, though: it may as well be os._exit() + a bunch of cleanup steps we know are safe to perform.

    @pitrou pitrou added type-feature and removed type-bug labels Jul 14, 2016
    @neologix
    Copy link
    Mannequin

    @neologix neologix mannequin commented Jul 14, 2016

    One reason for not calling sys.exit() is because on Linux, the default
    implementation uses fork(), hence the address space in the chilren is
    a clone of the parent: so all atexit handlers, for example, would be
    called multiple times.
    There's also the problem that fork() isn't MT-safe, making the
    probability of screwups/deadlocks in various destructors/stack
    unwinding greater.

    @applio
    Copy link
    Member

    @applio applio commented Jul 14, 2016

    Tim: Totally agreed about threading.Thread not being a POSIX thread. It was not my intent to suggest that they were equivalent -- apologies for the confusion.

    Instead I was attempting to describe a mentality of processes and their common behavior across multiple platforms at termination. The behavior of child processes via multiprocessing currently appears to follow this common mentality of signal the threads then exit quickly. (To avoid confusion, I am making an observation here.)

    Whereas threading.Thread is attempting to provide something homogeneous across platforms, achieving a similar goal in multiprocessing.Process is complicated by the concepts of fork vs. spawn and their availability on various OSes (a source of real confusion for some). This further opens the question of what should the mentality be for multiprocessing.Process? The notion that a process can die in such a way that not all of its threads were given time to clean up does not strike me as a foreign concept. The notion that a threading.Thread should always be (or at least be attempted to be) joined makes sense. The notion of categorically refusing to let a process end perhaps overreaches in certain situations.

    I believe the more general solution exists in offering atexit handlers on multiprocessing.Process.

    @eryksun
    Copy link
    Contributor

    @eryksun eryksun commented Jul 14, 2016

    all atexit handlers, for example, would be called multiple times.

    Davin is (I think) proposing a multiprocessing atexit facility, which can be used to ensure threading._shutdown is called. But could Python's regular atexit handling be reset in the child, allowing Py_Finalize to be called? In other words, can atexit can be integrated into the PyOS_AfterFork (Modules/signalmodule.c) sequence? multiprocessing could set a sys flag that forces atexit to clear its registered handlers, and for Py_AtExit, reset the static nexitfuncs variable in Python/pylifecycle.c. Or is that just opening a can of worms that will cause Py_Finalize to crash in various scenarios?

    There's also the problem that fork() isn't MT-safe

    This issue is about joining Python threads created in the child, which has a clean slate via PyOS_AfterFork, PyEval_ReInitThreads (Python/ceval.c), and threading._after_fork.

    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Jul 14, 2016

    As far as muliprocessing's "mentality" goes, it aims to provide the *same* API as Threading, so it is logical that it should preserve threading's behavior with respect to child threads in a process, rather than violating threading's model. Anything else is counter-intuitive to a python programmer, as demonstrated by this issue and Tim's comments :)

    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Jul 14, 2016

    Note, however, that fixing this will be a backward compatibility issue, since there are doubtless programs relying on this behavior, probably mostly unintentionally.

    @tim-one
    Copy link
    Member

    @tim-one tim-one commented Jul 14, 2016

    About: "The notion of categorically refusing to let a process end perhaps overreaches in certain situations." threading.py addressed that all along: if the programmer _wants_ the process to exit without waiting for a particular threading.Thread, that's fine, they ask the Thread constructor for a "daemon" thread. Whether a threading.Thread does or doesn't prevent process exit before it's .join()'ed has always been the programmer's choice. Python never attempted to guess their intent (except for "the main thread", which is necessarily non-daemonic). That's why it's especially surprising that multiprocessing can silently overrule what had always been an explicit choice about process-exit threading.Thread behavior.

    About compatibility, yup, that's potentially painful. I will note that compability was already broken on Windows with no apparent angst, or subsequent complaints (the program in bpo-27508 is an example: "runs forever" under 3.5.2 but "ends very quickly" under 2.7.11; "runs forever" is what the programmer wanted, matching how they expected non-daemon threading.Threads to work).

    @pitrou pitrou added the 3.7 label Aug 16, 2017
    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Aug 16, 2017

    New changeset ee84a60 by Antoine Pitrou in branch 'master':
    bpo-18966: non-daemonic threads created by a multiprocessing.Process should be joined on exit (bpo-3111)
    ee84a60

    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Aug 16, 2017

    This is now fixed in git master. Thank you for the report!

    @pitrou pitrou closed this as completed Aug 16, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 stdlib type-feature
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants