Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left #115533

Closed
intgr opened this issue Feb 15, 2024 · 10 comments
Labels
3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@intgr
Copy link

intgr commented Feb 15, 2024

Bug report

Bug description:

This is related to #113964, but trying to address the bigger problem of what "interpreter shutdown" is, and trying to make a stronger case for restoring the old behavior.

TL;DR: This was an undocumented change. The old behavior was useful. Documentation is contradictory. Users are struggling and are recommended to downgrade.

Undocumented change

In Python 3.11 and previous versions, daemon=False threads could spawn new threads even after the main thread has exited.

This changed in Python 3.12, which throws "RuntimeError: can't create new thread at interpreter shutdown" when a daemon=False thread tries to spawn new threads after the main thread has exited.

I could not find this change to be documented in 3.12 release notes, it appears to be an unintentional change.

Reproducer

"""Python 3.11: Successfully spawns 20 subthreads. Python 3.12: RuntimeError."""
from threading import Thread, current_thread
from time import sleep

def subthread():
    print(f"Thread {current_thread().name} spawned...")

def thread():
    for i in range(20):
        Thread(target=subthread, daemon=False).start()
        sleep(0.001)

Thread(target=thread, daemon=False).start()

Old behavior was useful

While not join()ing the launched threads is considered an antipattern, at the very least the old behavior was useful for throw-away scripts that launch a bunch of threads, and have Python exit automatically once all work was finished.

Documentation is contradictory

The documentation says this about daemon threads:

A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.

I think a useful interpretation of "exits when only daemon threads are left" also implies "the Python program starts shutting down when only daemon threads are left", and does not shut down before. Because what else could "exits" mean if not "shuts down"?

Although this is in conflict with the definition of "interpreter shutdown" that states:

The main reason for interpreter shutdown is that the __main__ module or the script being run has finished executing.

This interpetation is also not being respected by sys.is_finalizing() API. This is supposed to "Return True if the Python interpreter is shutting down", but it returns False in this situation even after the main thread has exited and thread spawning fails: #114570. That's another indication that this change was unintentional.

Users are struggling

Python has a reputation for breakage across version upgrades. I hope there is motivation among CPython developers to change that. I think such unintentional changes that break users, should be treated as regressions to be fixed.

There are three posts on StackOverflow [1] [2] [3], where downgrading Python is the dominant "solution". And when you have threads spawning more threads, the "proper fix" is often not a simple matter of adding a Thread.join() somewhere, but may require re-architecting the whole program to keep track of what all threads are doing.

Related issues: #113964, #114570.

CPython versions tested on:

3.12.2, 3.11.7

Operating systems tested on:

Linux, macOS

@intgr intgr added the type-bug An unexpected behavior, bug, or error label Feb 15, 2024
@ronaldoussoren ronaldoussoren added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 16, 2024
@gpshead
Copy link
Member

gpshead commented Feb 22, 2024

+cc: @ericsnowcurrently @serhiy-storchaka @vstinner as FYI about this 3.12 behavior change.

@gpshead gpshead added the 3.12 bugs and security fixes label Feb 22, 2024
@gpshead
Copy link
Member

gpshead commented Feb 22, 2024

A compatible workaround for application code running into this is to potentially add a call to a function something like https://gist.github.com/gpshead/48389eac0fd5c3fbc7ed8c5f4667be41 at the end of their main program.

Not pretty, but it avoids re-architecting. (EDIT: do not use this, see vstinner's much simpler thing below)

@gpshead gpshead changed the title Python 3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left 3.12 regression: interpreter shouldn't shut down when there are non-daemon threads left Feb 22, 2024
@vstinner
Copy link
Member

I'm not sure of the purpose of this issue.

it appears to be an unintentional change.

The change is intentional. Spawning threads can cause crashes. It was changed to make Python safer and more reliable.

I could not find this change to be documented in 3.12 release notes,

Oh, that's bad. It should be documented.

The documentation says this about daemon threads:

Your example doesn't use daemon threads, so it's unrelated.

Python calls threading._shutdown() which waits for all threads: call their join() method. But that's unrelated to the change on disallowing to spawn new threads during the Python finalization.

The main reason for interpreter shutdown is that the main module or the script being run has finished executing.

In your example, Python is finalized after it executes the last line of your script. That's correct.

the "proper fix" is often not a simple matter of adding a Thread.join() somewhere, but may require re-architecting the whole program to keep track of what all threads are doing.

You can add these lines at the end of your program. It's generic and doesn't require to re-architecture anything:

import threading
for thread in threading.enumerate():
    if thread.daemon or thread is threading.current_thread():
        continue
    thread.join()

@intgr
Copy link
Author

intgr commented Feb 22, 2024

The change is intentional.

Where was this change discussed and decided?

Spawning threads can cause crashes. It was changed to make Python safer and more reliable.

Are you saying that it's impossible to safely and reliably delay interpreter shutdown until all non-daemon threads have exited? Doubt! gpshead above seems to have done it.

(Or are you saying that spawning threads in general can cause crashes? Then the change would be unrelated)

The documentation says this about daemon threads:

Your example doesn't use daemon threads, so it's unrelated.

It is very much related. It's the part of documentation that explains the difference between daemon threads and non-daemon threads. By documenting what happens with daemon=True, it also documents what doesn't happen with daemon=False. There isn't a third possible value to a boolean attribute.

In your example, Python is finalized after it executes the last line of your script. That's correct.

Are you basing your claim of "correctness" on the quote from documentation? It's possible for documentation to be incorrect, just as it's possible for code to be incorrect.

CPython didn't follow documentation for ages, until version 3.12. I think it's a fair argument that documentation should be adapted instead, particularly as it's in apparent conflict with the description of non-daemon threads.

@gpshead
Copy link
Member

gpshead commented Feb 22, 2024

FWIW @vstinner "I'm not sure the purpose of this issue" reads as dismissive of a user reporting a bug even if I doubt you meant it that way. It is a valid bug, what worked before no longer works in 3.12 - without any notice or explanation from us being the problem if this behavior change is what we want.

@zenbones
Copy link

zenbones commented Feb 22, 2024

You may say that basing Python's thread model on other languages is not relevant, but the current documentation accepts the meaning of daemon threads current in other languages with thread control. I'd say that Python's thread model is borrowed from Java, as are many of its more recent and useful additions to the standard libraries. This is not a bad thing. Love or hate Java it has a great set of standard libraries. In Java, when I mark a thread as daemon, the JVM does not exit until all daemon threads have finished. And I can continue to spawn new threads so long as the JVM has not exited. This is neither dangerous nor impossibly difficult. Forcing the user to keep track of every possible thread with some top level hack, to fulfill the meaning of daemon threads, is a situation which is unsafe and unclear.

You could change the documentation to clearly state that, in Python, a daemon thread will not prevent the interpreter from shutting down. I think that would be wrong and a step backwards in Python's evolution as a useful language. But at least it would clear and correct. Or, you could change python to properly track daemon threads and not shut down until they've all completed. This does naturally create a race condition between allowing the interpreter to finalize and client code which might be marking a thread in the daemon state. Fortunately, there are mutex locks capable of resolving that race condition in a clear manner. Use one. The error should not be delivered on creation of any thread during the lifetime of a daemon thread, but rather on the attempt to create a thread or mark a thread as daemon after the interpreter has determined no demon threads are running and begun finalization. And it's the interpreter's responsibility to run the checking loop, possibly on an internal daemon thread...

@vstinner
Copy link
Member

FWIW @vstinner "I'm not sure the purpose of this issue" reads as dismissive of a user reporting a bug even if I doubt you meant it that way.

Oh, I'm sorry if it can be read this way. It might be a translation issue, English is not my first language.

What I mean is that I'm not sure which actions should be taken to solve thie issue. We might have to clarify which tasks should be done to enhance the situation. Apparently, documenting the change is a first step.

@vstinner
Copy link
Member

Where was this change discussed and decided?

The change was done in issue gh-104690 with the change commit ce558e6: PR gh-104826. It was discussed there.

Are you saying that it's impossible to safely and reliably delay interpreter shutdown until all non-daemon threads have exited?

See issue gh-104690 for a concrete example of crash caused by a thread spawned during the Python finalization.

"Python exit" or "Python finalization" starts by waiting until threads complete. If new threads are spawned while Python is waiting for the existing threads to complete, bad things can happen.

In Python 3.11 and older it was allowed. But it was decided to change this in Python 3.12.

I didn't say that it's impossible to support this case. Python is a project maintained by volunteers. It's all about trade-offs. With our limited resources, we decided to make the finalization simpler to make it more reliable.

Apparently, there is a way to still support your use case, but it requires some changes to your code.

@gvanrossum
Copy link
Member

Closing as a duplicate of gh-113964.

(Also, is there a confusion above about the meaning of "daemon" threads vs. non-daemon threads? The intention, from Python's earliest days, was that at exit the interpreter waits until all non daemon threads have exited, while daemon threads are killed unceremoniously. There's got to be documentation about that. If it's different in Java, that's too bad -- while we did borrow the thread API from Java, we might have accidentally diverged, but it's at least two decades too late to change.)

@gpshead
Copy link
Member

gpshead commented Feb 22, 2024

#113964

@gpshead gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Feb 22, 2024
ppigazzini added a commit to ppigazzini/fishtest that referenced this issue Apr 4, 2024
With the switch to python 3.12 the log started registering a RuntimeError
due to a race condition.

```
systemd[1]: Stopping Fishtest Server port 6543...
pserve[17032]: flush
pserve[17032]: .....................done
pserve[17032]: Exception in thread Thread-740:
pserve[17032]: Traceback (most recent call last):
pserve[17032]:   File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
pserve[17032]:     self.run()
pserve[17032]:   File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 1431, in run
pserve[17032]:     self.function(*self.args, **self.kwargs)
pserve[17032]:   File "/home/fishtest/fishtest/server/fishtest/rundb.py", line 451, in flush_buffers
pserve[17032]:     self.start_timer()
pserve[17032]:   File "/home/fishtest/fishtest/server/fishtest/rundb.py", line 365, in start_timer
pserve[17032]:     self.timer.start()
pserve[17032]:   File "/home/fishtest/.pyenv/versions/3.12.2/lib/python3.12/threading.py", line 992, in start
pserve[17032]:     _start_new_thread(self._bootstrap, ())
pserve[17032]: RuntimeError: can't create new thread at interpreter shutdown
systemd[1]: Stopped Fishtest Server port 6543.
```

This is the issue opened upstream:
python/cpython#115533
danpetry added a commit to AnacondaRecipes/pytorch-feedstock that referenced this issue Apr 18, 2024
we're getting an issue with the tests:

ValueError: option names {'--shard-id'} already added

This might be due to out-of-tree tests; after some debugging attempt I'm
removing due to time constraints.

The same tests were showing a "RuntimeError: can't create new thread at interpreter shutdown" on python 3.12

This is detailed here python/cpython#115533

From looking at the pytorch code it looks like there's still some polishing of
support for 3.12 happening so will consider this an upstream error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

7 participants