Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checking for sys.is_finalizing before thread.start() still yields "can't create new thread at interpreter shutdown" #114570

Open
Jibola opened this issue Jan 25, 2024 · 11 comments
Labels
3.12 bugs and security fixes topic-asyncio topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@Jibola
Copy link

Jibola commented Jan 25, 2024

Bug report

Bug description:

We have code in pymongo Our MongoClient object immediately spawns a new daemon thread to run some periodic check operations from the moment it is initialized. When run on python 3.12, we've run into the issue can't create new thread at interpreter shutdown because we have cases thread.start() that can execute as the interpreter is shutting down.

To mitigate this issue we attempted checking sys.is_finalizing before issuing the start() call:

if not sys.is_finalizing():
    thread.start()

This still generated the same runtime error.

We tried catching the RuntimeError exception:

try:
    thread.start()
except RuntimeError:
    pass

This worked. This feels too broad a solution than the above since it will catch and ignore all RuntimeErrors.

Finally we decided to see if we could check sys.is_finalizing after the exception is raised.

try:
    thread.start()
except RuntimeError:
    if sys.is_finalizing():
        self._thread = None
        return
    raise

To our surprise, this also failed. I can understand that starting a thread on interpreter shutdown should be avoided, but this also feels misleading if the system isn't correctly seeing--which is my assumption on what sys.is_finalizing()--that it is experiencing interpreter shutdown.

Should the expectation not be that sys.is_finalizing is updated before the thread.start() is called OR by the time the RuntimeError is raised? This way we can safely catch or preempt the error before triggering?

As a general inquiry, though, what should be the general best practice for patching issues like this on our end as it has introduced somewhat breaking behavior from 3.11 -> 3.12? Should we do like above with catching the RuntimeError but additionally check the exception string for the exact type of RuntimeError?

CPython versions tested on:

3.12

Operating systems tested on:

macOS

Linked PRs

@serhiy-storchaka
Copy link
Member

There are few stages of shutdown.

New threads cannot be started and the process cannot be forked at interpreter shutdown. But there is also global runtime shutdown, and sys.is_finalizing() returns true only at this stage.

Perhaps sys.is_finalizing() should be changed to report the interpreter shutdown? Or we need two different functions? The only use of sys.is_finalizing() in the stdlib is in asyncio, added in 4a02543 (bpo-26133/gh-70321), and I am not sure whether it should mean an interpreter shutdown or the runtime shutdown.

cc @vstinner @ericsnowcurrently @asvetlov @gvanrossum

@vstinner
Copy link
Member

This worked. This feels too broad a solution than the above since it will catch and ignore all RuntimeErrors.

Previsously, I proposed adding a new PythonFinalizationError exception for that: #109809

Would adding a new more specific PythonFinalizationError exception solve your issue?

The problem with checking for sys.is_finalizing before spawning a thread is that another thread can start to shut down Python after sys.is_finalizing check and before spawning the thread: it's the known Time-of-check to time-of-use race condition.

It was related to #109047 but I decided to fix the multiprocessing bug without PythonFinalizationError, since I had to backport the change to stable versions which would not be get a new PythonFinalizationError exception type anyway.

@serhiy-storchaka
Copy link
Member

Yes, it should be used to check the cause of error after it happens. But it can also be used in optimistic check to prevent costly operations at shutdown. For example:

if not is_finalising():
    try:
        do_something_expensive()
    except CantDoError:
        if is_finalising():
            # okay
        else:
            # kaboom!

The problem in this issue is that sys.is_finalizing() does not give an answer to a particular question. It gives an answer to a different question.

@Jibola
Copy link
Author

Jibola commented Jan 31, 2024

Would adding a new more specific PythonFinalizationError exception solve your issue?

This would solve my issue pretty well. I will also add that being able to preemptively check like @serhiy-storchaka provided in the example would also be a great option.

@ShaneHarvey
Copy link
Contributor

The problem with checking for sys.is_finalizing before spawning a thread is that another thread can start to shut down Python after sys.is_finalizing check and before spawning the thread: it's the known Time-of-check to time-of-use race condition.

The problem is that checking sys.is_finalizing after catching the thread.start RuntimeError does not work either; sys.is_finalizing returns False even though thread.start says the interpreter is shutting down.

@ericsnowcurrently
Copy link
Member

Perhaps sys.is_finalizing() should be changed to report the interpreter shutdown? Or we need two different functions?

I'd be in favor of making it report the current interpreter's status. That could make use of interp->finalizing, which is set during finalization before we do anything else. That alone would be an improvement over the status quo.

We could add something like sys.is_global_finalizing() if needed, but it would only matter for subinterpreters and would only matter for features that are specific to the main interpreter (e.g. signals). It may be worth reporting _PyInterpreterState_Main()->finalizing, instead of Py_IsFinalizing() like we do now.

@ericsnowcurrently
Copy link
Member

tl;dr We should at least make sys.is_finalizing() report interp->finalizing instead of Py_IsFinalizing(). Something like PythonFinalizationError may be worth adding too.

Finally we decided to see if we could check sys.is_finalizing after the exception is raised.

try:
    thread.start()
except RuntimeError:
    if sys.is_finalizing():
        self._thread = None
        return
    raise

To our surprise, this also failed.

The problem is that sys.is_finalizing() doesn't check the same thing that Thread.start() does.

(expand for extra info)

This is partly because we are a bit inconsistent about how we report "finalizing":

  • global runtime:
    • sys.is_finalizing() returns what Py_IsFinalizing() returns
    • Py_IsFinalizing() checks _PyRuntimeState_GetFinalizing(), which is set well after finalization actually starts
    • we have _PyRuntimeState_GetFinalizingID(), which lines up with _PyRuntimeState_GetFinalizing()
  • per-interpreter
    • interp->finalizing is a boolean set at the very beginning of finalization
    • _PyInterpreterState_GetFinalizing(), which is the per-interpreter version of _PyRuntimeState_GetFinalizing()
    • ditto for _PyInterpreterState_GetFinalizingID()

When the runtime (ergo main interpreter) is finalizing, we do the following:

  1. set interp->finalizing to false (for the main interpreter)
  2. wait for all non-daemon Python threads to finish
  3. call all remaining pending calls
  4. call all registered atexit handlers
  5. call _PyInterpreterState_SetFinalizing() for the main interpreter, at which point _PyInterpreterState_GetFinalizing() and _PyInterpreterState_GetFinalizingID() finally start saying that we're finalizing
  6. do likewise with _PyRuntimeState_SetFinalizing()

(The same thing happens for all other subinterpreters except step 6.)

So, back to the reported problem, there are two different definitions of "finalizing" at play.

Thread.start() is checking interp->finalizing (indirectly 1), which is set before anything else happens in finalization. This contrasts with sys.is_finalizing(), which returns true much later in finalization. That window of overlap is where you are seeing the difference.

Possible solutions:

  • (Victor) add something like PythonFinalizationError
  • (Serhiy) make sys.is_finalizing() report the value of interp->finalizing, instead of Py_IsFinalizing() (atexit hooks can make this tricky)

Note that each of those solutions would only help with 3.13+, and not with 3.12 and earlier. In the meantime, I suppose you could get fancy with ctypes in the meantime (though I wouldn't recommend it), or you could access the right information with a custom extension module.

Footnotes

  1. The direct path is: Thread.start() -> _thread.start_joinable_thread()->do_start_new_thread()(_threadmodule.c) ->interp->finalizing`.

@ericsnowcurrently
Copy link
Member

It would be less controversial to add _thread.ThreadsFinalizingError (exposed by threading module).

I'd like to see threading.is_finalizing().

More generally, I'd really like to have something like sys.features().threads == 'finalizing', where other features (e.g. filesystem, external imports) are reported and where other values are 'disabled', 'enabled', 'unavailable', etc. Then elsewhere in the stdlib, like the threading module, can use the info from sys and even wrap it in something like threading.is_finalizing().

Also see: gh-110490

vstinner added a commit to vstinner/cpython that referenced this issue Feb 12, 2024
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* subprocess.Popen at fork
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
vstinner added a commit to vstinner/cpython that referenced this issue Feb 12, 2024
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
@vstinner
Copy link
Member

As a first step forward, I wrote a new PR gh-115352 to add PythonFinalizationError exception.

@gpshead gpshead added the 3.12 bugs and security fixes label Feb 13, 2024
vstinner added a commit to vstinner/cpython that referenced this issue Feb 14, 2024
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
vstinner added a commit that referenced this issue Feb 14, 2024
Add PythonFinalizationError exception. This exception derived from
RuntimeError is raised when an operation is blocked during the Python
finalization.

The following functions now raise PythonFinalizationError, instead of
RuntimeError:

* _thread.start_new_thread()
* subprocess.Popen
* os.fork()
* os.fork1()
* os.forkpty()

Morever, _winapi.Overlapped finalizer now logs an unraisable
PythonFinalizationError, instead of an unraisable RuntimeError.
@vstinner
Copy link
Member

As a first step forward, I wrote a new PR #115352 to add PythonFinalizationError exception.

Done with change 3e7b7df

@colesbury
Copy link
Contributor

I've put up #116677, which changes the guards for fork and starting new threads to use _PyInterpreterState_GetFinalizing(). Before then, it's safe to start new threads and trying to prevent it leads to regressions and flaky tests. For example, it's the cause of #112542 (and before that #109047).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes topic-asyncio topic-subinterpreters type-bug An unexpected behavior, bug, or error
Projects
Status: Todo
Status: Todo
Development

No branches or pull requests

7 participants