Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when using sqlite3 with subinterpreters #92036

Closed
vstinner opened this issue Apr 28, 2022 · 14 comments
Closed

Crash when using sqlite3 with subinterpreters #92036

vstinner opened this issue Apr 28, 2022 · 14 comments
Labels
3.9 only security fixes 3.10 only security fixes topic-sqlite3 type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

Running sqlite3_crash.py on Python 3.9 and 3.10 fail randomly with an assertion failure in _PyObject_GC_UNTRACK() called by func_dealloc().

The problem is that a function is allocated in an interpreter and deallocated in another interpreter.

I built Python 3.10 with pyobject_ob_interp.patch. When the bug occurs, I get:

  • func_dealloc(): ob_interp = 0x7fffa4001030
  • In a parent frame (_PyObject_VectorcallTstate), tstate->interp = 0x7fffc0001030
@vstinner
Copy link
Member Author

In the main branch, the _sqlite3 extension was converted to the multi-phase initialization API (PEP 489) which made the extension safer to use with subinterpreters.

@vstinner
Copy link
Member Author

See also #90228 which is similar but has been fixed.

@vstinner
Copy link
Member Author

In the 3.10 branch, the script doesn't crash before my change:

commit e6bb17fe29713368e1fd93d9ac9611017c4f570c (HEAD)
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Thu Jan 13 10:50:09 2022 -0800

    bpo-46070: _PyGC_Fini() untracks objects (GH-30577)
    
    
    Py_EndInterpreter() now explicitly untracks all objects currently
    tracked by the GC. Previously, if an object was used later by another
    interpreter, calling PyObject_GC_UnTrack() on the object crashed if
    the previous or the next object of the PyGC_Head structure became a
    dangling pointer.
    (cherry picked from commit 1a4d1c1c9b08e75e88aeac90901920938f649832)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>

@erlend-aasland erlend-aasland added 3.10 only security fixes 3.9 only security fixes labels Apr 28, 2022
@erlend-aasland
Copy link
Contributor

I can reproduce this on my Mac on 3.9 and 3.10.

@vstinner
Copy link
Member Author

It seems like the problem comes from the psyco_adapters dict shared by all interpreters: register_adapter(datetime.date, adapt_date) call in register_adapters_and_converters() of Lib/sqlite3/dbapi2.py.

Traceback when the bug occurs:

(gdb) py-bt
Traceback (most recent call first):
  <built-in method register_adapter of module object at remote 0x7fffd8118950>
  File "/home/vstinner/python/3.10/Lib/sqlite3/dbapi2.py", line 80, in register_adapters_and_converters
    register_adapter(datetime.date, adapt_date)
  File "/home/vstinner/python/3.10/Lib/sqlite3/dbapi2.py", line 85, in <module>
    register_adapters_and_converters()
  <built-in method exec of module object at remote 0x7fffe9bd5d30>
  (...)
  File "/home/vstinner/python/3.10/Lib/sqlite3/__init__.py", line 57, in <module>
    from sqlite3.dbapi2 import *
  <built-in method exec of module object at remote 0x7fffe9bd5d30>
  (...)
  File "<string>", line 2, in <module>
  <built-in method run_in_subinterp of module object at remote 0x7fffea4a72f0>
  File "/home/vstinner/python/3.10/sqlite3_crash.py", line 12, in doIt
    _testcapi.run_in_subinterp(code)
  (...)

(gdb) frame 10
#10 0x00007fffe80c690d in pysqlite_register_adapter_impl (module=<module at remote 0x7fffd8118950>, type=0x7fffe80f3320 <PyDateTime_DateType>, 
    caster=<function at remote 0x7fffe9bf7330>) at /home/vstinner/python/3.10/Modules/_sqlite/module.c:175
175	    rc = pysqlite_microprotocols_add(type, (PyObject*)pysqlite_PrepareProtocolType, caster);

(gdb) frame 9
#9  0x00007fffe80c5dda in pysqlite_microprotocols_add (type=0x7fffe80f3320 <PyDateTime_DateType>, proto=<type at remote 0x7fffa80a9790>, 
    cast=<function at remote 0x7fffe9bf7330>) at /home/vstinner/python/3.10/Modules/_sqlite/microprotocols.c:68
68	    rc = PyDict_SetItem(psyco_adapters, key, cast);

(gdb) info threads
  Id   Target Id                                   Frame 
  1    Thread 0x7ffff7c7b740 (LWP 332120) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6
  3    Thread 0x7fffe9af8640 (LWP 332122) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6
  5    Thread 0x7fffe8af6640 (LWP 332124) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6
* 6    Thread 0x7fffdbfff640 (LWP 332125) "python" 0x00007ffff7d2454c in __pthread_kill_implementation () from /lib64/libc.so.6
  9    Thread 0x7fffda7fc640 (LWP 332128) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6
  10   Thread 0x7fffd9ffb640 (LWP 332129) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6
  16   Thread 0x7fffb67fc640 (LWP 332135) "python" 0x00007ffff7d1f3aa in __futex_abstimed_wait_common () from /lib64/libc.so.6

@vstinner
Copy link
Member Author

In the 3.10 branch, the script doesn't crash before my change

Well, gc_fini_untrack() solves one problem but not all of them :-( I wrote PR #92037 to fix this issue.

Explanation of this bug:

  • (1) Interpreter A is created
  • (2) Interpreter A creates an adapt_date() function and stores it in sqlite3 adapters using _sqlite3.register_adapter() function
  • (3) Interpreter B is created
  • (4) Interpreter B does the same: create an adapt_date() function and overrides the sqlite3 adapters registered by interperter A using _sqlite3.register_adapter() function
  • (5) Interpreter A is deleted
  • (6) Interpreter B is deleted

In this order, everything is fine: the interpreter A still exists when the interpreter B overrides the datetime adapter.

The problem is when the interpreter A is deleted before the interpreter B registers is its own adapter (override the key in the shared dict). In this case, the function is untracked by the GC. Problem: deleting a function requires the function to be tracked by the GC. Othserwise, _PyObject_GC_UNTRACK() assertion fails: this function must only be called on objects called by the GC.

@vstinner
Copy link
Member Author

By the way, when the bug occurs, Python built in debug mode displays this error:

$ ./python sqlite3_crash.py 
Objects/funcobject.c:641: _PyObject_GC_UNTRACK: Assertion "(((PyGC_Head *)(op)-1)->_gc_next != 0)" failed: object not tracked by the garbage collector
Enable tracemalloc to get the memory block allocation traceback

object address  : 0x7faa31e94730
object refcount : 0
object type     : 0x88d660
object type name: function
object repr     : 

But then hangs: _PyObject_Dump() tries to acquire the GIL, but another interpreter is holding it...

@vstinner
Copy link
Member Author

cc @ericsnowcurrently

vstinner added a commit that referenced this issue May 4, 2022
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 4, 2022
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
(cherry picked from commit 1424336)

Co-authored-by: Victor Stinner <vstinner@python.org>
miss-islington added a commit that referenced this issue May 4, 2022
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
(cherry picked from commit 1424336)

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner
Copy link
Member Author

vstinner commented May 4, 2022

I got the confirmation that my change fix Kodi crash: #90619 (comment)

miss-islington added a commit that referenced this issue May 4, 2022
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
(cherry picked from commit 1424336)

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner
Copy link
Member Author

vstinner commented May 4, 2022

Fixed by:

@vstinner vstinner closed this as completed May 4, 2022
@vstinner
Copy link
Member Author

vstinner commented May 5, 2022

I marked #90619 "Kodi crashing" as a duplicate of this bug.

@matthuisman
Copy link

matthuisman commented May 7, 2022

@vstinner
Great work.

Do you know what versions of 3.9 and 3.10 will include this fix?

Oh. Looks like 3.10.5 in a month

@vstinner
Copy link
Member Author

vstinner commented May 9, 2022

Do you know what versions of 3.9 and 3.10 will include this fix?

The next releases. See:

@vstinner
Copy link
Member Author

vstinner commented May 9, 2022

If you are blocked, you might try to detect which deleted objects are causing crashes and keep a copy of them in a list somewhere: #92036 (comment) Or even somehow call Py_IncRef() on them.

hello-adam pushed a commit to hello-adam/cpython that referenced this issue Jun 2, 2022
Fix a crash in subinterpreters related to the garbage collector. When
a subinterpreter is deleted, untrack all objects tracked by its GC.
To prevent a crash in deallocator functions expecting objects to be
tracked by the GC, leak a strong reference to these objects on
purpose, so they are never deleted and their deallocator functions
are not called.
(cherry picked from commit 1424336)

Co-authored-by: Victor Stinner <vstinner@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes 3.10 only security fixes topic-sqlite3 type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants
@vstinner @matthuisman @erlend-aasland and others