Global String Objects are Interned Only in the First Interpreter #106931

ericsnowcurrently · 2023-07-20T21:30:20Z

When a string object is interned via _PyUnicode_InternInPlace(), its "state.interned" field is set. Afterward, subsequent calls to _PyUnicode_InternInPlace() will skip that string. The problem is that some strings may be used in multiple interpreters, which each have their own interned dict. The string is shared between the interpreters, along with its "state.interned" field. That means the string will only be interned in the first interpreter where _PyUnicode_InternInPlace() is called (ignoring races in the function, e.g. gh-106930).

We need to fix it so one of the following is true:

there should be one global interned "dict" shared by all interpreters (we tried this already and it is very tricky)
strings are always interned in every interpreter, regardless of the "state.interned" value

Linked PRs

The text was updated successfully, but these errors were encountered:

ericsnowcurrently · 2023-07-20T21:30:36Z

CC @kumaraditya303

kumaraditya303 · 2023-07-21T09:30:57Z

I prefer option 1 and would like to try make it work after I am in better health.

ericsnowcurrently · 2023-07-21T14:28:27Z

I likewise prefer the first option, though with some other data structure than a dict.

ericsnowcurrently · 2023-07-25T15:52:53Z

Solutions I've considered:

use a global _Py_hashtable for statically allocated strings and keep the per-interpreter dict for strings allocated by each interpreter
keep all per-interpreter dicts in sync, making a copy of the string in each one (or no copy if statically allocated)
use a global _Py_hashtable for all strings; store only main-interpreter-owned strings (i.e. copy if needed)

We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.

…gh-107272) We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter. (cherry picked from commit b72947a) Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>

Skip subinterpreter tests when not supported.

ericsnowcurrently · 2023-07-28T16:42:07Z

This is done.

FYI, I didn't backport the fix to 3.12, given how late in the release cycle we are and the reservations of the release manager.

vstinner · 2023-08-30T15:33:19Z

I reopen the issue.

Using tracemalloc, test_sys now fails with a fatal error since commit b72947a:

$ git checkout b72947a8d26915156323ccfd04d273199ecb870c
$ make
$ ./python -X tracemalloc -m test test_sys -m test_subinterp_intern_dynamically_allocated
(...)
Debug memory block at address p=0x7f2ae1b16f70: API ''
    15987177244304867328 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0x00 *** OUCH
        at p-6: 0x00 *** OUCH
        at p-5: 0x00 *** OUCH
        at p-4: 0x00 *** OUCH
        at p-3: 0x00 *** OUCH
        at p-2: 0x00 *** OUCH
        at p-1: 0x00 *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0xddde5c2ae1b16f70 are Fatal Python error: Segmentation fault

Current thread 0x00007f2aef939740 (most recent call first):
  <no Python frame>

gdb traceback:

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x0000000000561014 in _PyObject_DebugDumpAddress (p=0x7fffe9ebaf70) at Objects/obmalloc.c:2409
2409	        if (tail[i] != PYMEM_FORBIDDENBYTE) {
Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-13.fc38.x86_64 glibc-2.37-4.fc38.x86_64 libffi-3.4.4-2.fc38.x86_64 openssl-libs-3.0.9-2.fc38.x86_64 xz-libs-5.4.1-1.fc38.x86_64 zlib-1.2.13-3.fc38.x86_64

(gdb) where
#0  0x0000000000561014 in _PyObject_DebugDumpAddress (p=0x7fffe9ebaf70) at Objects/obmalloc.c:2409
#1  0x0000000000560cfd in _PyMem_DebugCheckAddress (func=0x7e8850 <__func__.8> "_PyMem_DebugRawFree", api=114 'r', p=0x7fffe9ebaf70)
    at Objects/obmalloc.c:2327
#2  0x000000000056082c in _PyMem_DebugRawFree (ctx=0xb15920 <_PyRuntime+512>, p=0x7fffe9ebaf70) at Objects/obmalloc.c:2160
#3  0x000000000055e202 in PyMem_RawFree (ptr=0x7fffe9ebaf70) at Objects/obmalloc.c:686
#4  0x00000000005603c0 in _PyObject_Free (ctx=0x0, p=0x7fffe9ebaf70) at Objects/obmalloc.c:1854
#5  0x0000000000560872 in _PyMem_DebugRawFree (ctx=0xb15980 <_PyRuntime+608>, p=0x7fffe9ebaf80) at Objects/obmalloc.c:2164
#6  0x0000000000560c63 in _PyMem_DebugFree (ctx=0xb15980 <_PyRuntime+608>, ptr=0x7fffe9ebaf80) at Objects/obmalloc.c:2297
#7  0x000000000055e58a in PyObject_Free (ptr=0x7fffe9ebaf80) at Objects/obmalloc.c:831
#8  0x00000000005b03e0 in unicode_dealloc (unicode=<unknown at remote 0x7fffe9ebaf80>) at Objects/unicodeobject.c:1613
#9  0x000000000055cfdf in _Py_Dealloc (op=<unknown at remote 0x7fffe9ebaf80>) at Objects/object.c:2615
#10 0x00000000006fe895 in Py_DECREF (filename=0x8e8cce "Python/tracemalloc.c", lineno=775, op=<unknown at remote 0x7fffe9ebaf80>) at ./Include/object.h:699
#11 0x00000000006ffeb0 in tracemalloc_clear_filename (value=0x7fffe9ebaf80) at Python/tracemalloc.c:775
#12 0x00000000006a369e in _Py_hashtable_destroy_entry (ht=0xbdbf10, entry=0x18d0890) at Python/hashtable.c:383
#13 0x00000000006a3724 in _Py_hashtable_clear (ht=0xbdbf10) at Python/hashtable.c:399
#14 0x00000000006fff4e in tracemalloc_clear_traces () at Python/tracemalloc.c:795
#15 0x0000000000700373 in _PyTraceMalloc_Stop () at Python/tracemalloc.c:971
#16 0x0000000000700106 in tracemalloc_deinit () at Python/tracemalloc.c:870
#17 0x0000000000700cf1 in _PyTraceMalloc_Fini () at Python/tracemalloc.c:1344
#18 0x00000000006d49f6 in Py_FinalizeEx () at Python/pylifecycle.c:1895
#19 0x00000000007115d0 in Py_RunMain () at Modules/main.c:690
#20 0x000000000071165f in pymain_main (args=0x7fffffffde10) at Modules/main.c:718
#21 0x00000000007116d9 in Py_BytesMain (argc=8, argv=0x7fffffffdf78) at Modules/main.c:742
#22 0x000000000041b836 in main (argc=8, argv=0x7fffffffdf78) at ./Programs/python.c:15

(gdb) frame 11
#11 0x00000000006ffeb0 in tracemalloc_clear_filename (value=0x7fffe9ebaf80) at Python/tracemalloc.c:775
775	    Py_DECREF(filename);

(gdb) p /x *filename
$4 = {
  _ob_next = <unknown at remote 0xdddddddddddddddd>,
  _ob_prev = <unknown at remote 0xdddddddddddddddd>,
  {
    ob_refcnt = 0xdddddddddddddddd,
    ob_refcnt_split = {0xdddddddd, 0xdddddddd}
  },
  ob_type = 0xdddddddddddddddd
}

…ythongh-107272) We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter. (cherry-picked from commit b72947a)

…7272) (gh-110713) We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter. (cherry-picked from commit b72947a)

We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter. (cherry-picked from commit b72947a)

We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict. Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter. (cherry-picked from commit b72947a) (cherry-picked from commit 3e6b328)

ericsnowcurrently added type-bug An unexpected behavior, bug, or error topic-subinterpreters 3.12 bugs and security fixes 3.13 new features, bugs and security fixes labels Jul 20, 2023

bedevere-bot mentioned this issue Jul 25, 2023

gh-106931: Intern Statically Allocated Strings Globally #107272

Merged

bedevere-bot mentioned this issue Jul 27, 2023

[3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) #107358

Closed

bedevere-bot mentioned this issue Jul 27, 2023

gh-106931: Fix the WASM Buildbots #107362

Merged

ericsnowcurrently added a commit that referenced this issue Jul 27, 2023

gh-106931: Fix the WASM Buildbots (gh-107362)

2f9bb77

Skip subinterpreter tests when not supported.

ericsnowcurrently closed this as completed Jul 28, 2023

vstinner reopened this Aug 30, 2023

bedevere-app bot mentioned this issue Oct 11, 2023

[3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) #110713

Merged

ericsnowcurrently closed this as completed Nov 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Global String Objects are Interned Only in the First Interpreter #106931

Global String Objects are Interned Only in the First Interpreter #106931

ericsnowcurrently commented Jul 20, 2023 •

edited by bedevere-app bot

ericsnowcurrently commented Jul 20, 2023

kumaraditya303 commented Jul 21, 2023

ericsnowcurrently commented Jul 21, 2023

ericsnowcurrently commented Jul 25, 2023

ericsnowcurrently commented Jul 28, 2023

vstinner commented Aug 30, 2023

Global String Objects are Interned Only in the First Interpreter #106931

Global String Objects are Interned Only in the First Interpreter #106931

Comments

ericsnowcurrently commented Jul 20, 2023 • edited by bedevere-app bot

Linked PRs

ericsnowcurrently commented Jul 20, 2023

kumaraditya303 commented Jul 21, 2023

ericsnowcurrently commented Jul 21, 2023

ericsnowcurrently commented Jul 25, 2023

ericsnowcurrently commented Jul 28, 2023

vstinner commented Aug 30, 2023

ericsnowcurrently commented Jul 20, 2023 •

edited by bedevere-app bot