Bug report
Bug description:
I ran into this one accidentally (one of the perks of having a name with Polish diacritics is doing QA like this all the time).
Basically, any non-ASCII filename, qualname, module name, or task name seems to come back as a garbage string (NUL bytes etc.), even though non-ASCII identifiers have been valid since PEP 3131.
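I did not investigate it fully, but it's likely caused by hard-coding asciiobject_size as the string data offset in cpython/Modules/_remote_debugging/object_reading.c (lines 82 to 88 in ae6adc9):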
size_t offset = (size_t)unwinder->debug_offsets.unicode_object.asciiobject_size;
res = _Py_RemoteDebug_PagedReadRemoteMemory(&unwinder->handle, address + offset, len, buf);
if (res < 0) {
    set_exception_cause(unwinder, PyExc_RuntimeError, "Failed to read string data from remote memory");
    goto err;
}
buf[len] = '\0';
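To illustrate why a single fixed offset could be a problem, here is a minimal sketch that estimates the header size of compact strings from inside Python. It assumes CPython's compact string layout, where `sys.getsizeof` reports the header plus the characters plus a terminating NUL; the helper name `header_size` is made up for this example and is not part of any CPython API.

```python
import sys


def header_size(sample_char: str, char_width: int, n: int = 8) -> int:
    """Estimate the header size of a compact str object (CPython only).

    Assumption: a compact string stores n characters plus one terminating
    NUL, each char_width bytes wide, directly after its header.
    """
    s = sample_char * n  # built at runtime, so no cached UTF-8 copy is attached
    return sys.getsizeof(s) - char_width * (n + 1)


# Pure ASCII: the character data follows the smaller ASCII header.
ascii_header = header_size("a", 1)

# U+017C (z with dot above) needs 2 bytes per character, and the string
# carries the larger non-ASCII compact header.
ucs2_header = header_size(chr(0x17C), 2)

print("ASCII header:    ", ascii_header)
print("non-ASCII header:", ucs2_header)
```

If the non-ASCII header comes out larger, as expected on CPython, then reading at the ASCII header size lands inside the header of a non-ASCII string rather than on its character data.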
PS. If you're wondering what zażółć gęślą jaźń is: it's a Polish test phrase that contains all nine Polish diacritic letters.
Reproduction
2026-05-01T11:19:14.212125000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) % cat /tmp/zażółćgęśląjaźń.py
def zażółć_gęślą_jaźń():
    s = 0
    for i in range(10_000_000):
        s += i * i
    return s

def główna():
    for _ in range(20):
        zażółć_gęślą_jaźń()

główna()
2026-05-01T11:19:14.489608000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) % sudo ./python.exe -m profiling.sampling run --collapsed -o /tmp/profile.folded /tmp/zażółćgęśląjaźń.py
Captured 7,135 samples in 7.14 seconds
Sample rate: 999.98 samples/sec
Error rate: 0.07
Collapsed stack output written to /tmp/profile.folded
2026-05-01T11:19:23.243613000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) % head -n 10 /tmp/profile.folded
tid:12268318;<frozen runpy>:_run_module_as_main:196;<frozen runpy>:_run_code:87;tmp:<module>:11;tmp::9;tmp:z:4 6250
tid:12268318;<frozen runpy>:_run_module_as_main:196;<frozen runpy>:_run_code:87;tmp:<module>:11;tmp::9;tmp:z:3 880
2026-05-01T11:19:28.606723000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) %
or:
2026-05-01T11:20:40.003826000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) % cat /tmp/zażółć3.py
import _remote_debugging, os

def zażółć():
    u = _remote_debugging.RemoteUnwinder(os.getpid(), all_threads=True)
    for interp in u.get_stack_trace():
        for tid, _, frames in interp[1]:
            for f in frames:
                print(repr(f[0]), '|', repr(f[2]))

zażółć()
2026-05-01T11:20:44.723466000+0200 maurycy@gimel /Users/maurycy/src/github.com/maurycy/cpython (main c0e0640) % ./python.exe /tmp/zażółć3.py
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' | '\x00\x00\x00\x00\x00\x00'
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' | '<module>'
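The NUL runs above are exactly as long as the strings they replace: 15 characters for /tmp/zażółć3.py and 6 for zażółć. That would be consistent with reading `len` bytes from the wrong place. The sketch below mimics such a read inside a single process with `ctypes`. It is an illustration only, not the unwinder's code: it assumes CPython (where `id()` is the object's address), and `read_as_ascii` is a hypothetical helper.

```python
import ctypes
import sys

# Assumption: on CPython, sys.getsizeof("") is the ASCII string header
# plus one terminating NUL byte.
ASCII_HEADER = sys.getsizeof("") - 1


def read_as_ascii(s: str) -> bytes:
    """Read len(s) bytes right after the ASCII header, ignoring the string's kind."""
    # Assumption: id(s) is the memory address of the object (CPython only).
    return ctypes.string_at(id(s) + ASCII_HEADER, len(s))


# Both names are built at runtime so they are plain, freshly created strings.
ascii_name = "".join(["mod", "ule"])
polish_name = "".join(["za", chr(0x17C), chr(0xF3), chr(0x142), chr(0x107)])

print(repr(read_as_ascii(ascii_name)))   # the real characters of the ASCII name
print(repr(read_as_ascii(polish_name)))  # header bytes, not the characters
```

For the ASCII name the read returns the actual characters. For the non-ASCII name it returns whatever sits in the extra header fields, which for a fresh string is typically zeroed.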
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS
Linked PRs