Setting long domain of locale.dgettext() crashes Python interpreter #87765

Open
xxm mannequin opened this issue Mar 23, 2021 · 4 comments
Labels: 3.10 (only security fixes), stdlib (Python modules in the Lib dir), type-crash (a hard crash of the interpreter, possibly with a core dump)

Comments

@xxm
Mannequin

xxm mannequin commented Mar 23, 2021

BPO 43599
Nosy @tiran, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-03-23.01:54:58.736>
labels = ['library', '3.10', 'type-crash']
title = 'Setting long domain of locale.dgettext() crashes Python interpreter'
updated_at = <Date 2021-04-06.07:25:32.469>
user = 'https://bugs.python.org/xxm'

bugs.python.org fields:

activity = <Date 2021-04-06.07:25:32.469>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2021-03-23.01:54:58.736>
creator = 'xxm'
dependencies = []
files = []
hgrepos = []
issue_num = 43599
keywords = []
message_count = 4.0
messages = ['389363', '390279', '390282', '390285']
nosy_count = 3.0
nosy_names = ['christian.heimes', 'serhiy.storchaka', 'xxm']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue43599'
versions = ['Python 3.10']

@xxm
Mannequin Author

xxm mannequin commented Mar 23, 2021

Passing a very long string as the first argument (the domain) of locale.dgettext() crashes the Python interpreter.

======================================================

Python 3.10.0a6 (default, Mar 19 2021, 11:45:56) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale;locale.dgettext('abs'*10000000,'')
Segmentation fault (core dumped)

======================================================

System: Ubuntu 16.04

BTW, the API of the locale module seems to be inconsistent between Ubuntu and Mac OS. E.g., locale.dgettext() is not available in Python on Mac OS.
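
The platform difference mentioned above can be checked at run time. The following is a minimal sketch: the helper name `has_dgettext` and the domain `"messages"` are illustrative and not part of the report. It relies only on the documented behavior that the locale module exposes the C library's gettext interface on systems that provide it.

```python
import locale


def has_dgettext() -> bool:
    """Return True if this Python build exposes locale.dgettext()."""
    return hasattr(locale, "dgettext")


if has_dgettext():
    # A short domain name is harmless; with no catalog bound to the domain,
    # the message is normally returned untranslated.
    print(locale.dgettext("messages", "hello"))
else:
    print("locale.dgettext() is not available on this platform")
```

Code that must run on both platforms may prefer the pure-Python gettext module, which does not depend on the C library's implementation.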

@xxm xxm mannequin added the 3.10 (only security fixes), stdlib (Python modules in the Lib dir), and type-crash (a hard crash of the interpreter, possibly with a core dump) labels Mar 23, 2021
@xxm
Mannequin Author

xxm mannequin commented Apr 6, 2021

Attached are the results of running the reproducer under gdb and valgrind. (No error is reported for locale.dgettext('abs'*10,'').)

$gdb ./python
(gdb) run
>>> locale.dgettext('abs'*10000000,'')

Program received signal SIGSEGV, Segmentation fault.
__dcigettext (
domainname=domainname@entry=0xadb030 "absabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsab"..., msgid1=msgid1@entry=0x7ffff7fc09a0 "", msgid2=msgid2@entry=0x0,
plural=plural@entry=0, n=n@entry=0, category=category@entry=5) at dcigettext.c:675
675 dcigettext.c: No such file or directory.
(gdb)

valgrind
~$ PYTHONMALLOC=malloc_debug valgrind python
Memcheck, a memory error detector
==4870== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4870== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==4870== Command: /home/xxm/Desktop/apifuzz/Python-3.10.0a6/python
==4870== 
Python 3.10.0a6 (default, Mar 19 2021, 11:45:56) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> locale.dgettext('abs'*10000000,'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'locale' is not defined
>>> import locale
>>> locale.dgettext('abs'*10000000,'')
==4870== Warning: client switching stacks?  SP change: 0x1ffefff5c0 --> 0x1ffd363220
==4870==          to suppress, use: --max-stackframe=30000032 or greater
==4870== Invalid write of size 8
==4870==    at 0x5797E88: __dcigettext (dcigettext.c:675)
==4870==  Address 0x1ffd363218 is on thread 1's stack
==4870== 
==4870== 
==4870== Process terminating with default action of signal 11 (SIGSEGV)
==4870==  Access not within mapped region at address 0x1FFD363218
==4870==    at 0x5797E88: __dcigettext (dcigettext.c:675)
==4870==  If you believe this happened as a result of a stack
==4870==  overflow in your program's main thread (unlikely but
==4870==  possible), you can try to increase the size of the
==4870==  main thread stack using the --main-stacksize= flag.
==4870==  The main thread stack size used in this run was 8388608.
==4870== Invalid write of size 8
==4870==    at 0x4A2867A: _vgnU_freeres (vg_preloaded.c:57)
==4870==  Address 0x1ffd363210 is on thread 1's stack
==4870== 
==4870== 
==4870== Process terminating with default action of signal 11 (SIGSEGV)
==4870==  Access not within mapped region at address 0x1FFD363210
==4870==    at 0x4A2867A: _vgnU_freeres (vg_preloaded.c:57)
==4870==  If you believe this happened as a result of a stack
==4870==  overflow in your program's main thread (unlikely but
==4870==  possible), you can try to increase the size of the
==4870==  main thread stack using the --main-stacksize= flag.
==4870==  The main thread stack size used in this run was 8388608.
==4870== 
==4870== HEAP SUMMARY:
==4870==     in use at exit: 35,310,749 bytes in 35,706 blocks
==4870==   total heap usage: 87,221 allocs, 51,515 frees, 44,733,752 bytes allocated
==4870== 
==4870== LEAK SUMMARY:
==4870==    definitely lost: 0 bytes in 0 blocks
==4870==    indirectly lost: 0 bytes in 0 blocks
==4870==      possibly lost: 35,173,680 bytes in 34,899 blocks
==4870==    still reachable: 137,069 bytes in 807 blocks
==4870==         suppressed: 0 bytes in 0 blocks
==4870== Rerun with --leak-check=full to see details of leaked memory
==4870== 
==4870== For lists of detected and suppressed errors, rerun with: -s
==4870== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
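
For anyone who wants to retry the reproducer without losing their interactive session, here is a sketch that runs the call in a child interpreter and reports how the child ended. The helper names `run_isolated` and `describe` are hypothetical, and the negative-return-code convention for signals is POSIX-specific. Whether the child actually crashes depends on the C library in use.

```python
import signal
import subprocess
import sys

# Code executed in the child interpreter; the repeat count comes from argv.
CHILD_CODE = """
import locale, sys
n = int(sys.argv[1])
if hasattr(locale, "dgettext"):
    locale.dgettext("abs" * n, "")
"""


def run_isolated(repeat: int) -> int:
    """Call locale.dgettext('abs' * repeat, '') in a child process.

    Returns the child's return code; on POSIX a negative value -N means
    the child was terminated by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, str(repeat)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    for repeat in (10, 10_000_000):
        print(repeat, describe(run_isolated(repeat)))
```

On a system that shows the behavior reported here, the second line of output would be expected to mention SIGSEGV, while the parent process keeps running.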

@tiran
Member

tiran commented Apr 6, 2021

The crash occurs inside glibc's dgettext() implementation. Its man page does not list any limitation for domain or msgid length. This looks like a bug in glibc.

#0  0x00007ffff7c57a8f in __dcigettext () from /lib64/libc.so.6
#1  0x000000000058a235 in _locale_dgettext_impl (in=0x7fffea64d8e0 "", 
    domain=0x7fffe874e040 "absabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsabsab"..., module=<optimized out>) at ./Modules/_localemodule.c:662

@serhiy-storchaka
Member

__dcigettext() contains:

  domainname_len = strlen (domainname);
  xdomainname = (char *) alloca (strlen (categoryname)
				 + domainname_len + 5);

It tries to allocate a buffer on the stack, and a very long domain name causes a stack overflow.
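
As a rough worked example of the sizes involved, the sketch below plugs the reproducer's domain length into the alloca() expression quoted above. It assumes the category name is "LC_MESSAGES" (the gdb output shows category=5, which is taken here to be LC_MESSAGES), and it uses the main thread stack size that valgrind reported. The function name `alloca_request` is illustrative.

```python
# Assumption: category=5 in the gdb output corresponds to "LC_MESSAGES".
CATEGORY_NAME = "LC_MESSAGES"

# Main thread stack size reported by valgrind in the earlier comment.
REPORTED_STACK_SIZE = 8_388_608


def alloca_request(domain_len: int, category_name: str = CATEGORY_NAME) -> int:
    """Bytes requested by: alloca(strlen(categoryname) + domainname_len + 5)."""
    return len(category_name) + domain_len + 5


domain_len = len("abs" * 10_000_000)  # 30,000,000 characters, all ASCII
requested = alloca_request(domain_len)

print(requested)                        # 30000016
print(requested > REPORTED_STACK_SIZE)  # True
```

The request is more than three times the reported 8 MiB stack, which is consistent in magnitude with valgrind's suggestion of --max-stackframe=30000032.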

There is no portable way to recover from a stack overflow or to check for one ahead of time. We could add an arbitrary limit on the length of the domain name, but it would not guarantee anything. This is just one more way to crash Python from Python code.
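
For illustration only, an arbitrary limit of the kind discussed above could look like the following user-level wrapper. The name `checked_dgettext` and the value of `MAX_DOMAIN_LEN` are hypothetical; neither is a CPython or glibc constant, and, as noted, such a cap does not guarantee that the stack allocation succeeds.

```python
import locale

# Hypothetical, arbitrary cap on the encoded domain length (in bytes).
MAX_DOMAIN_LEN = 255


def checked_dgettext(domain, message):
    """Call locale.dgettext() after rejecting very long domain names.

    This only narrows the window for the crash; it cannot prove that the
    C library's stack allocation will fit in the remaining stack space.
    """
    if domain is not None and len(domain.encode("utf-8")) > MAX_DOMAIN_LEN:
        raise ValueError(
            f"domain name longer than {MAX_DOMAIN_LEN} bytes (arbitrary limit)"
        )
    return locale.dgettext(domain, message)
```

With this wrapper, `checked_dgettext('abs' * 10000000, '')` raises ValueError before the C function is reached, while short domain names are passed through unchanged on platforms where locale.dgettext() exists.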

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022