Various segfaults with dict #72132
Here I'll describe five distinct issues I found. Common to them all is that they reside in /Objects/dictobject.c. Four of them are use-after-frees and one is an array-out-of-bounds indexing bug.

Issue 1: use-after-free when initializing a dictionary

Initialization of a dictionary happens via the function dict_init, which for a sequence of key/value pairs ends up calling PyDict_MergeFromSeq2. In PyDict_MergeFromSeq2 we retrieve a sequence of size 2 with this line:

    fast = PySequence_Fast(item, "");

After checking its size, we take out a key and value:

    key = PySequence_Fast_GET_ITEM(fast, 0);
    value = PySequence_Fast_GET_ITEM(fast, 1);

Then we call PyDict_GetItem. This calls back into Python code if the key has a user-defined __hash__ method, and that code can empty the pair, dropping the last references to key and value before they are used. Here's a PoC:

---
class X:
    def __hash__(self):
        pair[:] = []
        return 13

pair = [X(), 123]
dict([pair])
---

It crashes while trying to use freed memory as a PyObject:

    (gdb) run ./poc24.py

Issue 2: use-after-free in dictitems_contains

In the function dictitems_contains we call PyDict_GetItem to look up a value in the dict:

    found = PyDict_GetItem((PyObject *)dv->dv_dict, key);

However this "found" variable is borrowed. We then go ahead and compare it with PyObject_RichCompareBool. But PyObject_RichCompareBool could call back into Python code and e.g. release the last reference to "found". Then, inside PyObject_RichCompareBool (actually in do_richcompare), the "found" object is used after it has been freed. PoC:

---
class X:
def __eq__(self, other):
d.clear()
return NotImplemented
d = {0: set()}
(0, X()) in d.items()
---

Result:

    (gdb) run ./poc25.py

Issue 3: use-after-free in dict_equal

In the function dict_equal, we call the "lookdict" function to find one dict's key in the other dict. This value's address is stored into the "vaddr" variable, and the value is only borrowed, not increfed. Then we call Py_DECREF(key), which can call back into Python code via __del__. That code can mutate the dict and free the value before we compare it:

    cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);

This results in a use-after-free. PoC:

---
class X():
def __del__(self):
dict_b.clear()
def __eq__(self, other):
dict_a.clear()
return True
def __hash__(self):
return 13
dict_a = {X(): 0}
dict_b = {X(): X()}
dict_a == dict_b
---

Result:

    (gdb) run ./poc26.py

Issue 4: use-after-free in _PyDict_FromKeys

The function _PyDict_FromKeys takes an iterable as argument. If the iterable is a dict, we iterate over it while inserting into the new dict:

    while (_PyDict_Next(iterable, &pos, &key, &oldvalue, &hash)) {
        if (insertdict(mp, key, hash, value)) {
            ...
        }
    }

However, if we look at the comment for PyDict_Next, we see that it warns the dict must not be mutated during iteration. But insertdict can call on to Python code which might mutate the dict. In that case the saved iteration state refers to freed memory. Here's a PoC:

---
class X(int):
def __hash__(self):
return 13
def __eq__(self, other):
if len(d) > 1:
d.clear()
return False
d = {}
d = {X(1): 1, X(2): 2}
x = {}.fromkeys(d)
---

And the result:

    (gdb) run ./poc27.py

An almost identical issue also exists further down in the function, in the loop handling a set argument. If we instead use:

    d = set()
    d = set([X(1), X(2)])

this likewise crashes with a use-after-free.

(Note: if you grep for PyDict_Next you will find more similar cases, although they are not all as easy to trigger. For example:

---
class X(str):
def __hash__(self):
d.clear()
return 13
d = {}
d[X()] = X()
e = Exception()
e.__setstate__(d)
---

end note.)

Issue 5: out-of-bounds indexing in dictiter_iternextitem

The function dictiter_iternextitem is used to iterate over a dictionary's items. When the result tuple can be reused, its old contents are released first:

    Py_DECREF(PyTuple_GET_ITEM(result, 0));
    Py_DECREF(PyTuple_GET_ITEM(result, 1));

This can execute Python code and mutate the dict. If that happens, the index "i" can be out of bounds when we read the next key:

    key = d->ma_keys->dk_entries[i].me_key;

Furthermore the "value_ptr" variable would have gone stale, too. Taking the value through it then reads through a dangling pointer.

Here's a PoC which crashes with the "value" variable being an arbitrary pointer:

---
class X(int):
def __del__(self):
d.clear()
d = {i: X(i) for i in range(8)}
for result in d.items():
if result[0] == 2:
d[2] = None  # free d[2] --> X(2).__del__ is called
---

The result:

    (gdb) run ./poc29.py |
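Editor's note: all five issues share one root cause — __hash__, __eq__, and __del__ can re-enter the interpreter while C-level dict code holds borrowed pointers. The reentrancy itself is observable from pure Python; a minimal sketch (the `Key` class and `calls` list are illustrative, not from the report):

```python
# Record the user-defined callbacks that fire in the middle of building
# a dict literal: __hash__ runs on every insert, and a hash collision
# forces an __eq__ call on the already-stored key.
calls = []

class Key:
    def __hash__(self):
        calls.append("hash")   # runs while the dict insert is in progress
        return 13              # constant hash forces collisions
    def __eq__(self, other):
        calls.append("eq")     # runs during the collision probe
        return self is other

d = {Key(): 1, Key(): 2}       # two distinct keys, same hash

assert calls.count("hash") >= 2  # hashed on each insert
assert "eq" in calls             # collision re-entered Python code
assert len(d) == 2               # keys are unequal, so both survive
```

The PoCs above exploit exactly these hooks to mutate the dict (or the pair being read) at a moment when the C code still holds raw pointers into it.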
Ping. The built-in dict was considerably changed in bpo-27350; do any of these issues still persist? |
I cannot reproduce the segfaults for the first four issues; however, valgrind still reports problems for all but the second. The fifth (last) one still segfaults. I have a patch for the fifth issue. The other remaining issues all report the same invalid read at exit. I'm not sure whether that is the same issue reported here or something else that they all trigger: Invalid read of size 8 |
Apologies: compiling Python with --with-pydebug, all of these issues are reproducible on head after all. Furthermore, while my patch fixes the reported crash, it still crashes on exit: Program received signal SIGSEGV, Segmentation fault. |
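Editor's note: for contrast with the crashes above, a dict whose size changes during plain Python-level iteration is caught by the interpreter's size check and raises RuntimeError instead of crashing. The reported bugs are dangerous precisely because the callbacks fire inside C code that has already read internal pointers. A sketch of the benign case:

```python
# Growing a dict while iterating over it at the Python level is detected:
# the iterator notices the size change on the next step and raises.
d = {1: "a", 2: "b"}
msg = ""
try:
    for k in d:
        d[k * 10] = "new"  # changes the dict's size mid-iteration
except RuntimeError as e:
    msg = str(e)

assert "changed size during iteration" in msg
assert len(d) == 3  # exactly one insertion happened before the error
```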
Fix for the first issue: with this fix there is no segfault or valgrind issue reported during execution or on exit. |
Fix for the second issue: with this fix there is no segfault or valgrind issue reported during execution or on exit. |
Ah, my first fix (for the fifth issue) was incomplete. Please see attached patch which I think correctly fixes the problem. |
Here is my patch for parts 3 and 4. |
Note that I think most or all of these issues apply to 2.7, and while I didn't do a proper check, I think the fixes also apply. |
0001-Issue-27945-fix-PyDict_MergeFromSeq2-use-after-free.patch: LGTM. I've checked it can be applied to the 2.7 and 3.5 branches and passes the tests.
0001-Issue-27945-fix-dictitems_contains-use-after-free.patch LGTM. |
0001-Issue-27945-fix-dictiter_iternextitem-use-after-free.patch LGTM and OK too. I want to commit the first three patches. |
I worry about performance. Additional increments/decrements of the reference counter can add significant overhead in critical places of the Python interpreter. Before committing any patches we should measure the performance loss. |
OK, I'll run the benchmarks this week. |
I ran performance 0.5.0 on Python 3.5:

$ ./venv/cpython3.5-846d5b1f0b61/bin/python -m perf compare_to py35-master.json py35-patched.json -G
Slower (14):
- logging_silent: 2.92 us +- 0.15 us -> 3.05 us +- 0.18 us: 1.04x slower
- pickle: 93.2 us +- 4.7 us -> 95.8 us +- 5.0 us: 1.03x slower
- python_startup: 72.6 ms +- 4.7 ms -> 74.6 ms +- 4.8 ms: 1.03x slower
- meteor_contest: 766 ms +- 13 ms -> 779 ms +- 15 ms: 1.02x slower
- scimark_sor: 2.00 sec +- 0.04 sec -> 2.04 sec +- 0.03 sec: 1.02x slower
- fannkuch: 3.86 sec +- 0.05 sec -> 3.91 sec +- 0.04 sec: 1.01x slower
- python_startup_no_site: 37.4 ms +- 2.4 ms -> 38.0 ms +- 2.6 ms: 1.01x slower
- regex_dna: 958 ms +- 13 ms -> 973 ms +- 16 ms: 1.01x slower
- regex_compile: 1.41 sec +- 0.02 sec -> 1.43 sec +- 0.02 sec: 1.01x slower
- raytrace: 5.46 sec +- 0.06 sec -> 5.52 sec +- 0.07 sec: 1.01x slower
- scimark_lu: 1.88 sec +- 0.09 sec -> 1.90 sec +- 0.12 sec: 1.01x slower
- nbody: 887 ms +- 19 ms -> 895 ms +- 21 ms: 1.01x slower
- 2to3: 3.01 sec +- 0.03 sec -> 3.03 sec +- 0.04 sec: 1.01x slower
- scimark_fft: 2.52 sec +- 0.05 sec -> 2.54 sec +- 0.04 sec: 1.01x slower

Faster (16):
Benchmark hidden because not significant (31): call_method, chameleon, crypto_pyaes, deltablue, dulwich_log, genshi_text, genshi_xml, go, hexiom, html5lib, json_dumps, json_loads, logging_format, logging_simple, nqueens, pathlib, pickle_dict, pickle_list, pickle_pure_python, pidigits, regex_effbot, regex_v8, richards, scimark_monte_carlo, spectral_norm, sympy_expand, unpack_sequence, unpickle, unpickle_list, unpickle_pure_python, xml_etree_iterparse |
The only patch that affects a hot loop is:

--- a/Objects/dictobject.c Tue Nov 15 21:21:35 2016 -0500
+++ b/Objects/dictobject.c Wed Nov 16 11:40:51 2016 +0000
@@ -1550,11 +1550,18 @@ PyDict_MergeFromSeq2(PyObject *d, PyObje
/* Update/merge with this (key, value) pair. */
key = PySequence_Fast_GET_ITEM(fast, 0);
value = PySequence_Fast_GET_ITEM(fast, 1);
+ Py_INCREF(key);
+ Py_INCREF(value);
if (override || PyDict_GetItem(d, key) == NULL) {
int status = PyDict_SetItem(d, key, value);
- if (status < 0)
+ if (status < 0) {
+ Py_DECREF(key);
+ Py_DECREF(value);
goto Fail;
+ }
}
+ Py_DECREF(key);
+ Py_DECREF(value);
Py_DECREF(fast);
Py_DECREF(item);
    }

Microbenchmark for it:

$ ~/local/py35/bin/master -m perf timeit --rigorous -t --python ~/local/py35/bin/patched --compare-to ~/local/py35/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict.fromkeys(L)'
Benchmark master
================

.........................................
Total duration: 21.2 sec
Number of runs: 41
Minimum: 3.41 ms (-13%)
Median +- std dev: 3.92 ms +- 0.19 ms

Benchmark patched
=================

.........................................
Total duration: 21.3 sec
Number of runs: 41
Minimum: 3.39 ms (-14%)
Median +- std dev: 3.92 ms +- 0.23 ms

Compare
=======

Median +- std dev: [master] 3.92 ms +- 0.19 ms -> [patched] 3.92 ms +- 0.23 ms: 1.00x slower |
Sorry, dict.fromkeys() doesn't use PyDict_MergeFromSeq2(). This may be a microbenchmark for the worst case:

$ ~/local/py35/bin/master -m perf timeit --rigorous --python ~/local/py35/bin/patched --compare-to ~/local/py35/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
master: ......................................... 2.06 ms +- 0.11 ms
patched: ......................................... 2.16 ms +- 0.09 ms

Median +- std dev: [master] 2.06 ms +- 0.11 ms -> [patched] 2.16 ms +- 0.09 ms: 1.05x slower

$ ~/local/py27/bin/master -m perf timeit --rigorous --python ~/local/py27/bin/patched --compare-to ~/local/py27/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'
master: ......................................... 1.48 ms +- 0.06 ms
patched: ......................................... 1.57 ms +- 0.09 ms

Median +- std dev: [master] 1.48 ms +- 0.06 ms -> [patched] 1.57 ms +- 0.09 ms: 1.06x slower |
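Editor's note: the dict(L) worst case above can be approximated with only the standard library, for readers without the third-party perf module installed (a sketch; absolute numbers will differ by machine):

```python
import timeit

# Approximate the dict(L) microbenchmark above with stdlib timeit
# instead of the perf module used in the thread.
L = [(i, i) for i in range(10000)]
seconds = timeit.timeit("dict(L)", globals={"L": L}, number=100)
per_call_ms = seconds / 100 * 1000

assert dict(L) == {i: i for i in range(10000)}  # sanity: same result
assert per_call_ms > 0
```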
I modified the patch to avoid the incref/decref when the pair is not a list, because a tuple is the common case. (python2.7 is the modified version of the patch.)

inada-n@test1:~/work/bench$ ~/local/py27/bin/master -m perf timeit --rigorous --python ~/local/py27/bin/python2.7 --compare-to ~/local/py27/bin/master -s 'L = [(i,i) for i in range(10000)]' -- 'dict(L)'

Median +- std dev: [master] 1.47 ms +- 0.06 ms -> [python2.7] 1.55 ms +- 0.06 ms: 1.05x slower
Median +- std dev: [patched] 1.56 ms +- 0.08 ms -> [python2.7] 1.55 ms +- 0.09 ms: 1.01x faster |
Here is a consolidated patch for review. |
The change to dict_equal() LGTM. It doesn't add overhead.

For dictiter_iternextitem() I propose a different change. It doesn't add overhead.

There are bugs in the patch for _PyDict_FromKeys().

The change to dictitems_contains() adds overhead, but it is small and seems unavoidable.

I wonder whether it is worth using PySequence_Tuple() instead of PySequence_Fast() in PyDict_MergeFromSeq2(). This would add the cost of creating a new tuple if the items are not tuples, but save increfing/decrefing the key and the value in the common case.

I have doubts about insertdict(). The additional incref duplicates increfs in dict_merge(). Is it worth moving it outside of insertdict()?

I have doubts about _PyDict_FromKeys(). It seems to me that it is safe to use _PyDict_Next() in a loop that mutates the dict (I haven't checked _PySet_Next()). With a guarded insertdict() the additional check is not needed (and it doesn't help against clearing the dict inside insertdict()).

The updated patch fixes some bugs and uses a different solution for dictiter_iternextitem(). |
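Editor's note on the PySequence_Tuple() idea above: the conversion is free when the item is already a tuple, because CPython returns the argument itself; only non-tuple pairs would pay for a copy. The same shortcut is observable at the Python level:

```python
# tuple() on an exact tuple returns the very same object (no copy),
# while a list must be copied into a new tuple.
t = (1, 2)
assert tuple(t) is t       # identity preserved: zero-cost conversion

l = [1, 2]
assert tuple(l) is not l   # a new tuple object had to be created
assert tuple(l) == (1, 2)
```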
LGTM. Performance on Azure VM (AMD Opteron(tm) Processor 4171 HE): $ ~/local/py36/bin/patched -m perf compare_to master.json patched.json -G
Slower (10):
- spectral_norm: 915 ms +- 17 ms -> 967 ms +- 25 ms: 1.06x slower
- nbody: 774 ms +- 28 ms -> 805 ms +- 22 ms: 1.04x slower
- regex_dna: 965 ms +- 18 ms -> 993 ms +- 22 ms: 1.03x slower
- go: 1.93 sec +- 0.05 sec -> 1.99 sec +- 0.06 sec: 1.03x slower
- python_startup_no_site: 29.7 ms +- 2.0 ms -> 30.5 ms +- 2.0 ms: 1.03x slower
- xml_etree_parse: 1.02 sec +- 0.03 sec -> 1.04 sec +- 0.04 sec: 1.02x slower
- python_startup: 52.7 ms +- 3.6 ms -> 53.7 ms +- 3.6 ms: 1.02x slower
- xml_etree_generate: 962 ms +- 24 ms -> 978 ms +- 25 ms: 1.02x slower
- pickle_dict: 188 us +- 8 us -> 191 us +- 8 us: 1.02x slower
- scimark_fft: 2.19 sec +- 0.04 sec -> 2.20 sec +- 0.05 sec: 1.00x slower

Faster (26):
Benchmark hidden because not significant (28): call_method_unknown, call_simple, dulwich_log, float, genshi_text, hexiom, html5lib, json_dumps, json_loads, logging_format, logging_silent, logging_simple, pathlib, pickle, pickle_list, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_integrate, sympy_str, sympy_sum, tornado_http, unpickle_list, xml_etree_process |
Where do we stand on this issue? At the moment, 3.6.0 is going to be released without these fixes. |
This issue is severe, but I don't consider it release critical for 3.6.0. The patch fixes segfaults, but it can add unneeded overhead, and dict performance is critical for the Python core. The segfaults are not new. I'm trying to minimize the overhead of the changes, but this shouldn't be rushed. |
Thank you for your patches, Duane and Tim. And thank you for your very detailed report, tehybel, which allowed writing these patches. I excluded the changes in dict.fromkeys() since they look unnecessary for this issue after fixing insertdict(). There are other reasons for adding those changes (this makes the behavior with exact dicts and sets and their subclasses more consistent), but that is a different issue. There are also reasons for not adding them. We also need to check whether the set implementation is vulnerable to similar issues. |
Since Serhiy created backport PRs for 3.4 and 3.3, I'm reopening the issue and marking it as Release Blocker (for those releases) so we don't lose track of them and agree they meet the criteria for security-fix-only releases. |
For context see bpo-30484. If the segfault is caused by garbage collection, this is perhaps not an exploitable vulnerability, but it is a severe bug that can affect any multithreaded application and has no workaround. |