Use-after-free by mutating set during set operations #90773
Maybe related to https://bugs.python.org/issue8420

Somewhat obscure, but using only standard Python, and no frame- or gc-hacks, it looks like we can get a use-after-free:

from random import random

BADNESS = 0.0

class Bad:
    def __eq__(self, other):
        if random() < BADNESS:
            set1.clear()
        if random() < BADNESS:
            set2.clear()
        return True
    def __hash__(self):
        return 42

SIZE = 100
TRIALS = 10_000

ops = [
    "|", "|=",
    "==", "!=",
    "<", "<=",
    ">", ">=",
    # "&",  # crash!
    # "&=", # crash!
    "^",
    # "^=", # crash
    # "-",  # crash
    "-=",
]

for op in ops:
    stmt = f"set1 {op} set2"
    print(stmt, "...")
    for _ in range(TRIALS):
        BADNESS = 0.00
        set1 = {Bad() for _ in range(SIZE)}
        set2 = {Bad() for _ in range(SIZE)}
        BADNESS = 0.02
        exec(stmt)
    print("ok.")
Replacing the set operation with set1.isdisjoint(set2) also crashes.
The likely culprit is the set_next() loop. Perhaps it is never safe to use set_next(), because any lookup can call back into __eq__, which can mutate the set. Since the set_isdisjoint() method isn't a mutating method, that is the easiest place to start investigating. Try disabling the exact-set fast path to see if the issue persists.
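To make the callback path concrete, here is a minimal sketch showing that a non-mutating method such as isdisjoint() still runs user-defined __eq__ when hashes collide. The class name Colliding and the call counter are hypothetical, introduced only for illustration. The sketch deliberately does not mutate either set, so it should be safe to run on any build.

```python
class Colliding:
    """Hypothetical helper: every instance hashes alike, so a lookup must call __eq__."""

    calls = 0  # number of times __eq__ has been invoked

    def __hash__(self):
        return 42

    def __eq__(self, other):
        # A real crasher would clear one of the sets here; this sketch only counts.
        Colliding.calls += 1
        return True


set1 = {Colliding()}
set2 = {Colliding()}
Colliding.calls = 0  # ignore any comparisons made while building the sets

# isdisjoint() does not mutate anything itself, yet the lookup it performs
# hands control to Colliding.__eq__, which is arbitrary Python code.
disjoint = set1.isdisjoint(set2)
print("disjoint:", disjoint, "| __eq__ calls:", Colliding.calls)
```

Any code placed in that __eq__ runs in the middle of the C-level loop, which is why a clear() there can pull entries out from under it.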
Presumably _PyDict_Next is also suspect. Even the advertised "safe" calls to PyDict_SetItem() for existing keys would be a trigger. Calling clear() in either __eq__ or __hash__ would suffice. If the next loops are the culprit, the new challenge is figuring out how to fix it without wrecking code clarity and performance (and having to deprecate PyDict_Next(), which is part of the stable ABI).
Marking as low priority given that the next loop code has been deployed without incident for two decades (a little less for sets and a little more for dicts).
It looks like usages of the PyDict_Next API assume the resulting references are borrowed and so INCREF them. Usages of set_next do not, but should. It should hopefully be a straightforward fix of adding INCREF/DECREF pairs.
Raised the priority back to normal. I agree with Dennis's observation that PyDict_Next is safe, provided it's used as intended: it returns borrowed references, but to things that absolutely are legitimate at the time. In the presence of mutations, *what* it returns isn't defined at all, but I don't see a way for it to blow up (unless its caller screws up by believing it owns the references). It doesn't assume anything about the structure of the dict across calls.
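The borrowed-versus-owned distinction can be observed from pure Python with weakref. This is only an analogy for the C-level INCREF/DECREF discussion, not the actual fix, and it relies on CPython's immediate reference-counted deallocation. The class Item and all variable names are hypothetical.

```python
import weakref


class Item:
    """Hypothetical element type; plain instances are hashable and weak-referenceable."""


# Case 1: the set holds the only strong reference (like a borrowed pointer in C).
only_in_set = {Item()}
watcher = weakref.ref(next(iter(only_in_set)))
only_in_set.clear()               # the set drops its reference ...
gone = watcher() is None          # ... and on CPython the object is freed at once

# Case 2: the caller takes its own strong reference first (like Py_INCREF).
also_held = {Item()}
held = next(iter(also_held))      # our own reference keeps the object alive
watcher2 = weakref.ref(held)
also_held.clear()
alive = watcher2() is held        # still valid after the set was emptied

print("freed when only the set owned it:", gone)
print("alive when the caller owned a reference:", alive)
```

In C the first case has no safety net: code that keeps using the pointer after the set was cleared reads freed memory, which is the use-after-free reported here.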
Thanks, Dennis, for your report and PRs. Would you mind also analyzing the uses of _PySet_NextEntry(), PyDict_Next() and _PyDict_Next()? Many of them look safe, but _pickle.c seems vulnerable.
It does look like there are some pickle situations that crash. Attached is a randomized crasher. I haven't done too much careful reasoning about it, but adding INCREFs everywhere seems to fix most of the issues.
The original issue is resolved. I opened #92930 about pickling dicts.