Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use-after-free by mutating set during set operations #90773

Closed
sweeneyde opened this issue Feb 2, 2022 · 14 comments
Closed

Use-after-free by mutating set during set operations #90773

sweeneyde opened this issue Feb 2, 2022 · 14 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@sweeneyde
Copy link
Member

BPO 46615
Nosy @tim-one, @rhettinger, @serhiy-storchaka, @miss-islington, @sweeneyde
PRs
  • bpo-46615: Don't crash when set operations mutate the sets #31120
  • [3.10] bpo-46615: Don't crash when set operations mutate the sets (GH-31120) #31284
  • [3.9] bpo-46615: Don't crash when set operations mutate the sets (GH-31120) #31312
  • Files
  • picklecrasher.py: Randomized crasher for uses of PyDict_Next in _pickle.c
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-02-02.18:01:22.729>
    labels = ['interpreter-core', '3.10', '3.9', 'type-crash', '3.11']
    title = 'Use-after-free by mutating set during set operations'
    updated_at = <Date 2022-02-19.05:24:06.142>
    user = 'https://github.com/sweeneyde'

    bugs.python.org fields:

    activity = <Date 2022-02-19.05:24:06.142>
    actor = 'Dennis Sweeney'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2022-02-02.18:01:22.729>
    creator = 'Dennis Sweeney'
    dependencies = []
    files = ['50631']
    hgrepos = []
    issue_num = 46615
    keywords = ['patch']
    message_count = 13.0
    messages = ['412386', '412387', '412389', '412444', '412448', '412487', '412489', '412494', '413085', '413097', '413175', '413176', '413530']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'rhettinger', 'serhiy.storchaka', 'miss-islington', 'Dennis Sweeney']
    pr_nums = ['31120', '31284', '31312']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue46615'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @sweeneyde
    Copy link
    Member Author

    Maybe related to https://bugs.python.org/issue8420

    Somewhat obscure, but using only standard Python, and no frame- or gc-hacks, it looks like we can get a use-after-free:

    from random import random
    
    BADNESS = 0.0
    
    class Bad:
        def __eq__(self, other):
            if random() < BADNESS:
                set1.clear()
            if random() < BADNESS:
                set2.clear()
            return True
        def __hash__(self):
            return 42
    
    SIZE = 100
    TRIALS = 10_000
    
    ops = [
        "|", "|=",
        "==", "!=",
        "<", "<=",
        ">", ">=",
        # "&",  # crash!
        # "&=", # crash!
        "^",
        # "^=", # crash
        # "-", # crash
        "-=",
    ]
    
    for op in ops:
        stmt = f"set1 {op} set2"
        print(stmt, "...")
        for _ in range(TRIALS):
            BADNESS = 0.00
            set1 = {Bad() for _ in range(SIZE)}
            set2 = {Bad() for _ in range(SIZE)}
            BADNESS = 0.02
            exec(stmt)
        print("ok.")

    @sweeneyde sweeneyde added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Feb 2, 2022
    @sweeneyde
    Copy link
    Member Author

    replacing return True with return random() < 0.5 makes all of the operations crash, except for | and |=.

    @sweeneyde sweeneyde changed the title Segfault in set intersection (&) and difference (-) Use-after-free by mutating set during set operations Feb 2, 2022
    @sweeneyde sweeneyde changed the title Segfault in set intersection (&) and difference (-) Use-after-free by mutating set during set operations Feb 2, 2022
    @sweeneyde
    Copy link
    Member Author

    set1.isdisjoint(set2) also crashes

    @rhettinger
    Copy link
    Contributor

    The likely culprit is the set_next() loop. Perhaps it is never safe to use set_next() because any lookup can callback to __eq__ which can mutate the set.

    Since set_isdisjoint() method isn't a mutating method, that is the easiest place to start investigating. Try disabling the exact set fast path to see if the issue persists.

    @rhettinger
    Copy link
    Contributor

    Presumably _PyDict_Next is also suspect. Even the advertised "safe" calls to PyDict_SetItem() for existing keys would be a trigger. Calling clear() in either __eq__ or __hash__ would suffice.

    If the next loops are the culprint, the new challenge is figuring out how to fix it without wrecking code clarity and performance (and having to deprecate PyDict_Next() which is part of the stable ABI).

    @rhettinger
    Copy link
    Contributor

    Marking as low priority given that ehe next loop code has been deployed without incident for two decades (a little less for sets and a little more for dicts).

    @sweeneyde
    Copy link
    Member Author

    It looks like usages of the PyDict_Next API assume the resulting references are borrowed and so INCREF them.

    Usages of set_next do not, but should.

    It should hopefully be a straightforward fix of adding INCREF/DECREFs.

    @tim-one
    Copy link
    Member

    tim-one commented Feb 4, 2022

    Raised the priority back to normal.

    I agree with Dennis's observation that PyDict_Next is safe, provided it's used as intended: it returns borrowed references, but to things that absolutely are legitimate at the time. In the presence of mutations, *what* it returns isn't defined at all, but I don't see a way for it to blow up (unless its caller screws up by believing it owns the references). It doesn't assume anything about the structure of the dict across calls.

    @sweeneyde
    Copy link
    Member Author

    New changeset 4a66615 by Dennis Sweeney in branch 'main':
    bpo-46615: Don't crash when set operations mutate the sets (GH-31120)
    4a66615

    @miss-islington
    Copy link
    Contributor

    New changeset 1f5fe99 by Miss Islington (bot) in branch '3.10':
    bpo-46615: Don't crash when set operations mutate the sets (GH-31120)
    1f5fe99

    @serhiy-storchaka
    Copy link
    Member

    New changeset c31b8a9 by Dennis Sweeney in branch '3.9':
    bpo-46615: Don't crash when set operations mutate the sets (GH-31120) (GH-31312)
    c31b8a9

    @serhiy-storchaka
    Copy link
    Member

    Thanks Dennis for your report and PRs.

    Do you mind to analyze also uses of _PySet_NextEntry(), PyDict_Next() and _PyDict_Next()? Many of them look safe, but _pickle.c seems vulnerable.

    @sweeneyde
    Copy link
    Member Author

    It does look like there are some pickle situations that crash. Attached is a randomized crasher. I haven't done too much careful reasoning about it, but adding INCREFs everywhere seems to fix most of the issues.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @sweeneyde
    Copy link
    Member Author

    The original issue is resolved. I opened #92930 about picking dicts.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants