C unpickling bypasses import thread safety #78753

Closed
tjb900 mannequin opened this issue Sep 3, 2018 · 13 comments
Labels
3.7 (EOL): end of life; 3.8: only security fixes; extension-modules: C modules in the Modules dir; type-bug: An unexpected behavior, bug, or error

Comments

@tjb900
Mannequin

tjb900 mannequin commented Sep 3, 2018

BPO 34572
Nosy @brettcannon, @ncoghlan, @pitrou, @avassalotti, @ericsnowcurrently, @tjb900, @gstarnberger
PRs
  • bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules #9047
  • [3.7] bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules (GH-9047) #11921
Files
  • reproducer_submit.py
  • reproduce_34572.py
  • find_class_deadlock.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2019-02-18.15:53:12.514>
    created_at = <Date 2018-09-03.15:53:21.527>
    labels = ['extension-modules', '3.8', 'type-bug', '3.7']
    title = 'C unpickling bypasses import thread safety'
    updated_at = <Date 2019-11-21.19:24:51.077>
    user = 'https://github.com/tjb900'

    bugs.python.org fields:

    activity = <Date 2019-11-21.19:24:51.077>
    actor = 'Valentyn Tymofieiev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-02-18.15:53:12.514>
    closer = 'pitrou'
    components = ['Extension Modules']
    creation = <Date 2018-09-03.15:53:21.527>
    creator = 'tjb900'
    dependencies = []
    files = ['47784', '48715', '48736']
    hgrepos = []
    issue_num = 34572
    keywords = ['patch', 'needs review']
    message_count = 13.0
    messages = ['324528', '327729', '335023', '335027', '335096', '335098', '335099', '335842', '335844', '335845', '356641', '356697', '357200']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'ncoghlan', 'pitrou', 'alexandre.vassalotti', 'eric.snow', 'tjb900', 'Valentyn Tymofieiev', 'gst']
    pr_nums = ['9047', '11921']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue34572'
    versions = ['Python 3.7', 'Python 3.8']

    @tjb900
    Mannequin Author

    tjb900 mannequin commented Sep 3, 2018

    Retrieving and using a module directly from sys.modules (from C in this case) leads to a race condition where the module may still be in the middle of being imported on another thread and has not yet been fully initialised. For slow filesystems or large modules (e.g. numpy) this seems to lead to easily reproducible errors (the attached code fails 100% of the time on my work machine, which runs CentOS 7).

    I believe modules have to be in sys.modules during this phase due to the possibility of circular references.

    importlib handles this carefully with locking, but _pickle.c bypasses all that, leading to issues with threaded code that uses pickling, e.g. dask/distributed.
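The race described above can be illustrated from pure Python. The following is a minimal sketch, not the attached reproducer: it assumes a throwaway module (the name `slow_demo_module` and its 0.5 s sleep are made up for illustration) and compares a direct `sys.modules` lookup with a lookup through `importlib.import_module` while another thread is still executing the module body.

```python
import importlib
import os
import sys
import tempfile
import threading
import time

MODULE_NAME = "slow_demo_module"  # hypothetical name, for illustration only
# The sleep stands in for a slow filesystem or a large module.
MODULE_SOURCE = "import time\ntime.sleep(0.5)\nVALUE = 42\n"


def run_demo():
    """Return (attribute_visible_early, value_seen_via_import)."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
            f.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importer = threading.Thread(
                target=importlib.import_module, args=(MODULE_NAME,)
            )
            importer.start()

            # Wait until the importing thread has registered the module.
            deadline = time.monotonic() + 5.0
            while MODULE_NAME not in sys.modules:
                if time.monotonic() > deadline:
                    raise RuntimeError("module never appeared in sys.modules")
                time.sleep(0.001)

            # Direct lookup takes no lock: the module object exists, but
            # its body is still sleeping, so VALUE is not defined yet.
            raw = sys.modules[MODULE_NAME]
            seen_early = hasattr(raw, "VALUE")

            # Going through the import system blocks on the per-module
            # lock until the other thread has finished initialising it.
            value = importlib.import_module(MODULE_NAME).VALUE
            importer.join()
            return seen_early, value
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)


if __name__ == "__main__":
    print(run_demo())
```

The direct lookup is the pattern the report attributes to _pickle.c; the second lookup is the behaviour importlib provides through its locking.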

    @tjb900 tjb900 mannequin added type-crash A hard crash of the interpreter, possibly with a core dump 3.7 (EOL) end of life 3.8 only security fixes extension-modules C modules in the Modules dir labels Sep 3, 2018
    @tjb900
    Mannequin Author

    tjb900 mannequin commented Oct 15, 2018

    Hi! Just wondering if there is anything I can do to move this along?

    @pitrou
    Member

    pitrou commented Feb 7, 2019

    Interesting. I could not reproduce this here, even by throwing Pandas into the mix and spawning 1024 workers...

    @ericsnowcurrently
    Member

    Perhaps PyImport_GetModule() should acquire-release the module's lock before the lookup? This would effectively be a call to _lock_unlock_module() in importlib._bootstrap.

    The alternative is to encourage using PyImport_Import() instead, like the PR has done. In that case the docs for PyImport_GetModule() should make it clear that it is not guaranteed that the module is fully imported yet (and recommend using PyImport_Import() for the guarantee).

    Either way there should be a new issue for the more general change (and it should reference this issue).
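At the Python level, the second alternative (going through the import system instead of reading sys.modules) can be sketched as follows. This is an illustration only, not the code from the PR: `ImportingUnpickler` is a hypothetical subclass that overrides the documented `pickle.Unpickler.find_class` hook so that every global is resolved via `importlib.import_module`.

```python
import collections
import importlib
import io
import pickle


class ImportingUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that always resolves globals via an import."""

    def find_class(self, module, name):
        # import_module takes the per-module import lock, so it only
        # returns once the module has been fully initialised.
        obj = importlib.import_module(module)
        # Protocol 4 and later may pass a dotted qualified name.
        for part in name.split("."):
            obj = getattr(obj, part)
        return obj


def loads(data):
    """Unpickle `data` using the import-based lookup."""
    return ImportingUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    payload = pickle.dumps(collections.OrderedDict(a=1))
    print(loads(payload))
```

Like the default `find_class`, this sketch places no restriction on which globals may be loaded, so it is no safer for untrusted data.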

    @ericsnowcurrently ericsnowcurrently added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Feb 7, 2019
    @pitrou
    Member

    pitrou commented Feb 8, 2019

    I agree that more generally PyImport_GetModule() should be fixed. But it should be done carefully so as not to lose the performance benefit of doing it. I think we should open a separate issue about that.

    PS: one possibility is to reuse the optimization already done in PyImport_ImportModuleLevelObject():

            /* Optimization: only call _bootstrap._lock_unlock_module() if
               __spec__._initializing is true.
               NOTE: because of this, initializing must be set *before*
               stuffing the new module in sys.modules.
             */
            spec = _PyObject_GetAttrId(mod, &PyId___spec__);
            if (_PyModuleSpec_IsInitializing(spec)) {
                PyObject *value = _PyObject_CallMethodIdObjArgs(interp->importlib,
                                                &PyId__lock_unlock_module, abs_name,
                                                NULL);
                if (value == NULL) {
                    Py_DECREF(spec);
                    goto error;
                }
                Py_DECREF(value);
            }
            Py_XDECREF(spec);
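For readers more comfortable with Python than C, the optimization quoted above can be approximated as below. This is a rough sketch under assumptions: `get_module_checked` is a hypothetical helper name, and `_initializing` and `_lock_unlock_module` are private implementation details of importlib._bootstrap (both named in the C snippet) that may change between versions.

```python
import sys
from importlib import _bootstrap  # private module; implementation detail


def get_module_checked(name):
    """Return sys.modules[name], waiting first if it is mid-import.

    Returns None when the module is not in sys.modules at all.
    """
    module = sys.modules.get(name)
    if module is None:
        return None
    spec = getattr(module, "__spec__", None)
    # Only pay for the lock when the spec says the import is in progress.
    if getattr(spec, "_initializing", False):
        # Acquire and immediately release the per-module import lock,
        # which blocks until the importing thread has finished.
        _bootstrap._lock_unlock_module(name)
    return module


if __name__ == "__main__":
    print(get_module_checked("sys") is sys)
```

The fast path is a dictionary lookup plus two attribute reads, which is the performance benefit the comment above wants to preserve.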

    @pitrou
    Member

    pitrou commented Feb 8, 2019

    Opened bpo-35943 for PyImport_GetModule().

    @ericsnowcurrently
    Member

    Thanks, Antoine.

    @pitrou
    Member

    pitrou commented Feb 18, 2019

    New changeset 4371c0a by Antoine Pitrou (tjb900) in branch 'master':
    bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules (GH-9047)
    4371c0a

    @pitrou
    Member

    pitrou commented Feb 18, 2019

    New changeset 3129432 by Antoine Pitrou (Miss Islington (bot)) in branch '3.7':
    bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules (GH-9047) (GH-11921)
    3129432

    @pitrou
    Member

    pitrou commented Feb 18, 2019

    This is pushed to 3.7 and master now. Thank you Tim for the report and the fix!

    @pitrou pitrou closed this as completed Feb 18, 2019
    @gstarnberger
    Mannequin

    gstarnberger mannequin commented Nov 15, 2019

    For this issue only 3.7 and 3.8 are listed as affected versions, but it appears to be also reproducible on the latest 3.5 and 3.6 releases. I've attached a script that can be used to reproduce the issue on those earlier releases (it consistently fails for me with values of 50 or higher as command line argument).
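A stress test along these lines might look like the sketch below. It is not the attached script: the module name `slow_pickle_target`, its 0.2 s sleep, and the hand-written protocol 0 payload are assumptions chosen to keep the example self-contained. On an interpreter that includes the fix, every thread is expected to succeed; on an affected interpreter, some threads may fail with an AttributeError.

```python
import importlib
import os
import pickle
import sys
import tempfile
import threading

MODULE_NAME = "slow_pickle_target"  # hypothetical module name
MODULE_SOURCE = "import time\ntime.sleep(0.2)\nVALUE = 42\n"
# Protocol 0 pickle: GLOBAL opcode ('c') with module and attribute name,
# followed by STOP ('.'). Unpickling it looks up MODULE_NAME.VALUE.
PAYLOAD = b"c" + MODULE_NAME.encode("ascii") + b"\nVALUE\n."


def stress(num_threads):
    """Unpickle PAYLOAD from num_threads threads at once."""
    results, errors = [], []
    barrier = threading.Barrier(num_threads)

    def worker():
        barrier.wait()  # release all threads at the same moment
        try:
            results.append(pickle.loads(PAYLOAD))
        except Exception as exc:  # collect failures instead of crashing
            errors.append(exc)

    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
            f.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            threads = [threading.Thread(target=worker)
                       for _ in range(num_threads)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return results, errors


if __name__ == "__main__":
    ok, failed = stress(50)
    print(len(ok), "succeeded;", len(failed), "failed")
```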

    @brettcannon
    Member

    3.6 and 3.5 are in security mode, so unless there's a security risk due to this bug the fix will not be backported to those older versions (https://devguide.python.org/#status-of-python-branches).

    @ValentynTymofieiev
    Mannequin

    ValentynTymofieiev mannequin commented Nov 21, 2019

    While investigating [1], I observed that certain unpickling operations, for example Unpickler.find_class, are still not thread-safe in Python 3.7.5 and the earlier versions that I tried. I have not tried 3.8, but I cannot reproduce this error on Python 2.

    For example, the attached find_class_deadlock.py consistently fails with a deadlock on Python 3.7.5.

    I opened https://bugs.python.org/issue38884, which may be causing these errors, since the failure mode is similar, and in at least some codepaths, find_class seems to be calling __import__ [2].

    [1] https://issues.apache.org/jira/browse/BEAM-8651
    [2] __import__(module, level=0)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022