socket.setdefaulttimeout affecting multiprocessing Manager #50306

Closed
ryles mannequin opened this issue May 18, 2009 · 12 comments
Labels
stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

ryles mannequin commented May 18, 2009

BPO 6056
Nosy @JimJJewett
Files
  • mpsock.patch: make python2.7 mp sockets blocking
  • mp6056fix.patch: updated patch against 3.3b1(ish)
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-07-29.12:28:11.838>
    created_at = <Date 2009-05-18.17:48:59.835>
    labels = ['type-bug', 'library']
    title = 'socket.setdefaulttimeout affecting multiprocessing Manager'
    updated_at = <Date 2012-07-29.12:28:11.837>
    user = 'https://bugs.python.org/ryles'

    bugs.python.org fields:

    activity = <Date 2012-07-29.12:28:11.837>
    actor = 'sbt'
    assignee = 'jnoller'
    closed = True
    closed_date = <Date 2012-07-29.12:28:11.838>
    closer = 'sbt'
    components = ['Library (Lib)']
    creation = <Date 2009-05-18.17:48:59.835>
    creator = 'ryles'
    dependencies = []
    files = ['22376', '26497']
    hgrepos = []
    issue_num = 6056
    keywords = ['patch']
    message_count = 12.0
    messages = ['88040', '88045', '88064', '88078', '137830', '137860', '138415', '138489', '151968', '151982', '166292', '166567']
    nosy_count = 7.0
    nosy_names = ['jnoller', 'ryles', 'asksol', 'python-dev', 'sbt', 'underrun', 'Jim.Jewett']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue6056'
    versions = ['Python 3.1', 'Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']


    ryles mannequin commented May 18, 2009

    Terminal 1:
    Python 2.6.1 (r261:67515, Apr  2 2009, 18:25:55)
    [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from multiprocessing.managers import SyncManager
    >>> manager = SyncManager(authkey="mykey")
    >>> manager.start()
    >>> queue = manager.Queue()
    >>> import pickle
    >>> pickle.dump(queue, open("myqueue.pkl", "w"))
    >>>
    
    Terminal 2:
    Python 2.6.1 (r261:67515, Apr  2 2009, 18:25:55)
    [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import socket
    >>> socket.setdefaulttimeout(30)
    >>> import multiprocessing, pickle
    >>> multiprocessing.current_process().authkey = "mykey"
    >>> queue = pickle.load(open("myqueue.pkl"))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "python2.6/pickle.py", line 1370, in load
        return Unpickler(file).load()
      File "python2.6/pickle.py", line 858, in load
        dispatch[key](self)
      File "python2.6/pickle.py", line 1133, in load_reduce
        value = func(*args)
      File "python2.6/multiprocessing/managers.py", line 845, in RebuildProxy
        return func(token, serializer, incref=incref, **kwds)
      File "python2.6/multiprocessing/managers.py", line 894, in AutoProxy
        incref=incref)
      File "python2.6/multiprocessing/managers.py", line 700, in __init__
        self._incref()
      File "python2.6/multiprocessing/managers.py", line 749, in _incref
        conn = self._Client(self._token.address, authkey=self._authkey)
      File "python2.6/multiprocessing/connection.py", line 140, in Client
        answer_challenge(c, authkey)
      File "python2.6/multiprocessing/connection.py", line 376, in
    answer_challenge
        response = connection.recv_bytes(256)        # reject large message
    IOError: [Errno 11] Resource temporarily unavailable
    >>> 

    This works as expected without socket.setdefaulttimeout(). However, the
    timeout is useful: if the listening process in terminal 1 is suspended
    (e.g. with ^Z), it keeps the client from blocking indefinitely.

    I suspect the cause is similar to http://bugs.python.org/issue976613
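
    A minimal workaround sketch (mirroring the Terminal 2 session above, and
    assuming the EAGAIN comes from the proxy reconnect performed during
    unpickling): clear the module-wide default timeout just for that step and
    restore it afterwards.

        import socket, pickle, multiprocessing

        multiprocessing.current_process().authkey = "mykey"

        old_timeout = socket.getdefaulttimeout()
        socket.setdefaulttimeout(None)   # the proxy reconnects with a blocking socket
        try:
            queue = pickle.load(open("myqueue.pkl"))
        finally:
            socket.setdefaulttimeout(old_timeout)   # restore the timeout for other code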

    ryles (mannequin) added the stdlib (Python modules in the Lib dir) label on May 18, 2009

    jnoller mannequin commented May 18, 2009

    Pickling the queue and then unpickling it in a new process is something I
    never thought of. That's interesting in and of itself ;)


    ryles mannequin commented May 19, 2009

    Yeah, storing pickled queues in the file system makes for some easy IPC
    :) It wasn't a very original idea, I took the pickling comments in the
    documentation at face value:
    http://docs.python.org/library/multiprocessing.html#proxy-objects

    So, from what I can tell this issue is related to the mixing of standard
    python socket I/O with multiprocessing socket I/O, with state not being
    carried from the former to the latter.

    In multiprocessing/connection.py, SocketClient() creates a familiar
    python socket object which, when a default timeout has been set in the
    module, will be made non-blocking. In addition, the timeout is
    remembered in the socket object, and when calling socket.recv(), the
    function internal_select() will use this to perform the expected poll()
    or select().

    However, after a connection is established, SocketClient() will not use
    python's socket implementation any further, and instead pass its
    low-level socket descriptor to a multiprocessing Connection object. This
    object has its own specialized socket I/O implementation, which is not
    at all aware of the timeout previously associated with the socket. As a
    result no select/poll occurs and, due to the socket's non-blocking
    status, recv() may return EAGAIN immediately. I suspect this is what's
    happening.

    There might be a number of ways to make SocketClient() more
    timeout-friendly, but one possibility would be to simply check whether
    the Python socket has a timeout associated with it and, if so, use
    connection.poll() in addition to connection.recv(). There may be other
    places in the code where similar changes would be needed.

    You obviously have more experience with this code base so I'll be
    curious to see the outcome.
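
    A minimal sketch of the mechanism described above (relying on the CPython
    detail that a socket with a timeout keeps its file descriptor in
    non-blocking mode; the fcntl check assumes a POSIX platform):

        import fcntl, os, socket

        socket.setdefaulttimeout(30)        # any non-None default timeout
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

        flags = fcntl.fcntl(s.fileno(), fcntl.F_GETFL)
        print(bool(flags & os.O_NONBLOCK))  # True: the descriptor itself is non-blocking

        # socket.recv() compensates by running select()/poll() with the stored
        # timeout, but code that reads the raw descriptor directly (as the
        # multiprocessing Connection does) gets EAGAIN instead of blocking.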

    ryles (mannequin) added the type-bug (An unexpected behavior, bug, or error) label on May 19, 2009

    jnoller mannequin commented May 19, 2009

    Well; I'm pretty tapped out right now - I think your idea of checking to
    see if a timeout has been set elsewhere makes sense. If you have the time
    to put together a patch (with a unit test or three :)) I can review it.
    Might take me a bit of time to get to this.


    underrun mannequin commented Jun 7, 2011

    This should be higher priority, as one of the major benefits of the multiprocessing module is remote process management in a completely transparent manner. Socket timeouts are very important in this context, since blocking forever waiting for a connection is not always an option.

    The problem of not being able to use a default socket timeout for other purposes in combination with multiprocessing managers is definitely an issue, but making multiprocessing actually use the timeout itself if set would be a huge advantage.

    This might not be the place to ask for it, but it would make sense for manager objects to gain a timeout attribute to be used as a timeout for local or remote communications. At the very least, the manager.connect() method should accept a timeout argument.
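
    A purely hypothetical sketch of what that could look like (neither the
    timeout attribute nor the connect() timeout argument exists today):

        from multiprocessing.managers import BaseManager

        class QueueManager(BaseManager):
            pass

        m = QueueManager(address=('remote-host', 50000), authkey=b'mykey')
        # proposed, not implemented: give up after 30 seconds instead of
        # blocking forever if the remote manager is unreachable or suspended
        # m.connect(timeout=30)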


    jnoller mannequin commented Jun 7, 2011

    I agree, Derek; I think that would be a fine addition. However, we lack a patch and I don't currently have the bandwidth to add it.


    underrun mannequin commented Jun 16, 2011

    While having multiprocessing use a timeout would be great, I didn't really have the time to fiddle with the c code.

    Instead of using the socket timeout, I'm modifying all the sockets created by the socket module to have no timeout (and thus to be blocking) which makes the multiprocessing module 'immune' to the socket module's default timeout.

    For testing, I just run the test suite twice, once with the initial default socket timeout and once with a 60 second timeout. Nothing there failed with this issue.

    It is worth noting, however, that when using a default socket timeout, for some reason processes that have put data into a queue no longer block at exit waiting for the data to be consumed. I'm not sure if there is some additional cleanup code that uses sockets and might need to block, or whether whatever the issue was with blocking sockets simply isn't an issue with non-blocking ones.
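
    A rough sketch of the idea, with an illustrative helper name (this is not
    the actual patch or multiprocessing's real SocketClient()):

        import socket

        def _blocking_client_socket(address):
            # force blocking mode right after creation so a module-wide
            # default timeout cannot leak into multiprocessing's sockets
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(None)   # same effect as s.setblocking(True)
            s.connect(address)
            return s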


    underrun mannequin commented Jun 17, 2011

    I was wrong about the exit behavior of a process that has put to a queue -- it seems to behave as expected. I had been playing with a proxy to a queue rather than a queue itself (and when you put to a proxy, the process can exit right away because the actual put happens in the process that owns the queue).

    I think this works as intended, but let me know. Also, I haven't really exercised the tests that much, so that bit could use some review. It hasn't broken anything in any of my real-world tests, though.

    Also, have I mentioned that the multiprocessing module is amazing? Cause it is. I sort of feel like antigravity == multiprocessing ...


    underrun mannequin commented Jan 25, 2012

    Any chance this patch will be accepted (or at least reviewed) soon?

    underrun (mannequin) added the type-crash (A hard crash of the interpreter, possibly with a core dump) label and removed the type-bug (An unexpected behavior, bug, or error) label on Jan 25, 2012

    jimjjewett mannequin commented Jan 25, 2012

    The wording in 138415 suggested this patch was changing socket to not support timeouts -- which would be unacceptable.

    But the actual patch only seems to touch multiprocessing/connection.py -- a far more reasonable change.

    Unfortunately, the patch no longer applies to the development tip. I *think* the places you wanted to change are still there, and just moved.

    (1) Is it sufficiently clear that this is not-a-feature to justify a backport?

    (2) Are the problems already fixed by some of the other changes? (It doesn't look like it, but I'm not sure.)

    (3) Can you produce an updated patch? (The current tip is http://hg.python.org/cpython/file/fec45282dc28/Lib/multiprocessing/connection.py )

    (4) If I understand the intent, then s.setblocking(True) would be slightly more clear than s.settimeout(None), though that change obviously isn't essential.
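
    For what it's worth, the two calls leave a socket in the same blocking
    state; setblocking(True) just states the intent more directly:

        import socket

        s = socket.socket()
        s.settimeout(None)
        print(s.gettimeout())    # None -> blocking mode
        s.setblocking(True)
        print(s.gettimeout())    # still None, same blocking mode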


    underrun mannequin commented Jul 24, 2012

    Thanks, Jim, here is an updated patch.

    1. I feel like it is clearly not-a-feature. Currently 2.7 crashes if remote managers are used and socket.setdefaulttimeout is anything other than None. Crashing seems bad, and all this patch does is keep multiprocessing connection sockets blocking even if a default timeout is specified (so it maintains the current behavior rather than crashing).

    2. This problem is still evident on 2.7, 3.2 and 3.3 beta 1. This patch is against the current dev tip as of a few days ago.

    3. here it is!

    4. I agree that setblocking is more clear. I made the change.

    My test modifications cover the entire suite twice, once without a default timeout and once with. This may be excessive? I'm not sure where non-blocking sockets might pop up as an issue since there is C code that relies on blocking sockets and I haven't dug that deep.
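
    A hypothetical sketch of that doubled run (not the patch's actual test
    code): execute the same suite once with the stock default timeout and once
    with a 60 second one.

        import socket, unittest

        def run_suite_with_default_timeout(suite, timeout):
            old = socket.getdefaulttimeout()
            socket.setdefaulttimeout(timeout)
            try:
                return unittest.TextTestRunner().run(suite)
            finally:
                socket.setdefaulttimeout(old)

        # run_suite_with_default_timeout(suite, None)  # stock behaviour
        # run_suite_with_default_timeout(suite, 60)    # with a default timeout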


    python-dev mannequin commented Jul 27, 2012

    New changeset 4e85e4743757 by Richard Oudkerk in branch '2.7':
    Issue bpo-6056: Make multiprocessing use setblocking(True) on the sockets it uses.
    http://hg.python.org/cpython/rev/4e85e4743757

    New changeset 290f04722be3 by Richard Oudkerk in branch '3.2':
    Issue bpo-6056: Make multiprocessing use setblocking(True) on the sockets it uses.
    http://hg.python.org/cpython/rev/290f04722be3

    New changeset f03839401420 by Richard Oudkerk in branch 'default':
    Issue bpo-6056: Make multiprocessing use setblocking(True) on the sockets it uses.
    http://hg.python.org/cpython/rev/f03839401420
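
    In outline, the change amounts to something like the following paraphrase
    (not the literal diff; the helper name here is illustrative): sockets
    created by multiprocessing.connection are switched to blocking mode
    immediately, so socket.setdefaulttimeout() no longer affects them.

        import socket

        def _client_socket_sketch(address):
            family = socket.AF_INET      # the real code also handles AF_UNIX, etc.
            s = socket.socket(family, socket.SOCK_STREAM)
            s.setblocking(True)          # ignore any module-wide default timeout
            s.connect(address)
            return s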

    sbt (mannequin) closed this as completed on Jul 29, 2012
    sbt (mannequin) added the type-bug (An unexpected behavior, bug, or error) label and removed the type-crash (A hard crash of the interpreter, possibly with a core dump) label on Jul 29, 2012
    ezio-melotti transferred this issue from another repository on Apr 10, 2022