
multiprocessing.Queue.get is not getting all the items in the queue #67770

Closed
RobertoMartnez mannequin opened this issue Mar 4, 2015 · 5 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@RobertoMartnez
Mannequin

RobertoMartnez mannequin commented Mar 4, 2015

BPO 23582
Nosy @bitdancer, @applio
Files
  • mpqueuegetwrong.py: Example of the wrong behavior. The process should get 100 items but only a few can be retrieved.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2015-03-04.16:04:33.422>
    created_at = <Date 2015-03-04.10:15:48.547>
    labels = ['type-bug', 'invalid']
    title = 'multiprocessing.Queue.get is not getting all the items in the queue'
    updated_at = <Date 2015-03-04.21:33:34.856>
    user = 'https://bugs.python.org/RobertoMartnez'

    bugs.python.org fields:

    activity = <Date 2015-03-04.21:33:34.856>
    actor = 'Roberto Martínez'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-03-04.16:04:33.422>
    closer = 'r.david.murray'
    components = []
    creation = <Date 2015-03-04.10:15:48.547>
    creator = 'Roberto Martínez'
    dependencies = []
    files = ['38327']
    hgrepos = []
    issue_num = 23582
    keywords = []
    message_count = 5.0
    messages = ['237178', '237189', '237204', '237211', '237214']
    nosy_count = 5.0
    nosy_names = ['r.david.murray', 'sbt', 'davin', 'alexei.romanov', 'Roberto Martínez']
    pr_nums = []
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23582'
    versions = ['Python 2.7', 'Python 3.4']

    @RobertoMartnez
    Mannequin Author

    RobertoMartnez mannequin commented Mar 4, 2015

    We faced a bug in a project yesterday, and after a few hours of investigation we found a bad/undocumented behavior in multiprocessing.Queue.

    If you put one or more items in a queue and the items are large, there is a delay between when the put is executed and when the item is finally available in the queue. This is reasonable because of the underlying feeder thread and the pipe, but that is not the problem.

    The problem is that Queue.qsize() reports the number of items put, while Queue.empty() returns True and Queue.get() raises Empty.

    So, the only safe method to get all the items is as follows:

    from queue import Empty  # Queue.Empty on Python 2

    # Busy-wait until every item reported by qsize() has actually arrived.
    while q.qsize():
        try:
            item = q.get_nowait()
        except Empty:
            pass

    Which is not very nice.

    I attach a sample file reproducing the behavior: a single process puts 100 elements into a Queue and after that tries to get all of them. I tested on Python 2.7.9 and 3.4 with the same result (Python 3 seems a little faster, so you need to enlarge the items). Also, if you wait between gets, the process is able to retrieve all the items.
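
    For reference, a minimal sketch approximating the reproducer described above (the contents of mpqueuegetwrong.py are not shown in this thread, so the item size and the final drain step here are assumptions based on the comments below):

    import multiprocessing

    def main():
        q = multiprocessing.Queue()
        payload = "x" * 1000000  # large item; the exact size used in the attached file is unknown
        for _ in range(100):
            q.put(payload)

        retrieved = 0
        # empty() can report True before the feeder thread has flushed all of
        # the data through the pipe, so this loop can exit early.
        while not q.empty():
            q.get()
            retrieved += 1
        print("retrieved", retrieved, "of 100 items")

        # Fully drain what is left so the feeder thread can finish and the
        # process can exit cleanly.
        while q.qsize():
            q.get()

    if __name__ == "__main__":
        main()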

    @RobertoMartnez RobertoMartnez mannequin added the type-bug An unexpected behavior, bug, or error label Mar 4, 2015
    @bitdancer
    Member

    This is the documented behavior. "qsize: Return the approximate size of the queue. Note, qsize() > 0 doesn't guarantee that a subsequent get() will not block, nor will qsize() < maxsize guarantee that put() will not block."

    It looks like what you want here is get with a timeout, not get_nowait.
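
    A minimal sketch of that suggestion (the 0.1 second timeout is an arbitrary value chosen for illustration, not something from the issue):

    from queue import Empty  # Queue.Empty on Python 2

    items = []
    while True:
        try:
            # The timeout gives the feeder thread time to flush the next item
            # through the pipe before we give up.
            items.append(q.get(timeout=0.1))
        except Empty:
            break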

    @RobertoMartnez
    Mannequin Author

    RobertoMartnez mannequin commented Mar 4, 2015

    I think you misunderstood my explanation. My English is not very good, sorry.

    I think that my previously attached file (mpqueuegetwrong.py) shows it better than my words. In that file you can see that I am calling neither qsize nor get_nowait, only put and get. The behavior I am reporting relates only to those methods (and I think it is not documented).

    @bitdancer
    Member

    You are calling q.empty when draining the queue. That has the same documented caution. Since you are using get in that loop, in that case you *want* to be calling qsize, just like you do when you drain it at the end.
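
    A minimal sketch of the pattern being suggested here, assuming all the puts have already happened:

    # qsize() reflects put() calls immediately (it is not delayed by the feeder
    # thread), so it is a safe loop condition here; the blocking get() then
    # waits for each item's data to actually arrive.
    while q.qsize():
        item = q.get()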

    @RobertoMartnez
    Mannequin Author

    RobertoMartnez mannequin commented Mar 4, 2015

    That's not my point.

    To my understanding, when you put an item in the queue, the item *must* be available to get on the next call. So I think put should block until the item is really in the queue, so that when you call get it *always* returns and not only some of the time (depending on the item size).

    If this is not the expected behavior, I think there should be a warning about it in the put/get documentation.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022