Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

imaplib.IMAP4_stream subprocess is opened unbuffered but ignores short reads #61645

Closed
gpshead opened this issue Mar 17, 2013 · 10 comments
Closed
Labels
stdlib Python modules in the Lib dir

Comments

@gpshead
Copy link
Member

gpshead commented Mar 17, 2013

BPO 17443
Nosy @gpshead, @pitrou, @bitdancer
Files
  • imaplib-buff.patch: small patch to force bufffering imaplib popen streams
  • imaplib-bufsize.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-03-19.18:03:46.630>
    created_at = <Date 2013-03-17.05:27:19.240>
    labels = ['library']
    title = 'imaplib.IMAP4_stream subprocess is opened unbuffered but ignores short reads'
    updated_at = <Date 2013-03-19.18:03:46.629>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2013-03-19.18:03:46.629>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-03-19.18:03:46.630>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2013-03-17.05:27:19.240>
    creator = 'gregory.p.smith'
    dependencies = []
    files = ['29434', '29473']
    hgrepos = []
    issue_num = 17443
    keywords = ['patch']
    message_count = 10.0
    messages = ['184363', '184364', '184372', '184373', '184388', '184417', '184594', '184600', '184651', '184652']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'r.david.murray', 'python-dev', 'detrout']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue17443'
    versions = ['Python 3.2', 'Python 3.3', 'Python 3.4']

    @gpshead
    Copy link
    Member Author

    gpshead commented Mar 17, 2013

    imaplib.IMAP4_stream subprocess is opened unbuffered but ignores short reads when reading the message body. Depending on timing, message body size and kernel pipe buffer size and phase of the moon and whether you're debugging the thing or not... It can fail to read the entire message body before wrongly assuming it has and attempting to read the terminating b')\r\n' of the IMAP protocol.

    Bug discovered during a debugging session at the PyCon 2013 Python 3 Porting Clinic BOF.

    @gpshead
    Copy link
    Member Author

    gpshead commented Mar 17, 2013

    The error does not happen when running the same code under 2.7, despite the same default bufsize=0 subprocess behavior. This is likely due to differences in the Python 2.x old style io library when os.fdopen(fd, 'rb', bufsize) is used vs 3.x when io.open(fd, 'rb', bufsize) is used for Popen.stdout.

    One workaround is to add a non-zero bufsize to the subprocess.Popen call in imaplib.IMAP4_stream.

    I'm not sure if subprocess should be updated or if subprocess's docs on what it means for a pipe to be unbuffered (read(n) is a single syscall rather than a loop until n bytes or EOF) should be updated.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 17, 2013

    os.fdopen() in 2.x would always create a FILE*, and therefore inherit fread()'s semantics even in "unbuffered" mode. In 3.x, unbuffered I/O instead calls read() directly, and happily returns partial reads; this is by design.

    So, I guess imaplib should be fixed :-)

    @pitrou pitrou added the stdlib Python modules in the Lib dir label Mar 17, 2013
    @pitrou
    Copy link
    Member

    pitrou commented Mar 17, 2013

    I don't think there's any reason to open the subprocess in unbuffered mode (you aren't sharing the stdio streams with anyone else). Just be careful to call flush() on stdin before attempting to read any response from stdout.

    @gpshead
    Copy link
    Member Author

    gpshead commented Mar 17, 2013

    Yes imaplib can be fixed pretty easily and should use buffered IO regardless.

    I'm pondering if the default behavior of subprocess needs fixing as
    existing python 2.x code being ported to 3 doesn't expect this changed
    behavior of the PIPE file objects. It probably does.

    Thankfully my subprocess32 backport on 2.x doesn't suffer from the
    problem as it still uses os.fdopen.

    @detrout
    Copy link
    Mannequin

    detrout mannequin commented Mar 18, 2013

    So as a first stab at fixing this. I modified imaplib to wrap the process.stdin / process.stdout from with io.BufferedWriter / io.BufferedReader. I didn't use the TextIOWrapper as the imaplib wanted to work with the raw \r\n.

    The change seems to have fixed the problem I was having, I also checked out 82724:ef8ea052bcc4 and tried running "./python -m test -j3 " before and after the buffer wrapping and it didn't seem to trigger any test case failures.

    @detrout
    Copy link
    Mannequin

    detrout mannequin commented Mar 19, 2013

    After bumping into r.david.murray in the elevator I got the impression setting the bufsize argument to the Popen call would be a better idea.

    I found that BufferedReader/Writer were using a DEFAULT_BUFFER_SIZE set somewhere in the c part of io. To cut down on magic numbers, this imaplib patch imports that constant and uses it on the Popen call.

    It doesn't seem to introduce test failures and still fixes the imap desynchronization problem seen at the porting clinic.

    @gpshead
    Copy link
    Member Author

    gpshead commented Mar 19, 2013

    that patch looks good for imaplib.

    i'll follow up on the subprocess side of things to see if the default
    behavior should be changed to better match what happened in 2.7 (or if not:
    to make sure the change in behavior is sufficiently documented and not
    relied on elsewhere in the stdlib)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 19, 2013

    New changeset c5aacf9d1cdc by R David Murray in branch '3.2':
    bpo-17443: Fix buffering in IMAP4_stream.
    http://hg.python.org/cpython/rev/c5aacf9d1cdc

    New changeset 0baa65b3ef76 by R David Murray in branch '3.3':
    Merge: bpo-17443: Fix buffering in IMAP4_stream.
    http://hg.python.org/cpython/rev/0baa65b3ef76

    New changeset 4c6463b96a2c by R David Murray in branch 'default':
    Merge: bpo-17443: Fix buffering in IMAP4_stream.
    http://hg.python.org/cpython/rev/4c6463b96a2c

    @bitdancer
    Copy link
    Member

    Thanks, Diane, and expecially thanks for finding this and helping is track down the cause.

    We need better test infrastructure for imap...because this occurs only during string litteral reads, I decided that making a test for this with our current imap test infrastructure just wasn't worth it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants