Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

corrupted contents of stdout result from subprocess call under a trace #59210

Closed
techtonik mannequin opened this issue Jun 5, 2012 · 17 comments
Closed

corrupted contents of stdout result from subprocess call under a trace #59210

techtonik mannequin opened this issue Jun 5, 2012 · 17 comments
Labels
stdlib Python modules in the Lib dir

Comments

@techtonik
Copy link
Mannequin

techtonik mannequin commented Jun 5, 2012

BPO 15005
Nosy @amauryfa
Files
  • file2.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-03-22.13:07:43.036>
    created_at = <Date 2012-06-05.14:40:11.251>
    labels = ['library']
    title = 'corrupted contents of stdout result from subprocess call under a trace'
    updated_at = <Date 2013-03-22.22:18:17.322>
    user = 'https://bugs.python.org/techtonik'

    bugs.python.org fields:

    activity = <Date 2013-03-22.22:18:17.322>
    actor = 'techtonik'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-03-22.13:07:43.036>
    closer = 'amaury.forgeotdarc'
    components = ['Library (Lib)']
    creation = <Date 2012-06-05.14:40:11.251>
    creator = 'techtonik'
    dependencies = []
    files = ['25832']
    hgrepos = []
    issue_num = 15005
    keywords = []
    message_count = 17.0
    messages = ['162347', '162348', '162350', '162352', '162354', '162361', '162362', '162370', '162375', '178138', '184952', '184953', '184956', '184960', '184963', '184965', '185015']
    nosy_count = 2.0
    nosy_names = ['amaury.forgeotdarc', 'techtonik']
    pr_nums = []
    priority = 'normal'
    resolution = 'wont fix'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue15005'
    versions = ['Python 2.7']

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Jun 5, 2012

    This code dumps a lot of internal source code info when executed with trace as:

    python -m trace --trace file2.py

    ---[file2.py]
    import subprocess

    def ret():
      output = subprocess.check_output(['hg', 'id', '-nib'])
      print( output )
      print( output )
      print( output.strip() )
      print( output.strip().split() )
    
    ret()

    Normally, the last line of the output is:
    ['e67793ec2995+', '2162+', 'default']

    But during trace call it is:
    ['subprocess.py(1180):', '_dup2(errwrite,', '2)', '---', 'modulename:', 'subprocess,', 'funcname:', '_dup2', 'subprocess.py(1174):', 'if', 'a', '==', 'b:', 'subprocess.py(1176):', 'elif', 'a', 'is', 'not', 'None:', 'subprocess.py(1184):', 'closed', '=', '{', 'None', '}',
    ...
    etc.

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Jun 5, 2012

    The behavior repeats with PyPy 1.8.0, and doesn't repeat with Python 3.

    @techtonik techtonik mannequin added the stdlib Python modules in the Lib dir label Jun 5, 2012
    @amauryfa
    Copy link
    Member

    amauryfa commented Jun 5, 2012

    Python2 has a pure python implementation of subprocess, with separate calls to fork() and exec(); so the output of the subprocess contains the trace of the forked Python interpreter, until the exec() system call.
    Python3 has a C implementation: _posixsubprocess.fork_exec() won't call the Python interpreter between fork() and exec(), and no trace is emitted.

    Yes, Python3 is better :)

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Jun 5, 2012

    OMG. =) Is it possible to fix it somehow? Postpone output collection until the very exec() call? Or provide a different stream for collecting output?

    @amauryfa
    Copy link
    Member

    amauryfa commented Jun 5, 2012

    It's not possible to delay output collection: output starts being collected just after the call os.dup2(cp2write, 1) and before exec(), we need to os.close it. The trace module will already have emitted some lines.

    Process output by definition goes to the C stdout, but you could redirect sys.stdout (the Python one, used by print) to something else, like a StringIO so that trace.py does not pollute the subprocess output.

    In 3.2, suprocess.py states that the pure Python implementation (the one use in 2.7) is not thread safe. We could add that it's not reentrant or trace-friendly as well... this is not surprising IMO.

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Jun 5, 2012

    Is it possible to pause trace and resume after the exec call? Between some missing instructions from subprocess internals and traceability of the Python programs I'd choose the latter.

    It can be even more actual for people tracing program execution in the process of porting to Python 3.

    @amauryfa
    Copy link
    Member

    amauryfa commented Jun 5, 2012

    Sure, just save/restore the trace function around your calls to subprocess.
    But I fail to see how a Python2-only issue can affect a Python3 port.

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Jun 5, 2012

    The fix for saving/restoring trace function belong to subprocess module. Python2 only issue will be actual when you have to port Python2 only app where it works ok to the Python3 where it doesn't work even if it executes successfully.

    @amauryfa
    Copy link
    Member

    amauryfa commented Jun 5, 2012

    Sorry, what does not work in Python3?
    And how can the trace module help you?

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Dec 25, 2012

    The trace module helps to gather program flow statistics and see the differences in patterns for large systems when program evolves. In particular, components ported to Python 3 should still behave the same way on Python 2.

    Right now the behavior under the trace is broken.

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Mar 22, 2013

    This use case is actual for various kind of asynchronous operations.

    @techtonik techtonik mannequin changed the title trace corrupts return result on chained execution corrupted output from subprocess call under a trace Mar 22, 2013
    @techtonik techtonik mannequin changed the title corrupted output from subprocess call under a trace corrupted contents of stdout result from subprocess call under a trace Mar 22, 2013
    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Mar 22, 2013

    I can't repeat this on Windows. Looks like it is a Linux issue, because of forks.

    @amauryfa
    Copy link
    Member

    So, does this page report an issue with Python2, or Python3?

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Mar 22, 2013

    I can not repeat this neither on Python 2.7 nor on Python 3 on Windows Vista. Need to run this on a Linux system to confirm.

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Mar 22, 2013

    Python 3 is not affected. Python 2.7, Linux only.

    @amauryfa
    Copy link
    Member

    So the issue can be closed:

    • subprocess was rewritten in C on Python3 for good reasons: it's wrong (and dangerous) to run Python code between fork() and exec(); this bug report is another example.

    • This change is too large to be merged to 2.7.

    • There is a backport available on pypi: https://pypi.python.org/pypi/subprocess32
      Yes it's a distinct module, but it's possible to install it in place of the standard one if needed::

        import sys
        import subprocess32 as subprocess
        sys.modules['subprocess'] = subprocess

    @techtonik
    Copy link
    Mannequin Author

    techtonik mannequin commented Mar 22, 2013

    And how to find all such issues for Python 2 that people need to be aware of in this tracker?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant