Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subprocess: support bytes program name (POSIX) #52759

Closed
vstinner opened this issue Apr 23, 2010 · 21 comments
Closed

subprocess: support bytes program name (POSIX) #52759

vstinner opened this issue Apr 23, 2010 · 21 comments
Labels
stdlib Python modules in the Lib dir topic-unicode

Comments

@vstinner
Copy link
Member

BPO 8513
Nosy @gpshead, @vstinner
Dependencies
  • bpo-8514: Add fsencode() functions to os module
  • bpo-8603: Create a bytes version of os.environ and getenvb()
  • Files
  • subprocess_bytes_program-5.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-05-18.17:25:07.128>
    created_at = <Date 2010-04-23.22:42:05.444>
    labels = ['library', 'expert-unicode']
    title = 'subprocess: support bytes program name (POSIX)'
    updated_at = <Date 2011-03-03.12:55:08.983>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2011-03-03.12:55:08.983>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-05-18.17:25:07.128>
    closer = 'vstinner'
    components = ['Library (Lib)', 'Unicode']
    creation = <Date 2010-04-23.22:42:05.444>
    creator = 'vstinner'
    dependencies = ['8514', '8603']
    files = ['17379']
    hgrepos = []
    issue_num = 8513
    keywords = ['patch', 'needs review']
    message_count = 21.0
    messages = ['104059', '104061', '104065', '104208', '104922', '105169', '105170', '105262', '105276', '105282', '105322', '105365', '105370', '105843', '105882', '105888', '105890', '105891', '105962', '105991', '129964']
    nosy_count = 3.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'Arfrever']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue8513'
    versions = ['Python 3.2']

    @vstinner
    Copy link
    Member Author

    While fixing bpo-8391, I realized that subprocess doesn't support bytes program name if it's not an absolute path:
    -------

    $ ./python
    >>> import subprocess
    >>> subprocess.call([b'echo'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 449, in call
        return Popen(*popenargs, **kwargs).wait()
      File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 681, in __init__
        restore_signals, start_new_session)
      File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1116, in _execute_child
        for exe in executable_list)
      File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1115, in <genexpr>
        executable_list = tuple(fs_encode(exe)
      File "/home/SHARE/SVN/py3k/Lib/subprocess.py", line 1114, in <genexpr>
        for dir in path_list)
      File "/home/SHARE/SVN/py3k/Lib/posixpath.py", line 75, in join
        if b.startswith(sep):
    TypeError: expected an object with the buffer interface
    [62826 refs]

    I'm working on a patch.

    @vstinner vstinner added stdlib Python modules in the Lib dir topic-unicode labels Apr 23, 2010
    @vstinner
    Copy link
    Member Author

    Attached patch (draft version) fixes this issue.

    @vstinner
    Copy link
    Member Author

    I proposed the creation of fs_encode() and fs_decode() functions in os.path: see bpo-8514. There functions would simplify my patch, but also make it correct on all OS (Windows, Mac, Linux).

    @vstinner
    Copy link
    Member Author

    My patch changes:

    • os._execvpe(): support bytes type for the file argument (program name)
    • os.get_exec_path(): support bytes type for the PATH environment variable
    • Popen._execute_child(): decode the executable name before encoding the arguments (if the program name is not an absolute path)

    @vstinner vstinner changed the title subprocess: support bytes program name subprocess: support bytes program name (POSIX) Apr 30, 2010
    @vstinner
    Copy link
    Member Author

    vstinner commented May 4, 2010

    Bytes program name problem should be splitted in two parts:

    (a) subprocess.call([b'env']) and subprocess.call([b'env'], env={'PATH': '/usr/bin'}): bytes program and unicode environ
    (b) bytes program and bytes environ

    Part (a)
    --------

    (a) would benefit from the creation of os(.path).fsencode() function (issue bpo-8514).

    Part (b)
    --------

    If the creation of os.environb is accepted (bpo-8603), I think that subprocess should also be modified to support pure bytes environ. Mix bytes and unicode variables in env would lead to mojibake and headaches.

    So I propose the creation of an "envb" (environment bytes) argument to Popen constructor. (b) would be written as:

       subprocess.call([b'env'], envb={b'PATH': b'/usr/bin'})

    or

       envb = os.environb.copy()
       envb[b'PATH'] = b'/usr/bin'
       subprocess.call([b'env'], envb=envb)

    (and it will not be possible to define env and envb at the same time)

    In this case, we will need a pure bytes version of os.get_exec_path(), and os._execvpe() should have a "bytes environment" mode (add an optional argument).

    --

    I have a patch fixing both problems, parts (a) and (b), but the patch depends on os.environb. I prefer to wait until os.environb is accepted to submit my patch.

    @vstinner
    Copy link
    Member Author

    vstinner commented May 6, 2010

    New patch (issue8513_partA.patch):

    • don't *decode*, only encode (str->bytes)
    • only patch os._execvpe() for POSIX

    @vstinner
    Copy link
    Member Author

    vstinner commented May 6, 2010

    If the creation of os.environb is accepted (bpo-8603), I think that
    subprocess should also be modified to support pure bytes environ.

    I fixed bpo-8603 and opened bpo-8640 for subprocess and envb.

    @gpshead
    Copy link
    Member

    gpshead commented May 8, 2010

    I think your partA patch makes sense.

    It would benefit from fsencode/fsdecode functions rather than manually doing the 'surrogateescape' thing everywhere.

    Also, could you add a unittest for os._execvpe to test its behavior?

    @vstinner
    Copy link
    Member Author

    vstinner commented May 8, 2010

    I think your partA patch makes sense.

    I can fix part A and B in two commits.

    It would benefit from fsencode/fsdecode functions rather
    than manually doing the 'surrogateescape' thing everywhere.

    I choosed to drop the idea of fsdecode() (patch for part A doesn't decode bytes anymore, it only encodes str). bpo-8514 has now a short and simple patch. I'm waiting for the final decision on bpo-8514 to commit the part A.

    Also, could you add a unittest for os._execvpe to test its behavior?

    os._execvpe() is a protected function. issue8513_partA.patch includes a test for subprocess. test_subprocess in two twices: with _posixsubprocess (C module) and with the pure Python implementation. The pure Python implementation calls os._execvpe(), that's why I patched also this function in my patch ;-)

    @vstinner
    Copy link
    Member Author

    vstinner commented May 8, 2010

    Update the patch to use os.fsencode().

    @gpshead
    Copy link
    Member

    gpshead commented May 8, 2010

    Build on the os._execvpe unittest I added in py3k r81001. Protected functions are perfectly fine things to unittest.

    @vstinner
    Copy link
    Member Author

    vstinner commented May 9, 2010

    Build on the os._execvpe unittest I added in py3k r81001.

    The test fails on Windows.

    ======================================================================
    FAIL: test_internal_execvpe (test.test_os.ExecTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\test\test_os.py", line 683, in test_internal_execvpe
        exec_stubbed.calls)
    AssertionError: Lists differ: [('execve', '/p/f', (['-a'], {... != [('execve', '/p\\f', (['-a'], ...

    First differing element 0:
    ('execve', '/p/f', (['-a'], {'spam': 'beans'}))
    ('execve', '/p\\f', (['-a'], {'spam': 'beans'}))

    • [('execve', '/p/f', (['-a'], {'spam': 'beans'})),
      ? ^

    + [('execve', '/p\\f', (['-a'], {'spam': 'beans'})),
    ? ^^

    • ('execve', '/pp/f', (['-a'], {'spam': 'beans'}))]
      ? ^

    + ('execve', '/pp\\f', (['-a'], {'spam': 'beans'}))]
    ? ^^

    @gpshead
    Copy link
    Member

    gpshead commented May 9, 2010

    my bad. hopefully r81019 fixes that.

    @vstinner
    Copy link
    Member Author

    New patch fixing this issue:

    • os.get_exec_path() type now depends on the OS: str on Windows, bytes on Unix
    • os.get_exec_path(None) uses os.environ on Windows, os.environb on Unix
    • os.get_exec_path(env) uses 'PATH' or b'PATH' key, but raise a ValueError if both keys exist
    • add os.supports_bytes_environ flag (boolean)
    • os._execvpe() and subprocess._execute_child() canonicalize the program to bytes
    • test "not path.supports_unicode_filenames" to check if fsencode() should be defined and used (instead of testing name != "nt")

    I'm not proud of the change on os.get_exec_path() result type, I'm not sure that it's the right thing to do. But at least, the patch works :-)

    @vstinner
    Copy link
    Member Author

    Oops, I forgot to add the new patch: subprocess_bytes_program-3.patch.

    @vstinner
    Copy link
    Member Author

    I asked on #python-dev about os.get_exec_path() result type. As expected, the answer was "It's a really bad idea".

    So here is a new version of my patch. Summary of the patch version 4:

    • subprocess.Popen() and os._execvpe() support bytes program name
    • os.get_exec_path() now supports b'PATH' key and bytes value
    • add os.supports_bytes_environ flag: "True if the native OS type of the environment is bytes (eg. False on Windows)"

    Changes since the version 3:

    • document the new os.supports_bytes_environ flag
    • os.get_exec_path() result type is always str (decode bytes to str using sys.getfilesystemencoding() + surrogateescape)
    • path.supports_unicode_filenames is False on Windows 9x but Windows 9x native type is unicode and so fsencode() should not be defined on this OS: revert the fsencode() test (if not path.supports_unicode_filenames => if name != 'nt')

    @vstinner
    Copy link
    Member Author

    I forgot to update test_os: patch version 5.

    @vstinner
    Copy link
    Member Author

    Ok, everything is ready for this issue: os.environb and os.fsencode() are commited, and test_os is prepared to test os._execvpe() change. I'm just waiting for a review.

    Execpt the change on os.get_exec_path(), the patch is now simple.

    @vstinner
    Copy link
    Member Author

    Hum, os.get_exec_path() has no test for the new features (support b'PATH' key and bytes value).

    @vstinner
    Copy link
    Member Author

    Fixed by r81291 + r81292 (py3k).

    The final commit contains much more tests ;-) I will watch the buildbot next hours and block the commit in 3.1.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 3, 2011

    Oh, I forget subprocess.call(b'ls'): command as a byte string => fixed in Python 3.3 (r88720).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    This issue was closed.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir topic-unicode
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants