[2.7] subprocess.call fails with unicode strings in command line #45239

mclausch · 2007-07-24T18:24:11Z

BPO	1759845
Nosy	@terryjreedy, @amauryfa, @tjguk, @Safihre
Files	CreateProcessW.patch Python-2.5.2-subprocess.patch: Alternate Python-only patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/tjguk'
closed_at = <Date 2010-08-04.02:48:10.483>
created_at = <Date 2007-07-24.18:24:11.000>
labels = ['type-feature', 'library']
title = '[2.7] subprocess.call fails with unicode strings in command line'
updated_at = <Date 2017-10-04.10:00:39.706>
user = 'https://bugs.python.org/mclausch'

bugs.python.org fields:

activity = <Date 2017-10-04.10:00:39.706>
actor = 'vstinner'
assignee = 'tim.golden'
closed = True
closed_date = <Date 2010-08-04.02:48:10.483>
closer = 'terry.reedy'
components = ['Library (Lib)']
creation = <Date 2007-07-24.18:24:11.000>
creator = 'mclausch'
dependencies = []
files = ['9580', '11674']
hgrepos = []
issue_num = 1759845
keywords = ['patch']
message_count = 16.0
messages = ['32546', '32547', '32548', '63176', '74142', '87566', '87580', '87597', '87605', '112739', '112767', '112825', '112835', '112854', '113288', '303677']
nosy_count = 12.0
nosy_names = ['terry.reedy', 'amaury.forgeotdarc', 'gregcouch', 'andersjm', 'ocean-city', 'mclausch', 'brotch', 'tim.golden', 'kcwu', 'jnoller', 'xianyiteng', 'Safihre']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue1759845'
versions = ['Python 2.7']

mclausch · 2007-07-24T18:24:11Z

On Windows, subprocess.call() fails with an exception if either the executable or any of the arguments contains non-ASCII characters. See below:

>>> cmd = [ u'test_\xc5_exec.bat', u'arg1', u'arg2' ]
>>> subprocess.call(cmd)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python25\lib\subprocess.py", line 443, in call
    return Popen(*popenargs, **kwargs).wait()
  File "C:\Python25\lib\subprocess.py", line 593, in __init__
    errread, errwrite)
  File "C:\Python25\lib\subprocess.py", line 815, in _execute_child
    startupinfo)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc5' in position 5: ordinal not in range(128)

brotch · 2007-08-05T08:36:16Z

Python's default encoding is 'ascii', which cannot encode unicode characters above code point 127.

Explicitly encoding each unicode string as 'iso-8859-1', e.g.

subprocess.call([arg.encode('iso-8859-1') for arg in cmd])

resolves the problem and runs the correct command.
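
The failure and the workaround above can be reproduced without spawning a process. This is a minimal sketch, assuming Python 3 syntax (where every str is unicode) and a hypothetical helper named try_encode. It only mimics the implicit encoding step that 2.x performed and is not what subprocess.py itself does:

```python
# Python 3 sketch: perform by hand the encoding step that 2.x did implicitly.
# try_encode is a hypothetical helper, not part of subprocess.

name = "test_\xc5_exec.bat"


def try_encode(text, codec):
    """Return the encoded bytes, or the UnicodeEncodeError that was raised."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


ascii_result = try_encode(name, "ascii")        # an exception object
latin1_result = try_encode(name, "iso-8859-1")  # bytes, since U+00C5 fits

print(repr(ascii_result))
print(repr(latin1_result))
```

The 'ascii' attempt fails at index 5, matching the position reported in the traceback, while 'iso-8859-1' succeeds only because U+00C5 happens to exist in that character set.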

mclausch · 2007-08-20T21:12:53Z

Sorry, I should have been more specific. I'm looking for a general solution, not just one for characters in iso-8859-1. For instance, I need to execute a subprocess where the executable or the arguments may contain Japanese characters.

So another example would be:
cmd = [ u'test_\u65e5\u672c\u8a9e_exec.bat', u'arg1', u'arg2' ]
subprocess.call(cmd)
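
To see why a single-byte encoding is not a general answer for this second example, one can check which codecs are able to represent the name at all. This is a sketch under assumptions: Python 3 syntax, a hypothetical helper codecs_that_fit, and a candidate list chosen only for illustration ('utf-16-le' stands in for the wide strings that the W variants of the Windows API accept):

```python
# Python 3 sketch: which codecs can represent the Japanese file name?
# codecs_that_fit is a hypothetical helper; the candidate list is illustrative.

JAPANESE_NAME = "test_\u65e5\u672c\u8a9e_exec.bat"


def codecs_that_fit(text, candidates):
    """Return the candidate codecs that can encode text without error."""
    fitting = []
    for codec in candidates:
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue
        fitting.append(codec)
    return fitting


fits = codecs_that_fit(
    JAPANESE_NAME, ["ascii", "iso-8859-1", "cp932", "utf-16-le"]
)
print(fits)  # ['cp932', 'utf-16-le']
```

Only the Japanese code page and the Unicode encoding can hold the name, so any fix that routes it through 'ascii' or 'iso-8859-1' fails before the process is even created.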

ocean-city · 2008-03-02T07:58:14Z

I tried to fix this problem using CreateProcessW.
(environment variables are still ANSI)

I don't know the Python C API well, so maybe I'm doing
something wrong. (I confirmed that test_subprocess.py
passes.)

gregcouch · 2008-10-01T19:40:51Z

We're having the same problem. My quick fix was to patch subprocess.py
so the command line and executable are converted to the filesystem
encoding (mbcs).

kcwu · 2009-05-11T07:56:28Z

ocean-city's patch applies cleanly to trunk and it works for me.
Could anybody review and commit it? I can help if any refinement is required.

amauryfa · 2009-05-11T17:26:26Z

The first patch will introduce regressions for strings that cannot be
decoded with the filesystem encoding. It is necessary to provide a
fallback to the CreateProcessA function.

I'd prefer the python-only patch, except for the "sys=sys" argument to
the function. Is it really needed?

gregcouch · 2009-05-12T00:41:06Z

I like the C patch better. It only tries to decode non-unicode objects
with the filesystem (mbcs) encoding. This fits in with Python 3.0
perfectly where all strings are unicode. In 2.5, strings are assumed to
be in the mbcs encoding, to match the Windows ANSI API, so decoding
those with the mbcs encoding shouldn't alter the set of acceptable
strings (which is what the C patch is doing if I read the code correctly).

kcwu · 2009-05-12T03:33:21Z

There is a slight difference between the C and Python patches.
C version: converts mbcs arguments to unicode
py version: converts unicode arguments to mbcs

Actually, the Python patch may not work if the string is unicode and
cannot be encoded with mbcs. For example, my Windows system is Chinese
(cp950) and the program I want to execute contains Japanese characters.
Encoding Japanese characters with mbcs (in this case, cp950) will
fail. This is also what Matt (mclausch) said.

As for the C patch, I don't think a fallback is
necessary. If a string fails to convert from mbcs to unicode, it
would eventually fail inside CreateProcessA() anyway, because CreateProcessA
internally (since Win2k) tries to convert from mbcs to unicode and
calls CreateProcessW.
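
The two directions described above can be sketched side by side. This is not the code of either attached patch. It is a minimal Python 3 illustration with hypothetical helpers encode_args and decode_args, and 'cp1252' is used as a stand-in for the system ANSI code page because 'mbcs' is only available on Windows:

```python
# Python 3 sketch of the two conversion directions. Both helpers are
# hypothetical; "cp1252" is a stand-in for the Windows ANSI code page (mbcs).

ANSI = "cp1252"


def encode_args(args, ansi_encoding=ANSI):
    """Python-patch direction: text -> bytes in the ANSI code page."""
    return [a.encode(ansi_encoding) if isinstance(a, str) else a for a in args]


def decode_args(args, ansi_encoding=ANSI):
    """C-patch direction: bytes in the ANSI code page -> text."""
    return [a.decode(ansi_encoding) if isinstance(a, bytes) else a for a in args]


# Decoding leaves text arguments untouched, so names outside the code page
# survive and can be handed to a wide-character API.
wide = decode_args([b"test_\xc5.bat", "\u65e5\u672c\u8a9e.bat"])
print(wide)

# Encoding raises as soon as one argument is outside the code page.
try:
    encode_args(["\u65e5\u672c\u8a9e.bat"])
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```

The sketch shows why the two patches are not equivalent: the decoding direction never has to squeeze an arbitrary unicode string into a limited code page.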

terryjreedy · 2010-08-04T02:48:10Z

I fail to see why subprocess.call(cmd.encode('whatever')) is not a general solution. Auto-encoding strikes me as wrong. Someone who wants that should write their own wrapper. In any case, 2.7 is out and closed to new features, while 3.x fixes this and numerous other unicode issues.

kcwu · 2010-08-04T07:11:29Z

> I fail to see why subprocess.call(cmd.encode('whatever')) is not a general solution.

Because the 'whatever' encoding doesn't exist.

Assume cmd contains Japanese characters and my system is Chinese Windows. subprocess.call expects the argument to be encoded in mbcs, which is cp950. However, cp950 doesn't contain Japanese characters.

subprocess.call(cmd.encode('cp950')) will fail because cp950 doesn't contain Japanese characters.
subprocess.call(cmd.encode('cp932')) will fail because subprocess.call will either fail to decode the bytes or decode them incorrectly.
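
The second failure mode, bytes written in one code page and read back in another, can be illustrated in isolation. This is a sketch assuming Python 3 and a hypothetical helper reinterpret. It models only the encode/decode mismatch, not the actual behaviour of CreateProcessA:

```python
# Python 3 sketch: bytes produced with one code page, interpreted with another.
# reinterpret is a hypothetical helper used only for this illustration.


def reinterpret(text, written_as, read_as):
    """Encode text with one codec and decode the bytes with another.

    Returns the decoded string, or None if the bytes are not valid in the
    second codec.
    """
    raw = text.encode(written_as)
    try:
        return raw.decode(read_as)
    except UnicodeDecodeError:
        return None


original = "\u65e5\u672c\u8a9e"
seen = reinterpret(original, "cp932", "cp950")

print(seen == original)  # False
```

Whether the mismatch shows up as a decoding error or as different characters, the receiving side never sees the original name, which is the point made above.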

terryjreedy · 2010-08-04T15:56:03Z

Thanks for the simple explanation.

gregcouch · 2010-08-04T17:28:45Z

So Terry, can you reopen this bug then? It's not out of date.

terryjreedy · 2010-08-04T18:35:59Z

I will not reopen this now for the reasons I already stated after "In any case ...". To expand on that:

2.7 is in maintenance (bug-fix only) mode and I view this as a feature request. To persuade someone otherwise, quote some doc that clearly says subprocess should behave as requested. I nosy-ed Jesse Noller so he can contradict me if he wishes.
The underlying issue seems to be the use of limited encodings, which was and is being fixed as well as possible in 3.x. Since there has been no mention of this issue being a problem with subprocess in 3.1, I presume there is none. If there is, say so and I will reopen.

The discussion shows disagreement on both the goal and approach to change. I am dubious that there will be an acceptable general solution. Even if this is persuasively seen as a bug and there is a good patch, I am dubious that any of the current developers will want to spend the necessary time to properly review a workaround to an issue that was already fixed the right way in 3.x.

tjguk · 2010-08-08T17:26:02Z

To confirm the situation on 3.x: a unicode string with non-ascii-encodable characters is fine. The easy test here in the UK is a pound sign:

<code>
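import subprocess

FILENAME = "abc£.bat"

# The name cannot be encoded as ASCII...
try:
    FILENAME.encode("ascii")
except UnicodeEncodeError:
    print("not ASCII-encodable")

# ...but 3.x still creates and runs the batch file (Windows only).
with open(FILENAME, "w") as f:
    f.write("echo hello\n")

subprocess.call([FILENAME])

# "hello" output as expected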

</code>

So no action for 3.x. I'm sympathetic (in principle) to making a change to 2.7 but I haven't looked over the "competing" patches and assessed the ins-and-outs.

Safihre · 2017-10-04T09:12:44Z

Although this issue is very old, in case anyone else, like us, needs this functionality, I created a package that implements the proposed C fix.
https://pypi.python.org/pypi/subprocessww
Simply "import subprocessww" and Popen is patched. We tested it and it does the job pretty well; we haven't run into special situations yet.

We really want to upgrade our app to Python 3, but currently lack the manpower to go over our app line by line. It's not a simple 2to3 conversion, unfortunately.

mclausch mannequin added stdlib Python modules in the Lib dir labels Jul 24, 2007

terryjreedy closed this as completed Aug 4, 2010

terryjreedy added the type-feature A feature request or enhancement label Aug 4, 2010

tjguk assigned tjguk Aug 8, 2010

vstinner changed the title ~~subprocess.call fails with unicode strings in command line~~ [2.7] subprocess.call fails with unicode strings in command line Oct 4, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022
