Win32 shlex #1064

Merged
merged 7 commits into from

3 participants

@jstenar
Collaborator

Suggested more complete fix for issue #592, using ctypes to call a Windows function that does shlex-like splitting.

Had to comment out the unicode strings in test_arg_split to get the tests to run (see #1063).

@takluyver
Owner

Note that we've got modules in utils named _process_win32 and _process_posix. That's probably the place to put platform-specific logic like this.

IPython/utils/tests/test_process.py
((5 lines not shown))
def test_arg_split():
"""Ensure that argument lines are correctly split like in a shell."""
tests = [['hi', ['hi']],
[u'hi', [u'hi']],
['hello there', ['hello', 'there']],
- [u'h\N{LATIN SMALL LETTER A WITH CARON}llo', [u'h\N{LATIN SMALL LETTER A WITH CARON}llo']],
+# [u'h\N{LATIN SMALL LETTER A WITH CARON}llo', [u'h\N{LATIN SMALL LETTER A WITH CARON}llo']],
@takluyver Owner

I'd rather not comment out this test. I don't think it's essential to use a name escape: it should work with a \u escape.
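
As a small illustration of the point above (a sketch assuming Python 3 string literals, not code from this pull request), the named escape and the `\u` escape denote the same character, so the test can be kept with either spelling:

```python
import unicodedata

# Both literals contain U+01CE; only the escape spelling differs.
by_name = "h\N{LATIN SMALL LETTER A WITH CARON}llo"
by_code_point = "h\u01cello"

# The two spellings compare equal, so the test data is unchanged.
same_string = by_name == by_code_point

# Look up the official name of the escaped character as a cross-check.
char_name = unicodedata.name(by_code_point[1])
```

The `\u` form avoids the name lookup that the `\N{...}` form needs when the literal is compiled.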

@jstenar Collaborator
@takluyver
Owner

Do we need a pure-python fallback? Is there any situation in which all this ctypes stuff could fail?

@fperez
Owner

@takluyver, you're our unicode guru; is this good to go or does it need any further work?

@takluyver
Owner

Well, it doesn't really change anything on the posix side (I'm assuming Jörgen just copied and pasted the arg_split function for the posix version, I haven't checked it character-for-character). The windows ctypes stuff doesn't mean much to me.

@jstenar: I'd just like to clarify: is there any version of Windows, any locale, or other setting, under which it could fail? I know that ctypes code can cause segfaults if it goes wrong, and I can't test it here. Also, is there any way to access this functionality through a library like pywin32 - even if we have this as a fallback when that's not installed? I'm a bit wary of relying on ctypes code.

@jstenar
Collaborator

As far as I know, CommandLineToArgvW does not depend on the locale. It is, however, only available from Windows 2000 onward. I could add a try/except guard to catch the case where it is missing and fall back to the old posix implementation (by copying it, because I can't import _process_posix on Windows).

I could not find this function in pywin32.

@takluyver
Owner
@fperez
Owner

We're definitely not worrying about windows 98! XP and newer is more than enough of a cutoff, I think.

@jstenar
Collaborator

I moved the posix version of arg_split to _process_common and use that as a fallback in _process_win32 if CommandLineToArgvW is missing. That way we will have a fallback and not just crash.

@fperez
Owner

Great, thanks! @takluyver, this is looking pretty baked out then, right? If both you and @jstenar are happy with it, merge away! @jstenar, thanks for the work :)

ps - double-check that any issues supposed to be closed do get closed, I've seen github recently flake out and not close issues listed in commits.

@takluyver takluyver merged commit c72bbc7 into from
@takluyver
Owner

Great. Tested and merged. Thanks!

Commits on Nov 29, 2011
  1. Special version of arg_split for win32

    Jörgen Stenarson authored
  2. Fixing some errors in the tests

    Jörgen Stenarson authored
  3. Splitting tests for getoutput_quoted into all and non-win32

    Jörgen Stenarson authored
  4. Merge branch 'get_opt_quoted-test' into win32-shlex

    Jörgen Stenarson authored
  5. Moved arg_split to _process_win32.py and _process_posix.py.

    Jörgen Stenarson authored
Commits on Nov 30, 2011
  1. Create fallback with old arg_split, incase CommandLineToArgvW is missing.

    Jörgen Stenarson authored
  2. Moving posix version of arg_split to _process_common.

    Jörgen Stenarson authored
25 IPython/utils/_process_common.py
@@ -15,6 +15,7 @@
# Imports
#-----------------------------------------------------------------------------
import subprocess
+import shlex
import sys
from IPython.utils import py3compat
@@ -143,3 +144,27 @@ def getoutputerror(cmd):
return '', ''
out, err = out_err
return py3compat.bytes_to_str(out), py3compat.bytes_to_str(err)
+
+
+def arg_split(s, posix=False):
+ """Split a command line's arguments in a shell-like manner.
+
+ This is a modified version of the standard library's shlex.split()
+ function, but with a default of posix=False for splitting, so that quotes
+ in inputs are respected."""
+
+ # Unfortunately, python's shlex module is buggy with unicode input:
+ # http://bugs.python.org/issue1170
+ # At least encoding the input when it's unicode seems to help, but there
+ # may be more problems lurking. Apparently this is fixed in python3.
+ is_unicode = False
+ if (not py3compat.PY3) and isinstance(s, unicode):
+ is_unicode = True
+ s = s.encode('utf-8')
+ lex = shlex.shlex(s, posix=posix)
+ lex.whitespace_split = True
+ tokens = list(lex)
+ if is_unicode:
+ # Convert the tokens back to unicode.
+ tokens = [x.decode('utf-8') for x in tokens]
+ return tokens
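
The effect of the `posix=False` default described in the docstring above can be sketched as follows. This is a minimal illustration assuming Python 3, where the unicode encode/decode workaround is not needed; `split_like_shell` is a hypothetical name, not part of IPython:

```python
import shlex


def split_like_shell(line, posix=False):
    """Split on whitespace with shlex, keeping quoted runs together."""
    lex = shlex.shlex(line, posix=posix)
    lex.whitespace_split = True  # split only on whitespace, not punctuation
    return list(lex)


line = 'something "with quotes"'
non_posix_tokens = split_like_shell(line)          # quote characters are kept
posix_tokens = split_like_shell(line, posix=True)  # quote characters are stripped
```

In non-POSIX mode the quoted argument stays one token and retains its quote characters, which is what `test_arg_split` in this pull request expects for the non-Windows case.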
5 IPython/utils/_process_posix.py
@@ -23,7 +23,7 @@
# Our own
from .autoattr import auto_attr
-from ._process_common import getoutput
+from ._process_common import getoutput, arg_split
from IPython.utils import text
from IPython.utils import py3compat
@@ -192,3 +192,6 @@ def system(self, cmd):
# programs think they are talking to a tty and produce highly formatted output
# (ls is a good example) that makes them hard.
system = ProcessHandler().system
+
+
+
30 IPython/utils/_process_win32.py
@@ -18,11 +18,15 @@
# stdlib
import os
import sys
+import ctypes
+from ctypes import c_int, POINTER
+from ctypes.wintypes import LPCWSTR, HLOCAL
from subprocess import STDOUT
# our own imports
from ._process_common import read_no_interrupt, process_handler
+from . import py3compat
from . import text
#-----------------------------------------------------------------------------
@@ -146,3 +150,29 @@ def getoutput(cmd):
if out is None:
out = ''
return out
+
+try:
+ CommandLineToArgvW = ctypes.windll.shell32.CommandLineToArgvW
+ CommandLineToArgvW.arg_types = [LPCWSTR, POINTER(c_int)]
+ CommandLineToArgvW.res_types = [POINTER(LPCWSTR)]
+ LocalFree = ctypes.windll.kernel32.LocalFree
+ LocalFree.res_type = HLOCAL
+ LocalFree.arg_types = [HLOCAL]
+
+ def arg_split(commandline, posix=False):
+ """Split a command line's arguments in a shell-like manner.
+
+ This is a special version for windows that uses a ctypes call to CommandLineToArgvW
+ to do the argv splitting. The posix parameter is ignored.
+ """
+ #CommandLineToArgvW returns path to executable if called with empty string.
+ if commandline.strip() == "":
+ return []
+ argvn = c_int()
+ result_pointer = CommandLineToArgvW(py3compat.cast_unicode(commandline.lstrip()), ctypes.byref(argvn))
+ result_array_type = LPCWSTR * argvn.value
+ result = [arg for arg in result_array_type.from_address(result_pointer)]
+ retval = LocalFree(result_pointer)
+ return result
+except AttributeError:
+ from ._process_common import arg_split
30 IPython/utils/process.py
@@ -22,9 +22,10 @@
# Our own
if sys.platform == 'win32':
- from ._process_win32 import _find_cmd, system, getoutput, AvoidUNCPath
+ from ._process_win32 import _find_cmd, system, getoutput, AvoidUNCPath, arg_split
else:
- from ._process_posix import _find_cmd, system, getoutput
+ from ._process_posix import _find_cmd, system, getoutput, arg_split
+
from ._process_common import getoutputerror
from IPython.utils import py3compat
@@ -103,31 +104,6 @@ def pycmd2argv(cmd):
else:
return [sys.executable, cmd]
-
-def arg_split(s, posix=False):
- """Split a command line's arguments in a shell-like manner.
-
- This is a modified version of the standard library's shlex.split()
- function, but with a default of posix=False for splitting, so that quotes
- in inputs are respected."""
-
- # Unfortunately, python's shlex module is buggy with unicode input:
- # http://bugs.python.org/issue1170
- # At least encoding the input when it's unicode seems to help, but there
- # may be more problems lurking. Apparently this is fixed in python3.
- is_unicode = False
- if (not py3compat.PY3) and isinstance(s, unicode):
- is_unicode = True
- s = s.encode('utf-8')
- lex = shlex.shlex(s, posix=posix)
- lex.whitespace_split = True
- tokens = list(lex)
- if is_unicode:
- # Convert the tokens back to unicode.
- tokens = [x.decode('utf-8') for x in tokens]
- return tokens
-
-
def abbrev_cwd():
""" Return abbreviated version of cwd, e.g. d:mydir """
cwd = os.getcwdu().replace('\\','/')
22 IPython/utils/tests/test_process.py
@@ -62,16 +62,32 @@ def test_find_cmd_fail():
nt.assert_raises(FindCmdError,find_cmd,'asdfasdf')
+@dec.skip_win32
def test_arg_split():
"""Ensure that argument lines are correctly split like in a shell."""
tests = [['hi', ['hi']],
[u'hi', [u'hi']],
['hello there', ['hello', 'there']],
- [u'h\N{LATIN SMALL LETTER A WITH CARON}llo', [u'h\N{LATIN SMALL LETTER A WITH CARON}llo']],
+ # \u01ce == \N{LATIN SMALL LETTER A WITH CARON}
+ # Do not use \N because the tests crash with syntax error in
+ # some cases, for example windows python2.6.
+ [u'h\u01cello', [u'h\u01cello']],
['something "with quotes"', ['something', '"with quotes"']],
]
for argstr, argv in tests:
nt.assert_equal(arg_split(argstr), argv)
+
+@dec.skip_if_not_win32
+def test_arg_split_win32():
+ """Ensure that argument lines are correctly split like in a shell."""
+ tests = [['hi', ['hi']],
+ [u'hi', [u'hi']],
+ ['hello there', ['hello', 'there']],
+ [u'h\u01cello', [u'h\u01cello']],
+ ['something "with quotes"', ['something', 'with quotes']],
+ ]
+ for argstr, argv in tests:
+ nt.assert_equal(arg_split(argstr), argv)
class SubProcessTestCase(TestCase, tt.TempFileMixin):
@@ -100,6 +116,10 @@ def test_getoutput(self):
def test_getoutput_quoted(self):
out = getoutput('python -c "print (1)"')
self.assertEquals(out.strip(), '1')
+
+ #Invalid quoting on windows
+ @dec.skip_win32
+ def test_getoutput_quoted2(self):
out = getoutput("python -c 'print (1)'")
self.assertEquals(out.strip(), '1')
out = getoutput("python -c 'print (\"1\")'")