optionally ignore shlex's ValueError in arg_split #1130

Closed
wants to merge 5 commits into from

1 participant

Min RK
Owner

Sometimes we pass things that aren't really command-line args to arg_split, e.g.:

%timeit python_code(" ")

These inputs probably shouldn't be parsed as command-line args, but they are.
So we should at least prevent arg_split from raising when shlex rejects them as invalid shell args.

This commit protects this sort of thing from raising errors. Another
approach would be to use a different function when parsing things that
are clearly not command-line args (like Python code), but this works for now.
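
To make the failure concrete, here is a minimal sketch using only the standard-library `shlex` module under Python 3. The helper name `try_split` is hypothetical and is not part of IPython; it simply reports whether `shlex.split` accepts a line in non-POSIX mode (the mode `%timeit` requests).

```python
import shlex


def try_split(line, posix=False):
    """Return the shlex tokens for `line`, or the error text if shlex rejects it."""
    try:
        return shlex.split(line, posix=posix)
    except ValueError as exc:
        # shlex signals an unterminated quote with a plain ValueError.
        return "ValueError: %s" % exc


# A real command line splits cleanly; non-POSIX mode keeps the quotes.
print(try_split('ls -l "my file"'))

# Python source is split at the space inside the string literal, so the
# second token starts with a quote that is never closed.
print(try_split('python_code(" ")'))
```

The second call is the case this pull request guards against: the whitespace inside the string literal ends one token, and the next token begins with an unterminated quote.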

closes #1109

Min RK minrk referenced this pull request
Closed

Shlex unicode #1116

and others added some commits
Min RK add strict flag to arg_split, to optionally ignore shlex parse errors
Sometimes we pass things that aren't really command-line args to arg_split, e.g.:

    %timeit python_code(" ")

This commit adds a `strict` flag, which defaults to the same raising behavior
as before.

Currently magic_timeit is the *only* place we use strict=False, but it should
also be done in completions (PR #1116).

closes #1109
4f1e79b
Fixing shlex_split to return unicode on py2.x 5bd8972
Replaced shlex_split with arg_split from _process_common.
shlex_split was removed since it was a unicode unsafe version of
arg_split. Tests were added to test magic_run_completer.
aac154d
Min RK use arg_split(...strict=False) in module_completer 963dd69
Min RK add %run open-quote completerlib test fa99077
Min RK minrk referenced this pull request from a commit in minrk/ipython
Min RK Merge shlex PRs (#1130, #1116)
* arg_split now takes optional strict flag, to ignore ValueErrors in
  shlex parsing
* %timeit uses strict=False, to avoid errors parsing python code
* %run completer uses arg_split(strict=False) for its unicode behavior, instead
  of custom shlex derivative, which is now redundant.

closes #1109
closes #1115
closes #1116
closes #1130
790cb14
Min RK minrk closed this in 790cb14
Brian E. Granger ellisonbg referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.

Showing 5 unique commits by 2 authors.

Dec 10, 2011
Min RK add strict flag to arg_split, to optionally ignore shlex parse errors
4f1e79b
Dec 12, 2011
Fixing shlex_split to return unicode on py2.x 5bd8972
aac154d
Min RK use arg_split(...strict=False) in module_completer 963dd69
Min RK add %run open-quote completerlib test fa99077
33  IPython/core/completerlib.py
@@ -20,7 +20,6 @@
 import inspect
 import os
 import re
-import shlex
 import sys
 
 # Third-party imports
@@ -31,6 +30,7 @@
 from IPython.core.completer import expand_user, compress_user
 from IPython.core.error import TryNext
 from IPython.utils import py3compat
+from IPython.utils._process_common import arg_split
 
 # FIXME: this should be pulled in with the right call via the component system
 from IPython.core.ipapi import get as get_ipython
@@ -56,35 +56,6 @@
 # Local utilities
 #-----------------------------------------------------------------------------
 
-def shlex_split(x):
-    """Helper function to split lines into segments.
-    """
-    # shlex.split raises an exception if there is a syntax error in sh syntax
-    # for example if no closing " is found. This function keeps dropping the
-    # last character of the line until shlex.split does not raise
-    # an exception. It adds end of the line to the result of shlex.split
-    #
-    # Example:
-    # %run "c:/python -> ['%run','"c:/python']
-
-    # shlex.split has unicode bugs in Python 2, so encode first to str
-    if not py3compat.PY3:
-        x = py3compat.cast_bytes(x)
-
-    endofline = []
-    while x != '':
-        try:
-            comps = shlex.split(x)
-            if len(endofline) >= 1:
-                comps.append(''.join(endofline))
-            return comps
-
-        except ValueError:
-            endofline = [x[-1:]]+endofline
-            x = x[:-1]
-
-    return [''.join(endofline)]
-
 def module_list(path):
     """
     Return the list containing the names of the modules available in the given
@@ -265,7 +236,7 @@ def module_completer(self,event):
 def magic_run_completer(self, event):
     """Complete files that end in .py or .ipy for the %run command.
     """
-    comps = shlex_split(event.line)
+    comps = arg_split(event.line, strict=False)
     relpath = (len(comps) > 1 and comps[-1] or '').strip("'\"")
 
     #print("\nev=", event)  # dbg
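
The `relpath` expression in `magic_run_completer` is what makes an open quote harmless once the tokenizer stops raising: the last token may still carry its quote character, and `.strip("'\"")` removes it. A small sketch of just that step follows. The function name `relpath_from_tokens` is hypothetical; the expression itself is copied from the hunk above.

```python
def relpath_from_tokens(comps):
    """Return the path fragment to complete, given the tokens of a %run line."""
    # With only the magic name present there is nothing to complete yet;
    # otherwise take the last token and drop any surrounding quote characters.
    return (len(comps) > 1 and comps[-1] or '').strip("'\"")


# Tokens as the comment in the removed shlex_split describes them for an
# open-quoted argument: the unparsed remainder keeps its opening quote.
print(relpath_from_tokens(['%run', '"a']))
print(relpath_from_tokens(['%run']))
```

The first call yields the bare fragment `a`, which is why `%run "a<tab>` can complete to the same files as `%run a<tab>`.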
5  IPython/core/magic.py
@@ -266,6 +266,7 @@ def parse_options(self,arg_str,opt_str,*long_opts,**kw):
         # Get options
         list_all = kw.get('list_all',0)
         posix = kw.get('posix', os.name == 'posix')
+        strict = kw.get('strict', True)
 
         # Check if we have more than one argument to warrant extra processing:
         odict = {}  # Dictionary with options
@@ -273,7 +274,7 @@ def parse_options(self,arg_str,opt_str,*long_opts,**kw):
         if len(args) >= 1:
             # If the list of inputs only has 0 or 1 thing in it, there's no
             # need to look for options
-            argv = arg_split(arg_str,posix)
+            argv = arg_split(arg_str, posix, strict)
             # Do regular option processing
             try:
                 opts,args = getopt(argv,opt_str,*long_opts)
@@ -1865,7 +1866,7 @@ def magic_timeit(self, parameter_s =''):
         scaling = [1, 1e3, 1e6, 1e9]
 
         opts, stmt = self.parse_options(parameter_s,'n:r:tcp:',
-                                        posix=False)
+                                        posix=False, strict=False)
         if stmt == "":
             return
         timefunc = timeit.default_timer
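
The hunks above thread `strict` through `parse_options` as an optional keyword that defaults to the old raising behavior. The following is a much-reduced Python 3 sketch of that flow, not IPython's actual `parse_options`: the names `parse_options_sketch` and `_split` are hypothetical, long options and the other keyword arguments are left out, and the leftover arguments are simply rejoined with single spaces.

```python
import os
import shlex
from getopt import getopt


def _split(s, posix, strict):
    """Tokenize s; with strict=False, keep the unparsed tail as the last token."""
    lex = shlex.shlex(s, posix=posix)
    lex.whitespace_split = True
    tokens = []
    try:
        for tok in lex:
            tokens.append(tok)
    except ValueError:
        if strict:
            raise
        tokens.append(lex.token)
    return tokens


def parse_options_sketch(arg_str, opt_str, **kw):
    """Split arg_str into getopt-style options and the remaining statement."""
    posix = kw.get('posix', os.name == 'posix')
    strict = kw.get('strict', True)   # callers must opt in to lenient parsing
    argv = _split(arg_str, posix, strict)
    opts, args = getopt(argv, opt_str)
    return dict(opts), ' '.join(args)


opts, stmt = parse_options_sketch('-n1 -r1 f(" ", 1)', 'n:r:tcp:',
                                  posix=False, strict=False)
print(opts, stmt)
```

Because `strict` defaults to `True`, every existing caller keeps raising on malformed input; only call sites that pass `strict=False`, as `magic_timeit` does in this diff, get the lenient behavior.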
69  IPython/core/tests/test_completerlib.py
@@ -0,0 +1,69 @@
+# -*- coding: utf-8 -*-
+"""Tests for completerlib.
+
+"""
+from __future__ import absolute_import
+
+#-----------------------------------------------------------------------------
+# Imports
+#-----------------------------------------------------------------------------
+
+import os
+import shutil
+import sys
+import tempfile
+import unittest
+from os.path import join
+
+import nose.tools as nt
+from nose import SkipTest
+
+from IPython.core.completerlib import magic_run_completer
+from IPython.testing import decorators as dec
+from IPython.testing import tools as tt
+from IPython.utils import py3compat
+
+
+class MockEvent(object):
+    def __init__(self, line):
+        self.line = line
+
+#-----------------------------------------------------------------------------
+# Test functions begin
+#-----------------------------------------------------------------------------
+class Test_magic_run_completer(unittest.TestCase):
+    def setUp(self):
+        self.BASETESTDIR = tempfile.mkdtemp()
+        for fil in [u"aaå.py", u"a.py", u"b.py"]:
+            with open(join(self.BASETESTDIR, fil), "w") as sfile:
+                sfile.write("pass\n")
+        self.oldpath = os.getcwdu()
+        os.chdir(self.BASETESTDIR)
+
+    def tearDown(self):
+        os.chdir(self.oldpath)
+        shutil.rmtree(self.BASETESTDIR)
+
+    def test_1(self):
+        """Test magic_run_completer, should match two alternatives
+        """
+        event = MockEvent(u"%run a")
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"a.py", u"aaå.py",])
+
+    def test_2(self):
+        """Test magic_run_completer, should match one alternative
+        """
+        event = MockEvent(u"%run aa")
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"aaå.py",])
+
+    def test_3(self):
+        """Test '%run "a<tab>' completion"""
+        event = MockEvent(u'%run "a')
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"a.py", u"aaå.py"])
+
9  IPython/core/tests/test_magic.py
@@ -344,3 +344,12 @@ def test_psearch():
     with tt.AssertPrints("dict.fromkeys"):
         _ip.run_cell("dict.fr*?")
 
+def test_timeit_shlex():
+    """test shlex issues with timeit (#1109)"""
+    _ip.ex("def f(*a,**kw): pass")
+    _ip.magic('timeit -n1 "this is a bug".count(" ")')
+    _ip.magic('timeit -r1 -n1 f(" ", 1)')
+    _ip.magic('timeit -r1 -n1 f(" ", 1, " ", 2, " ")')
+    _ip.magic('timeit -r1 -n1 ("a " + "b")')
+    _ip.magic('timeit -r1 -n1 f("a " + "b")')
+    _ip.magic('timeit -r1 -n1 f("a " + "b ")')
30  IPython/utils/_process_common.py
@@ -146,12 +146,18 @@ def getoutputerror(cmd):
     return py3compat.bytes_to_str(out), py3compat.bytes_to_str(err)
 
 
-def arg_split(s, posix=False):
+def arg_split(s, posix=False, strict=True):
     """Split a command line's arguments in a shell-like manner.
 
     This is a modified version of the standard library's shlex.split()
     function, but with a default of posix=False for splitting, so that quotes
-    in inputs are respected."""
+    in inputs are respected.
+
+    if strict=False, then any errors shlex.split would raise will result in the
+    unparsed remainder being the last element of the list, rather than raising.
+    This is because we sometimes use arg_split to parse things other than
+    command-line args.
+    """
 
     # Unfortunately, python's shlex module is buggy with unicode input:
     # http://bugs.python.org/issue1170
@@ -163,7 +169,25 @@ def arg_split(s, posix=False):
         s = s.encode('utf-8')
     lex = shlex.shlex(s, posix=posix)
     lex.whitespace_split = True
-    tokens = list(lex)
+    # Extract tokens, ensuring that things like leaving open quotes
+    # does not cause this to raise.  This is important, because we
+    # sometimes pass Python source through this (e.g. %timeit f(" ")),
+    # and it shouldn't raise an exception.
+    # It may be a bad idea to parse things that are not command-line args
+    # through this function, but we do, so let's be safe about it.
+    tokens = []
+    while True:
+        try:
+            tokens.append(lex.next())
+        except StopIteration:
+            break
+        except ValueError:
+            if strict:
+                raise
+            # couldn't parse, get remaining blob as last token
+            tokens.append(lex.token)
+            break
+
     if is_unicode:
         # Convert the tokens back to unicode.
         tokens = [x.decode('utf-8') for x in tokens]
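
The loop in this hunk targets Python 2 (`lex.next()`, plus the UTF-8 encode/decode around it). For readers on Python 3, a rough equivalent of the strict/non-strict token loop might look like the sketch below. It is a standalone illustration under that assumption: the name `arg_split_sketch` is hypothetical, and the unicode round trip is left out because Python 3's `shlex` accepts `str` directly.

```python
import shlex


def arg_split_sketch(s, posix=False, strict=True):
    """Split s in a shell-like manner, optionally tolerating parse errors."""
    lex = shlex.shlex(s, posix=posix)
    lex.whitespace_split = True
    tokens = []
    while True:
        try:
            tokens.append(next(lex))
        except StopIteration:
            break
        except ValueError:
            if strict:
                raise
            # Couldn't parse: keep whatever shlex had accumulated for the
            # current token as the final element.
            tokens.append(lex.token)
            break
    return tokens


print(arg_split_sketch('a "b c" d'))
print(arg_split_sketch('%run "a', strict=False))
```

The second call returns the open-quoted remainder as its own token, the same shape the removed `shlex_split` helper documented for `%run "c:/python`, but without re-parsing the line once per dropped character.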