optionally ignore shlex's ValueError in arg_split #1130

Closed
wants to merge 5 commits into from

1 participant

@minrk
Owner

Sometimes we pass things that aren't really command-line args to arg_split, e.g.:

%timeit python_code(" ")

These things probably shouldn't be parsed as command-line args, but they are,
so we should at least prevent them from raising when shlex doesn't consider them valid shell args.

This commit protects this sort of thing from raising errors. Another
approach would be to use a different function when parsing things that
are clearly not command-line args (like Python code), but this works for now.

closes #1109
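
To make the failure concrete, here is a minimal Python 3 sketch (not the IPython code itself) of what happens when a tokenizer configured like `arg_split` sees that input. `naive_split` is a hypothetical stand-in; it only mirrors the `shlex.shlex(s, posix=False)` plus `whitespace_split = True` setup that appears in the `_process_common.py` diff.

```python
import shlex

def naive_split(s, posix=False):
    """Hypothetical stand-in for the pre-PR arg_split: no error handling."""
    lex = shlex.shlex(s, posix=posix)
    lex.whitespace_split = True
    return list(lex)

# Ordinary command-line style input splits as expected.
print(naive_split("-n 1 -r 3"))

# Python source with a space inside a string literal does not: the lexer
# ends the first token at the space, then sees a lone opening quote.
try:
    naive_split('python_code(" ")')
except ValueError as exc:
    error = str(exc)
    print("ValueError:", error)
```

The string literal is never seen as a unit because, in non-POSIX mode, a quote in the middle of a word is treated as an ordinary character, so the space inside the literal still ends the token.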

minrk and others added some commits
@minrk minrk add strict flag to arg_split, to optionally ignore shlex parse errors
Sometimes we pass things that aren't really command-line args to arg_split, e.g.:

    %timeit python_code(" ")

This commit adds a `strict` flag, which defaults to the same raising behavior
as before.

Currently magic_timeit is the *only* place we use strict=False, but it should
also be done in completions (PR #1116).

closes #1109
4f1e79b
Jörgen Stenarson Fixing shlex_split to return unicode on py2.x 5bd8972
Jörgen Stenarson Replaced shlex_split with arg_split from _process_common.
shlex_split was removed since it was a unicode unsafe version of
arg_split. Tests were added to test magic_run_completer.
aac154d
@minrk minrk use arg_split(...strict=False) in module_completer 963dd69
@minrk minrk add %run open-quote completerlib test fa99077
@minrk minrk referenced this pull request from a commit in minrk/ipython
@minrk minrk Merge shlex PRs (#1130, #1116)
* arg_split now takes optional strict flag, to ignore ValueErrors in
  shlex parsing
* %timeit uses strict=False, to avoid errors parsing python code
* %run completer uses arg_split(strict=False) for its unicode behavior, instead
  of custom shlex derivative, which is now redundant.

closes #1109
closes #1115
closes #1116
closes #1130
790cb14
@minrk minrk closed this in 790cb14
@ellisonbg ellisonbg referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
@mattvonrocketstein mattvonrocketstein referenced this pull request from a commit in mattvonrocketstein/ipython
5799471
Commits on Dec 11, 2011
  1. @minrk

    add strict flag to arg_split, to optionally ignore shlex parse errors

    minrk authored
Commits on Dec 12, 2011
  1. @minrk

    Fixing shlex_split to return unicode on py2.x

    Jörgen Stenarson authored minrk committed
  2. @minrk

    Replaced shlex_split with arg_split from _process_common.

    Jörgen Stenarson authored minrk committed
    shlex_split was removed since it was a unicode unsafe version of
    arg_split. Tests were added to test magic_run_completer.
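  3. @minrk

    use arg_split(...strict=False) in module_completer

    minrk authored
  4. @minrk

    add %run open-quote completerlib test

    minrk authored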
33 IPython/core/completerlib.py
@@ -20,7 +20,6 @@
 import inspect
 import os
 import re
-import shlex
 import sys
 # Third-party imports
@@ -31,6 +30,7 @@
 from IPython.core.completer import expand_user, compress_user
 from IPython.core.error import TryNext
 from IPython.utils import py3compat
+from IPython.utils._process_common import arg_split
 # FIXME: this should be pulled in with the right call via the component system
 from IPython.core.ipapi import get as get_ipython
@@ -56,35 +56,6 @@
 # Local utilities
 #-----------------------------------------------------------------------------
-def shlex_split(x):
-    """Helper function to split lines into segments.
-    """
-    # shlex.split raises an exception if there is a syntax error in sh syntax
-    # for example if no closing " is found. This function keeps dropping the
-    # last character of the line until shlex.split does not raise
-    # an exception. It adds end of the line to the result of shlex.split
-    #
-    # Example:
-    # %run "c:/python -> ['%run','"c:/python']
-
-    # shlex.split has unicode bugs in Python 2, so encode first to str
-    if not py3compat.PY3:
-        x = py3compat.cast_bytes(x)
-
-    endofline = []
-    while x != '':
-        try:
-            comps = shlex.split(x)
-            if len(endofline) >= 1:
-                comps.append(''.join(endofline))
-            return comps
-
-        except ValueError:
-            endofline = [x[-1:]]+endofline
-            x = x[:-1]
-
-    return [''.join(endofline)]
-
 def module_list(path):
     """
     Return the list containing the names of the modules available in the given
@@ -265,7 +236,7 @@ def module_completer(self,event):
 def magic_run_completer(self, event):
     """Complete files that end in .py or .ipy for the %run command.
     """
-    comps = shlex_split(event.line)
+    comps = arg_split(event.line, strict=False)
     relpath = (len(comps) > 1 and comps[-1] or '').strip("'\"")
     #print("\nev=", event) # dbg
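
For comparison, the trim-and-retry strategy of the removed `shlex_split` helper can be sketched in Python 3 as follows. This is an illustrative port, not the original code: the name `trim_and_retry_split` is made up here, and the Python 2 byte-casting step is left out.

```python
import shlex

def trim_and_retry_split(line):
    """Python 3 port of the removed helper's strategy (illustrative only)."""
    tail = []
    while line != '':
        try:
            comps = shlex.split(line)
        except ValueError:
            # Move the last character onto the unparsed tail and retry.
            tail.insert(0, line[-1])
            line = line[:-1]
        else:
            if tail:
                comps.append(''.join(tail))
            return comps
    return [''.join(tail)]

comps = trim_and_retry_split('%run "c:/python')
# Same post-processing the completer applies to the last component.
relpath = (len(comps) > 1 and comps[-1] or '').strip("'\"")
print(comps, relpath)
```

Each retry re-lexes the line from the start, so an unclosed quote costs one full parse per trailing character. Collecting the lexer's unparsed remainder in a single pass, as `arg_split(..., strict=False)` does, avoids that repeated work.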
5 IPython/core/magic.py
@@ -266,6 +266,7 @@ def parse_options(self,arg_str,opt_str,*long_opts,**kw):
         # Get options
         list_all = kw.get('list_all',0)
         posix = kw.get('posix', os.name == 'posix')
+        strict = kw.get('strict', True)
         # Check if we have more than one argument to warrant extra processing:
         odict = {} # Dictionary with options
@@ -273,7 +274,7 @@ def parse_options(self,arg_str,opt_str,*long_opts,**kw):
         if len(args) >= 1:
             # If the list of inputs only has 0 or 1 thing in it, there's no
             # need to look for options
-            argv = arg_split(arg_str,posix)
+            argv = arg_split(arg_str, posix, strict)
             # Do regular option processing
             try:
                 opts,args = getopt(argv,opt_str,*long_opts)
@@ -1865,7 +1866,7 @@ def magic_timeit(self, parameter_s =''):
         scaling = [1, 1e3, 1e6, 1e9]
         opts, stmt = self.parse_options(parameter_s,'n:r:tcp:',
-                                        posix=False)
+                                        posix=False, strict=False)
         if stmt == "":
             return
         timefunc = timeit.default_timer
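
A rough sketch of the option-processing step that follows, using the standard library `getopt` with the same option string as `magic_timeit`. The `argv` list is an assumption: it is what a lenient non-POSIX split would plausibly return for `-r1 -n1 f(" ", 1)`, hard-coded so the example does not depend on IPython.

```python
from getopt import getopt

# Assumed output of a lenient split of:  -r1 -n1 f(" ", 1)
argv = ['-r1', '-n1', 'f("', '", 1)']

# Same option string that magic_timeit passes to parse_options.
opts, args = getopt(argv, 'n:r:tcp:')
print(opts)  # parsed flags
print(args)  # everything from the first non-option token onward
```

Option parsing only needs the leading flags to be tokenized correctly, which is why an unparsed blob at the end of the list is harmless here. How IPython reassembles the statement from the remaining arguments is not shown in this diff.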
69 IPython/core/tests/test_completerlib.py
@@ -0,0 +1,69 @@
+# -*- coding: utf-8 -*-
+"""Tests for completerlib.
+
+"""
+from __future__ import absolute_import
+
+#-----------------------------------------------------------------------------
+# Imports
+#-----------------------------------------------------------------------------
+
+import os
+import shutil
+import sys
+import tempfile
+import unittest
+from os.path import join
+
+import nose.tools as nt
+from nose import SkipTest
+
+from IPython.core.completerlib import magic_run_completer
+from IPython.testing import decorators as dec
+from IPython.testing import tools as tt
+from IPython.utils import py3compat
+
+
+class MockEvent(object):
+    def __init__(self, line):
+        self.line = line
+
+#-----------------------------------------------------------------------------
+# Test functions begin
+#-----------------------------------------------------------------------------
+class Test_magic_run_completer(unittest.TestCase):
+    def setUp(self):
+        self.BASETESTDIR = tempfile.mkdtemp()
+        for fil in [u"aaå.py", u"a.py", u"b.py"]:
+            with open(join(self.BASETESTDIR, fil), "w") as sfile:
+                sfile.write("pass\n")
+        self.oldpath = os.getcwdu()
+        os.chdir(self.BASETESTDIR)
+
+    def tearDown(self):
+        os.chdir(self.oldpath)
+        shutil.rmtree(self.BASETESTDIR)
+
+    def test_1(self):
+        """Test magic_run_completer, should match two alternatives
+        """
+        event = MockEvent(u"%run a")
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"a.py", u"aaå.py",])
+
+    def test_2(self):
+        """Test magic_run_completer, should match one alternative
+        """
+        event = MockEvent(u"%run aa")
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"aaå.py",])
+
+    def test_3(self):
+        """Test '%run "a<tab>' completion"""
+        event = MockEvent(u'%run "a')
+        mockself = None
+        match = magic_run_completer(mockself, event)
+        self.assertEqual(match, [u"a.py", u"aaå.py"])
+
9 IPython/core/tests/test_magic.py
@@ -344,3 +344,12 @@ def test_psearch():
     with tt.AssertPrints("dict.fromkeys"):
         _ip.run_cell("dict.fr*?")
+def test_timeit_shlex():
+    """test shlex issues with timeit (#1109)"""
+    _ip.ex("def f(*a,**kw): pass")
+    _ip.magic('timeit -n1 "this is a bug".count(" ")')
+    _ip.magic('timeit -r1 -n1 f(" ", 1)')
+    _ip.magic('timeit -r1 -n1 f(" ", 1, " ", 2, " ")')
+    _ip.magic('timeit -r1 -n1 ("a " + "b")')
+    _ip.magic('timeit -r1 -n1 f("a " + "b")')
+    _ip.magic('timeit -r1 -n1 f("a " + "b ")')
30 IPython/utils/_process_common.py
@@ -146,12 +146,18 @@ def getoutputerror(cmd):
     return py3compat.bytes_to_str(out), py3compat.bytes_to_str(err)
-def arg_split(s, posix=False):
+def arg_split(s, posix=False, strict=True):
     """Split a command line's arguments in a shell-like manner.
     This is a modified version of the standard library's shlex.split()
     function, but with a default of posix=False for splitting, so that quotes
-    in inputs are respected."""
+    in inputs are respected.
+
+    if strict=False, then any errors shlex.split would raise will result in the
+    unparsed remainder being the last element of the list, rather than raising.
+    This is because we sometimes use arg_split to parse things other than
+    command-line args.
+    """
     # Unfortunately, python's shlex module is buggy with unicode input:
     # http://bugs.python.org/issue1170
@@ -163,7 +169,25 @@ def arg_split(s, posix=False):
         s = s.encode('utf-8')
     lex = shlex.shlex(s, posix=posix)
     lex.whitespace_split = True
-    tokens = list(lex)
+    # Extract tokens, ensuring that things like leaving open quotes
+    # does not cause this to raise. This is important, because we
+    # sometimes pass Python source through this (e.g. %timeit f(" ")),
+    # and it shouldn't raise an exception.
+    # It may be a bad idea to parse things that are not command-line args
+    # through this function, but we do, so let's be safe about it.
+    tokens = []
+    while True:
+        try:
+            tokens.append(lex.next())
+        except StopIteration:
+            break
+        except ValueError:
+            if strict:
+                raise
+            # couldn't parse, get remaining blob as last token
+            tokens.append(lex.token)
+            break
+
     if is_unicode:
         # Convert the tokens back to unicode.
         tokens = [x.decode('utf-8') for x in tokens]
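
The token loop in this hunk can be tried outside IPython with a small Python 3 sketch. It is an approximation under stated assumptions, not the merged code: `lenient_arg_split` is a hypothetical name, `next(lex)` replaces the Python 2 `lex.next()` call, and the UTF-8 encode/decode workaround is dropped because Python 3's `shlex` accepts `str` directly.

```python
import shlex

def lenient_arg_split(s, posix=False, strict=True):
    """Sketch of arg_split's strict flag (hypothetical Python 3 version)."""
    lex = shlex.shlex(s, posix=posix)
    lex.whitespace_split = True
    tokens = []
    while True:
        try:
            tokens.append(next(lex))
        except StopIteration:
            break
        except ValueError:
            if strict:
                raise
            # The lexer keeps the partially read token in lex.token;
            # use that unparsed remainder as the final element.
            tokens.append(lex.token)
            break
    return tokens

# Python source passed through the splitter, as %timeit does.
print(lenient_arg_split('f(" ", 1)', strict=False))

# An open quote while completing a %run line.
print(lenient_arg_split('%run "a', strict=False))

# Default strict=True keeps the old raising behavior.
try:
    lenient_arg_split('f(" ", 1)')
except ValueError as exc:
    print("strict mode raised:", exc)
```

Pulling tokens one at a time is what makes recovery possible: `list(lex)` would discard the tokens already produced when the exception propagates, whereas the explicit loop keeps them and appends whatever the lexer had buffered.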