pexpect & Python 3 #798

Merged
merged 3 commits into from

3 participants

Thomas Kluyver Fernando Perez Min RK
Thomas Kluyver
Owner

Changes to pexpect so running external processes from the Qt console works in Python 3.

Fernando Perez
Owner

These look fine to merge, but I'd suggest you split them into two separate commits. Since one entails changes to an external dependency, I think it's best to always have those available in nice, isolated commits. It makes it easier to send the patch upstream, reapply it if we update to a new version from upstream that doesn't have the changes, etc.

So my vote is: rebase by splitting into two commits, one for each file, then merge away.

Thanks!

takluyver added some commits
Thomas Kluyver takluyver Changes to pexpect so it does what we need after conversion to Python 3. a69359f
Thomas Kluyver takluyver Decode output from unix subprocesses using platform default encoding. 2cda1d8
Thomas Kluyver takluyver Try locale encoding if stdin encoding is ascii.
Starting the Qt console on Python 3, the kernel's stdin ends up with a .encoding of 'ascii' (whereas on Python 2 it is None). Since most platforms can handle a superset of ASCII, we may as well try locale.getpreferredencoding() in this case.
c06689d
Thomas Kluyver
Owner

OK, I've split them, but I found another minor bug. The kernel in Python 2 has a sys.stdin.encoding of None, so @minrk's getdefaultencoding() function fell back to using the locale. In Python 3, sys.stdin.encoding is 'ascii', although the system is using UTF-8.

On the principle that most systems can now handle a superset of ASCII (either UTF-8 or a Windows code page), I've made getdefaultencoding() try the locale encoding if sys.stdin.encoding is None or 'ascii'. I think the only way this could make anything worse is if a subprocess returns ASCII output on a system where the locale encoding is not ASCII-compatible (e.g. UTF-16).
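A minimal sketch of that fallback order, written for Python 3. The helper name `guess_encoding` and the final fallback to `sys.getdefaultencoding()` are assumptions for illustration; this is not the body of getdefaultencoding() from the PR.

```python
import locale
import sys


def guess_encoding(stream_encoding):
    """Hypothetical helper: choose an encoding for decoding subprocess output."""
    enc = stream_encoding
    # Treat both None (Python 2 kernel) and 'ascii' (Python 3 kernel) as
    # "unknown" and ask the locale instead.
    if not enc or enc == 'ascii':
        try:
            enc = locale.getpreferredencoding()
        except Exception:
            # Be conservative if the locale lookup itself fails.
            enc = None
    # Assumed last resort: the interpreter-wide default encoding.
    return enc or sys.getdefaultencoding()


print(guess_encoding(sys.stdin.encoding if sys.stdin else None))
```

An encoding that is already specific, such as 'utf-8', is returned unchanged; only the None and 'ascii' cases consult the locale.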

Thomas Kluyver
Owner

I'll merge this soon unless anyone objects.

Min RK
Owner

seems fine to me.

Do you know why Python 3 incorrectly reports sys.stdin.encoding as ascii? Or is that the new replacement for None? The problem is that if the terminal really is ASCII, we might run into problems. But I wouldn't worry too much about that.

Thomas Kluyver
Owner

I think that the object which is now used for sys.stdin (io.TextIOWrapper) has to have a non-None encoding attribute, so ascii is the default if it's not told anything else.
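A quick way to check the non-None part of that claim, as a sketch rather than code from this PR. It wraps an in-memory byte stream, so nothing here touches the real sys.stdin. On current Python 3 releases an unspecified encoding is taken from the locale, so the first value printed is platform dependent; whether that accounts for the 'ascii' seen in the kernel is an assumption.

```python
import io

# No encoding given: the wrapper still reports an encoding string, not None.
implicit = io.TextIOWrapper(io.BytesIO())

# Encoding given explicitly: the attribute reflects what was passed in.
explicit = io.TextIOWrapper(io.BytesIO(), encoding='ascii')

print(repr(implicit.encoding))  # platform dependent
print(repr(explicit.encoding))
```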

I don't think it should cause problems, because:

  1. If the terminal is ASCII, the default locale should presumably indicate ascii as well.
  2. If the terminal is ASCII and the default locale indicates another encoding, it will most likely be either UTF-8 or an ASCII-compatible code page. So any characters we get from the terminal should be correctly decoded anyway. The only situation in which it could cause problems is if the locale incorrectly indicates a non-ASCII-compatible encoding, such as UTF-16.
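
The compatibility argument in the two points above can be checked directly. This is an illustrative sketch; the helper name `decodes_like_ascii` and the sample bytes are made up for the example.

```python
def decodes_like_ascii(data, encoding):
    """Return True if ASCII-only bytes decode to the same text under `encoding`."""
    try:
        return data.decode(encoding) == data.decode("ascii")
    except UnicodeDecodeError:
        return False


sample = b"hello\n"  # pure ASCII, as an ASCII terminal would produce

for enc in ("utf-8", "cp1252", "latin-1", "utf-16"):
    print(enc, decodes_like_ascii(sample, enc))
```

UTF-8 and the single-byte code pages agree with ASCII on these bytes, while UTF-16 pairs the bytes up into different characters, which is the one problematic case described above.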
Min RK
Owner

Sounds good.

Thomas Kluyver takluyver merged commit 5590396 into from
Thomas Kluyver
Owner

Thanks, Min. Merged.

Brian E. Granger ellisonbg referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
Commits on Sep 17, 2011
  1. Thomas Kluyver
  2. Thomas Kluyver
  3. Thomas Kluyver

32 IPython/external/pexpect/_pexpect.py
@@ -229,16 +229,16 @@ def print_ticks(d):
while 1:
try:
index = child.expect (patterns)
- if type(child.after) in types.StringTypes:
+ if isinstance(child.after, basestring):
child_result_list.append(child.before + child.after)
else: # child.after may have been a TIMEOUT or EOF, so don't cat those.
child_result_list.append(child.before)
- if type(responses[index]) in types.StringTypes:
+ if isinstance(responses[index], basestring):
child.send(responses[index])
- elif type(responses[index]) is types.FunctionType:
+ elif isinstance(responses[index], types.FunctionType):
callback_result = responses[index](locals())
sys.stdout.flush()
- if type(callback_result) in types.StringTypes:
+ if isinstance(callback_result, basestring):
child.send(callback_result)
elif callback_result:
break
@@ -399,7 +399,7 @@ def __init__(self, command, args=[], timeout=30, maxread=2000, searchwindowsize=
self.logfile_read = None # input from child (read_nonblocking)
self.logfile_send = None # output to send (send, sendline)
self.maxread = maxread # max bytes to read at one time into buffer
- self.buffer = '' # This is the read buffer. See maxread.
+ self.buffer = b'' # This is the read buffer. See maxread.
self.searchwindowsize = searchwindowsize # Anything before searchwindowsize point is preserved, but not searched.
# Most Linux machines don't like delaybeforesend to be below 0.03 (30 ms).
self.delaybeforesend = 0.05 # Sets sleep time used just before sending data to child. Time in seconds.
@@ -828,7 +828,7 @@ def read_nonblocking (self, size = 1, timeout = -1):
except OSError, e: # Linux does this
self.flag_eof = True
raise EOF ('End Of File (EOF) in read_nonblocking(). Exception style platform.')
- if s == '': # BSD style
+ if s == b'': # BSD style
self.flag_eof = True
raise EOF ('End Of File (EOF) in read_nonblocking(). Empty string style platform.')
@@ -936,12 +936,14 @@ def writelines (self, sequence): # File-like object.
for s in sequence:
self.write (s)
- def send(self, s):
+ def send(self, s, encoding='utf-8'):
"""This sends a string to the child process. This returns the number of
bytes written. If a log file was set then the data is also written to
the log. """
+ if isinstance(s, unicode):
+ s = s.encode(encoding)
time.sleep(self.delaybeforesend)
if self.logfile is not None:
self.logfile.write (s)
@@ -1208,7 +1210,7 @@ def compile_pattern_list(self, patterns):
if patterns is None:
return []
- if type(patterns) is not types.ListType:
+ if not isinstance(patterns, list):
patterns = [patterns]
compile_flags = re.DOTALL # Allow dot to match \n
@@ -1216,7 +1218,7 @@ def compile_pattern_list(self, patterns):
compile_flags = compile_flags | re.IGNORECASE
compiled_pattern_list = []
for p in patterns:
- if type(p) in types.StringTypes:
+ if isinstance(p, basestring):
compiled_pattern_list.append(re.compile(p, compile_flags))
elif p is EOF:
compiled_pattern_list.append(EOF)
@@ -1337,7 +1339,7 @@ def expect_exact(self, pattern_list, timeout = -1, searchwindowsize = -1):
This method is also useful when you don't want to have to worry about
escaping regular expression characters that you want to match."""
- if type(pattern_list) in types.StringTypes or pattern_list in (TIMEOUT, EOF):
+ if isinstance(pattern_list, basestring) or pattern_list in (TIMEOUT, EOF):
pattern_list = [pattern_list]
return self.expect_loop(searcher_string(pattern_list), timeout, searchwindowsize)
@@ -1371,7 +1373,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
self.match_index = index
return self.match_index
# No match at this point
- if timeout < 0 and timeout is not None:
+ if timeout is not None and timeout < 0:
raise TIMEOUT ('Timeout exceeded in expect_any().')
# Still have time left, so read more data
c = self.read_nonblocking (self.maxread, timeout)
@@ -1381,7 +1383,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
if timeout is not None:
timeout = end_time - time.time()
except EOF, e:
- self.buffer = ''
+ self.buffer = b''
self.before = incoming
self.after = EOF
index = searcher.eof_index
@@ -1484,7 +1486,7 @@ def sigwinch_passthrough (sig, data):
# Flush the buffer.
self.stdout.write (self.buffer)
self.stdout.flush()
- self.buffer = ''
+ self.buffer = b''
mode = tty.tcgetattr(self.STDIN_FILENO)
tty.setraw(self.STDIN_FILENO)
try:
@@ -1700,7 +1702,7 @@ def __init__(self, patterns):
self.eof_index = -1
self.timeout_index = -1
self._searches = []
- for n, s in zip(range(len(patterns)), patterns):
+ for n, s in enumerate(patterns):
if s is EOF:
self.eof_index = n
continue
@@ -1721,7 +1723,7 @@ def __str__(self):
if self.timeout_index >= 0:
ss.append ((self.timeout_index,' %d: TIMEOUT' % self.timeout_index))
ss.sort()
- ss = zip(*ss)[1]
+ ss = [a[1] for a in ss]
return '\n'.join(ss)
def search(self, buffer, freshlen, searchwindowsize=None):
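
Two of the Python 3 behaviours that the hunks above accommodate can be shown in isolation. This is a sketch in Python 3 syntax, not code from pexpect: `to_bytes` and `timeout_expired` are hypothetical helpers, and `str` stands in for the `unicode` check that appears in the Python 2 source of the diff.

```python
def to_bytes(s, encoding="utf-8"):
    """Hypothetical helper mirroring the encode step added to send()."""
    if isinstance(s, str):  # `unicode` in the Python 2 source
        s = s.encode(encoding)
    return s


def timeout_expired(timeout):
    """Check None first: in Python 3, `None < 0` raises TypeError."""
    return timeout is not None and timeout < 0


# The read buffer holds bytes, so only bytes can be appended to it.
buffer = b""
buffer += to_bytes("caf\u00e9\n")
buffer += to_bytes(b"raw\n")
print(buffer)

# An empty bytes object never equals an empty str in Python 3, which is
# why the end-of-file check compares against b'' rather than ''.
print(b"" == "")
print(timeout_expired(None), timeout_expired(-1), timeout_expired(5))
```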
16 IPython/utils/_process_posix.py
@@ -19,17 +19,12 @@
import subprocess as sp
import sys
-# Third-party
-# We ship our own copy of pexpect (it's a single file) to minimize dependencies
-# for users, but it's only used if we don't find the system copy.
-try:
- import pexpect
-except ImportError:
- from IPython.external import pexpect
+from IPython.external import pexpect
# Our own
from .autoattr import auto_attr
from ._process_common import getoutput
+from IPython.utils import text
#-----------------------------------------------------------------------------
# Function definitions
@@ -132,6 +127,9 @@ def system(self, cmd):
-------
int : child's exitstatus
"""
+ # Get likely encoding for the output.
+ enc = text.getdefaultencoding()
+
pcmd = self._make_cmd(cmd)
# Patterns to match on the output, for pexpect. We read input and
# allow either a short timeout or EOF
@@ -155,7 +153,7 @@ def system(self, cmd):
# res is the index of the pattern that caused the match, so we
# know whether we've finished (if we matched EOF) or not
res_idx = child.expect_list(patterns, self.read_timeout)
- print(child.before[out_size:], end='')
+ print(child.before[out_size:].decode(enc, 'replace'), end='')
flush()
if res_idx==EOF_index:
break
@@ -171,7 +169,7 @@ def system(self, cmd):
try:
out_size = len(child.before)
child.expect_list(patterns, self.terminate_timeout)
- print(child.before[out_size:], end='')
+ print(child.before[out_size:].decode(enc, 'replace'), end='')
sys.stdout.flush()
except KeyboardInterrupt:
# Impatient users tend to type it multiple times
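
The effect of decoding with the 'replace' error handler, as used in the two print calls above, can be sketched as follows. The function name `render` and the sample bytes are illustrative assumptions.

```python
def render(raw, enc):
    """Decode child output for display; undecodable bytes become U+FFFD."""
    return raw.decode(enc, "replace")


raw = b"r\xc3\xa9sum\xc3\xa9.txt\n"  # UTF-8 bytes, as a subprocess might emit

print(render(raw, "utf-8"), end="")
print(render(raw, "ascii"), end="")
```

With a wrong guess for `enc` the output is degraded but printing never raises, which seems to be the point of passing 'replace' rather than relying on the default strict handler.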
2  IPython/utils/text.py
@@ -47,7 +47,7 @@ def getdefaultencoding():
and usually ASCII.
"""
enc = sys.stdin.encoding
- if not enc:
+ if not enc or enc=='ascii':
try:
# There are reports of getpreferredencoding raising errors
# in some cases, which may well be fixed, but let's be conservative here.