pexpect & Python 3 #798

takluyver merged 3 commits into ipython:master

3 participants


Changes to pexpect so running external processes from the Qt console works in Python 3.


These look fine to merge, but I'd suggest you split it into two separate commits. Since one entails changes to an external dependency, I think it's best to always have those available in nice, isolated commits. It makes it easier to send the patch upstream, reapply it if we update to a new version from upstream that doesn't have the changes, etc.

So my vote is: rebase by splitting into two commits, one for each file, then merge away.


takluyver added some commits
@takluyver takluyver Changes to pexpect so it does what we need after conversion to Python 3. a69359f
@takluyver takluyver Decode output from unix subprocesses using platform default encoding. 2cda1d8
@takluyver takluyver Try locale encoding if stdin encoding is ascii.
Starting the Qt console on Python 3, the kernel's stdin ends up with a .encoding of 'ascii' (whereas on Python 2 it is None). Since most platforms can handle a superset of ASCII, we may as well try locale.getpreferredencoding() in this case.
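
The decoding step from the second commit can be illustrated with a minimal sketch. The helper name `decode_child_output` is hypothetical and is not part of IPython or pexpect; it only mirrors the `.decode(enc, 'replace')` call that appears in the diff, where bytes that are invalid in the chosen encoding become U+FFFD instead of raising an error.

```python
def decode_child_output(raw, encoding):
    """Decode bytes read from a child process without raising on bad input.

    Hypothetical helper for illustration; `raw` is a bytes object and
    `encoding` is a codec name such as 'utf-8'.
    """
    # The 'replace' error handler substitutes U+FFFD for undecodable bytes.
    return raw.decode(encoding, "replace")
```

With this approach a wrong guess at the encoding degrades the printed output but does not abort the loop that reads from the subprocess.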

OK, I've split them, but I found another minor bug. The kernel in Python 2 has a sys.stdin.encoding of None, so @minrk's getdefaultencoding() function fell back to using the locale. In Python 3, sys.stdin.encoding is 'ascii', although the system is using UTF-8.

On the principle that most systems can now handle a superset of ASCII (either UTF-8 or a Windows code page), I've made getdefaultencoding() try the locale encoding if sys.stdin.encoding is None or 'ascii'. I think the only way this could make anything worse is if a subprocess returns ASCII output on a system where the locale encoding is not ASCII-compatible (e.g. UTF-16).
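
The fallback described here can be sketched roughly as follows. This is not the actual IPython function: the name `pick_encoding` and its `locale_lookup` parameter are assumptions introduced so the logic can be exercised without depending on the real `sys.stdin`.

```python
import locale
import sys


def pick_encoding(stdin_encoding, locale_lookup=locale.getpreferredencoding):
    """Return a likely encoding for text exchanged with a subprocess.

    Hypothetical sketch: `stdin_encoding` stands in for sys.stdin.encoding,
    and `locale_lookup` is any zero-argument callable returning a codec name.
    """
    enc = stdin_encoding
    if not enc or enc == "ascii":
        # Missing or plain-ASCII stdin encoding: ask the locale instead.
        try:
            enc = locale_lookup()
        except Exception:
            # Be conservative if the locale lookup fails; keep what we had.
            pass
    # Final fallback when nothing usable was found.
    return enc or sys.getdefaultencoding()
```

For example, `pick_encoding("ascii", lambda: "UTF-8")` returns `"UTF-8"`, while `pick_encoding("latin-1", lambda: "UTF-8")` leaves the explicit stdin encoding untouched.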


I'll merge this soon unless anyone objects.


seems fine to me.

Do you know why Python 3 incorrectly reports sys.stdin.encoding as ascii? Or is that the new replacement for None? If the terminal really is ASCII, we might run into problems, but I wouldn't worry too much about that.


I think that the object which is now used for sys.stdin (io.TextIOWrapper) has to have a non-None encoding attribute, so ascii is the default if it's not told anything else.

I don't think it should cause problems, because:

  1. If the terminal is ASCII, the default locale should presumably indicate ascii as well.
  2. If the terminal is ASCII and the default locale indicates another encoding, it will most likely be either UTF-8 or an ASCII-compatible code page. So any characters we get from the terminal should be correctly decoded anyway. The only situation in which it could cause problems is if the locale incorrectly indicates a non-ASCII-compatible encoding, such as UTF-16.
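
The compatibility argument in the two points above can be checked with a small sketch. The function name `decodes_like_ascii` and the sample bytes are made up for illustration; the codecs are standard ones shipped with CPython.

```python
ASCII_BYTES = b"hello\n"


def decodes_like_ascii(encoding, data=ASCII_BYTES):
    """Return True if `data` decodes under `encoding` to the same text as ASCII."""
    try:
        return data.decode(encoding) == data.decode("ascii")
    except UnicodeDecodeError:
        return False


# ASCII-compatible encodings agree with ASCII on pure-ASCII bytes;
# UTF-16 pairs the bytes into unrelated code units, so it does not.
results = {
    enc: decodes_like_ascii(enc)
    for enc in ("ascii", "utf-8", "cp1252", "utf-16-le")
}
```

Here `results` maps the first three encodings to `True` and `"utf-16-le"` to `False`, which matches the one failure case named in the discussion.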

Sounds good.

@takluyver takluyver merged commit 5590396 into ipython:master

Thanks, Min. Merged.

@ellisonbg ellisonbg referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
Commits on Sep 17, 2011: three commits by @takluyver, listed above.
32 IPython/external/pexpect/
@@ -229,16 +229,16 @@ def print_ticks(d):
while 1:
index = child.expect (patterns)
- if type(child.after) in types.StringTypes:
+ if isinstance(child.after, basestring):
child_result_list.append(child.before + child.after)
else: # child.after may have been a TIMEOUT or EOF, so don't cat those.
- if type(responses[index]) in types.StringTypes:
+ if isinstance(responses[index], basestring):
- elif type(responses[index]) is types.FunctionType:
+ elif isinstance(responses[index], types.FunctionType):
callback_result = responses[index](locals())
- if type(callback_result) in types.StringTypes:
+ if isinstance(callback_result, basestring):
elif callback_result:
@@ -399,7 +399,7 @@ def __init__(self, command, args=[], timeout=30, maxread=2000, searchwindowsize=
self.logfile_read = None # input from child (read_nonblocking)
self.logfile_send = None # output to send (send, sendline)
self.maxread = maxread # max bytes to read at one time into buffer
- self.buffer = '' # This is the read buffer. See maxread.
+ self.buffer = b'' # This is the read buffer. See maxread.
self.searchwindowsize = searchwindowsize # Anything before searchwindowsize point is preserved, but not searched.
# Most Linux machines don't like delaybeforesend to be below 0.03 (30 ms).
self.delaybeforesend = 0.05 # Sets sleep time used just before sending data to child. Time in seconds.
@@ -828,7 +828,7 @@ def read_nonblocking (self, size = 1, timeout = -1):
except OSError, e: # Linux does this
self.flag_eof = True
raise EOF ('End Of File (EOF) in read_nonblocking(). Exception style platform.')
- if s == '': # BSD style
+ if s == b'': # BSD style
self.flag_eof = True
raise EOF ('End Of File (EOF) in read_nonblocking(). Empty string style platform.')
@@ -936,12 +936,14 @@ def writelines (self, sequence): # File-like object.
for s in sequence:
self.write (s)
- def send(self, s):
+ def send(self, s, encoding='utf-8'):
"""This sends a string to the child process. This returns the number of
bytes written. If a log file was set then the data is also written to
the log. """
+ if isinstance(s, unicode):
+ s = s.encode(encoding)
if self.logfile is not None:
self.logfile.write (s)
@@ -1208,7 +1210,7 @@ def compile_pattern_list(self, patterns):
if patterns is None:
return []
- if type(patterns) is not types.ListType:
+ if not isinstance(patterns, list):
patterns = [patterns]
compile_flags = re.DOTALL # Allow dot to match \n
@@ -1216,7 +1218,7 @@ def compile_pattern_list(self, patterns):
compile_flags = compile_flags | re.IGNORECASE
compiled_pattern_list = []
for p in patterns:
- if type(p) in types.StringTypes:
+ if isinstance(p, basestring):
compiled_pattern_list.append(re.compile(p, compile_flags))
elif p is EOF:
@@ -1337,7 +1339,7 @@ def expect_exact(self, pattern_list, timeout = -1, searchwindowsize = -1):
This method is also useful when you don't want to have to worry about
escaping regular expression characters that you want to match."""
- if type(pattern_list) in types.StringTypes or pattern_list in (TIMEOUT, EOF):
+ if isinstance(pattern_list, basestring) or pattern_list in (TIMEOUT, EOF):
pattern_list = [pattern_list]
return self.expect_loop(searcher_string(pattern_list), timeout, searchwindowsize)
@@ -1371,7 +1373,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
self.match_index = index
return self.match_index
# No match at this point
- if timeout < 0 and timeout is not None:
+ if timeout is not None and timeout < 0:
raise TIMEOUT ('Timeout exceeded in expect_any().')
# Still have time left, so read more data
c = self.read_nonblocking (self.maxread, timeout)
@@ -1381,7 +1383,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
if timeout is not None:
timeout = end_time - time.time()
except EOF, e:
- self.buffer = ''
+ self.buffer = b''
self.before = incoming
self.after = EOF
index = searcher.eof_index
@@ -1484,7 +1486,7 @@ def sigwinch_passthrough (sig, data):
# Flush the buffer.
self.stdout.write (self.buffer)
- self.buffer = ''
+ self.buffer = b''
mode = tty.tcgetattr(self.STDIN_FILENO)
@@ -1700,7 +1702,7 @@ def __init__(self, patterns):
self.eof_index = -1
self.timeout_index = -1
self._searches = []
- for n, s in zip(range(len(patterns)), patterns):
+ for n, s in enumerate(patterns):
if s is EOF:
self.eof_index = n
@@ -1721,7 +1723,7 @@ def __str__(self):
if self.timeout_index >= 0:
ss.append ((self.timeout_index,' %d: TIMEOUT' % self.timeout_index))
- ss = zip(*ss)[1]
+ ss = [a[1] for a in ss]
return '\n'.join(ss)
def search(self, buffer, freshlen, searchwindowsize=None):
16 IPython/utils/
@@ -19,17 +19,12 @@
import subprocess as sp
import sys
-# Third-party
-# We ship our own copy of pexpect (it's a single file) to minimize dependencies
-# for users, but it's only used if we don't find the system copy.
-try:
-    import pexpect
-except ImportError:
-    from IPython.external import pexpect
+from IPython.external import pexpect
# Our own
from .autoattr import auto_attr
from ._process_common import getoutput
+from IPython.utils import text
# Function definitions
@@ -132,6 +127,9 @@ def system(self, cmd):
int : child's exitstatus
+ # Get likely encoding for the output.
+ enc = text.getdefaultencoding()
pcmd = self._make_cmd(cmd)
# Patterns to match on the output, for pexpect. We read input and
# allow either a short timeout or EOF
@@ -155,7 +153,7 @@ def system(self, cmd):
# res is the index of the pattern that caused the match, so we
# know whether we've finished (if we matched EOF) or not
res_idx = child.expect_list(patterns, self.read_timeout)
- print(child.before[out_size:], end='')
+ print(child.before[out_size:].decode(enc, 'replace'), end='')
if res_idx==EOF_index:
@@ -171,7 +169,7 @@ def system(self, cmd):
out_size = len(child.before)
child.expect_list(patterns, self.terminate_timeout)
- print(child.before[out_size:], end='')
+ print(child.before[out_size:].decode(enc, 'replace'), end='')
except KeyboardInterrupt:
# Impatient users tend to type it multiple times
2  IPython/utils/
@@ -47,7 +47,7 @@ def getdefaultencoding():
and usually ASCII.
enc = sys.stdin.encoding
- if not enc:
+ if not enc or enc=='ascii':
# There are reports of getpreferredencoding raising errors
# in some cases, which may well be fixed, but let's be conservative here.