
pexpect & Python 3 #798

Merged
merged 3 commits into from over 2 years ago

3 participants

Thomas Kluyver Fernando Perez Min RK
Thomas Kluyver
Collaborator

Changes to pexpect so running external processes from the Qt console works in Python 3.

Fernando Perez
Owner

These look fine to merge, but I'd suggest you split the changes into two separate commits. Since one entails changes to an external dependency, I think it's best to always have those available in nice, isolated commits. That makes it easier to send the patch upstream, reapply it if we update to a new upstream version that doesn't have the changes, etc.

So my vote is: rebase by splitting into two commits, one for each file, then merge away.

Thanks!

added some commits September 17, 2011
Thomas Kluyver Changes to pexpect so it does what we need after conversion to Python 3. a69359f
Thomas Kluyver Decode output from unix subprocesses using platform default encoding. 2cda1d8
Thomas Kluyver Try locale encoding if stdin encoding is ascii.
Starting the Qt console on Python 3, the kernel's stdin ends up with a .encoding of 'ascii' (whereas on Python 2 it is None). Since most platforms can handle a superset of ASCII, we may as well try locale.getpreferredencoding() in this case.
c06689d
Thomas Kluyver
Collaborator

OK, I've split them, but I found another minor bug. The kernel in Python 2 has a sys.stdin.encoding of None, so @minrk's getdefaultencoding() function fell back to using the locale. In Python 3, sys.stdin.encoding is 'ascii', although the system is using UTF-8.

On the principle that most systems now can handle a superset of ascii (either utf-8 or a Windows code page), I've made getdefaultencoding() try the locale encoding if sys.stdin.encoding is None or ascii. I think the only way this could make anything worse is if a subprocess is returning ascii output on a system where the locale encoding is not ascii compatible (e.g. UTF-16).
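
A minimal sketch of the fallback described above, written for Python 3. The function name `guess_encoding` and its `stdin_encoding` parameter are hypothetical, introduced only so the logic can be exercised without touching the real `sys.stdin`; the actual change is the one-line edit to `getdefaultencoding()` in `IPython/utils/text.py`.

```python
import locale
import sys


def guess_encoding(stdin_encoding):
    """Return a likely text encoding, trusting stdin's unless it is missing or 'ascii'."""
    enc = stdin_encoding
    if not enc or enc == "ascii":
        try:
            # The locale usually names a superset of ASCII (UTF-8 or a code page).
            enc = locale.getpreferredencoding()
        except Exception:
            # Be conservative: keep whatever value we already had.
            pass
    # Last resort if both sources came up empty.
    return enc or sys.getdefaultencoding()
```

Calling `guess_encoding(sys.stdin.encoding)` would mirror the behaviour discussed here; a value such as `'utf-8'` is returned unchanged, while `None` or `'ascii'` defers to the locale.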

Thomas Kluyver
Collaborator

I'll merge this soon unless anyone objects.

Min RK
Owner

seems fine to me.

Do you know why Python 3 incorrectly marks sys.stdin.encoding as ascii? Or is that the new replacement for None? The concern is that if the terminal really is ASCII, we might run into problems. But I wouldn't worry too much about that.

Thomas Kluyver
Collaborator

I think that the object which is now used for sys.stdin (io.TextIOWrapper) has to have a non-None encoding attribute, so ascii is the default if it's not told anything else.

I don't think it should cause problems, because:

  1. If the terminal is ASCII, the default locale should presumably indicate ascii as well.
  2. If the terminal is ASCII and the default locale indicates another encoding, it will most likely be either UTF-8 or an ascii compatible code page. So any characters we get from the terminal should be correctly decoded anyway. The only situation in which it could cause problems is if the locale incorrectly indicates a non-ascii-compatible encoding, such as UTF-16.
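
The reasoning in the list above can be checked with a small Python 3 sketch. The sample bytes and the choice of codecs (`utf-8`, `cp1252`, `latin-1`, `utf-16-le`) are illustrative assumptions, not taken from the pull request.

```python
ascii_bytes = b"hello world\n"
expected = ascii_bytes.decode("ascii")

# ASCII-compatible encodings decode pure-ASCII bytes to the same text.
compatible = {
    codec: ascii_bytes.decode(codec)
    for codec in ("utf-8", "cp1252", "latin-1")
}

# UTF-16 is not ASCII-compatible: every pair of bytes becomes one code unit,
# so the same input decodes to unrelated characters.
mangled = ascii_bytes.decode("utf-16-le", "replace")
```

Every value in `compatible` equals `expected`, whereas `mangled` is half as long and shares no characters with the original text.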
Min RK
Owner

Sounds good.

Thomas Kluyver takluyver merged commit 5590396 into from September 29, 2011
Thomas Kluyver takluyver closed this September 29, 2011
Thomas Kluyver
Collaborator

Thanks, Min. Merged.

Brian E. Granger ellisonbg referenced this pull request from a commit January 10, 2012
Commit has since been removed from the repository and is no longer available.

Showing 3 unique commits by 1 author.

32  IPython/external/pexpect/_pexpect.py
@@ -229,16 +229,16 @@ def print_ticks(d):
     while 1:
         try:
             index = child.expect (patterns)
-            if type(child.after) in types.StringTypes:
+            if isinstance(child.after, basestring):
                 child_result_list.append(child.before + child.after)
             else: # child.after may have been a TIMEOUT or EOF, so don't cat those.
                 child_result_list.append(child.before)
-            if type(responses[index]) in types.StringTypes:
+            if isinstance(responses[index], basestring):
                 child.send(responses[index])
-            elif type(responses[index]) is types.FunctionType:
+            elif isinstance(responses[index], types.FunctionType):
                 callback_result = responses[index](locals())
                 sys.stdout.flush()
-                if type(callback_result) in types.StringTypes:
+                if isinstance(callback_result, basestring):
                     child.send(callback_result)
                 elif callback_result:
                     break
@@ -399,7 +399,7 @@ def __init__(self, command, args=[], timeout=30, maxread=2000, searchwindowsize=
         self.logfile_read = None # input from child (read_nonblocking)
         self.logfile_send = None # output to send (send, sendline)
         self.maxread = maxread # max bytes to read at one time into buffer
-        self.buffer = '' # This is the read buffer. See maxread.
+        self.buffer = b'' # This is the read buffer. See maxread.
         self.searchwindowsize = searchwindowsize # Anything before searchwindowsize point is preserved, but not searched.
         # Most Linux machines don't like delaybeforesend to be below 0.03 (30 ms).
         self.delaybeforesend = 0.05 # Sets sleep time used just before sending data to child. Time in seconds.
@@ -828,7 +828,7 @@ def read_nonblocking (self, size = 1, timeout = -1):
             except OSError, e: # Linux does this
                 self.flag_eof = True
                 raise EOF ('End Of File (EOF) in read_nonblocking(). Exception style platform.')
-            if s == '': # BSD style
+            if s == b'': # BSD style
                 self.flag_eof = True
                 raise EOF ('End Of File (EOF) in read_nonblocking(). Empty string style platform.')
 
@@ -936,12 +936,14 @@ def writelines (self, sequence):   # File-like object.
         for s in sequence:
             self.write (s)
 
-    def send(self, s):
+    def send(self, s, encoding='utf-8'):
 
         """This sends a string to the child process. This returns the number of
         bytes written. If a log file was set then the data is also written to
         the log. """
 
+        if isinstance(s, unicode):
+            s = s.encode(encoding)
         time.sleep(self.delaybeforesend)
         if self.logfile is not None:
             self.logfile.write (s)
@@ -1208,7 +1210,7 @@ def compile_pattern_list(self, patterns):
 
         if patterns is None:
             return []
-        if type(patterns) is not types.ListType:
+        if not isinstance(patterns, list):
             patterns = [patterns]
 
         compile_flags = re.DOTALL # Allow dot to match \n
@@ -1216,7 +1218,7 @@ def compile_pattern_list(self, patterns):
             compile_flags = compile_flags | re.IGNORECASE
         compiled_pattern_list = []
         for p in patterns:
-            if type(p) in types.StringTypes:
+            if isinstance(p, basestring):
                 compiled_pattern_list.append(re.compile(p, compile_flags))
             elif p is EOF:
                 compiled_pattern_list.append(EOF)
@@ -1337,7 +1339,7 @@ def expect_exact(self, pattern_list, timeout = -1, searchwindowsize = -1):
         This method is also useful when you don't want to have to worry about
         escaping regular expression characters that you want to match."""
 
-        if type(pattern_list) in types.StringTypes or pattern_list in (TIMEOUT, EOF):
+        if isinstance(pattern_list, basestring) or pattern_list in (TIMEOUT, EOF):
             pattern_list = [pattern_list]
         return self.expect_loop(searcher_string(pattern_list), timeout, searchwindowsize)
 
@@ -1371,7 +1373,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
                     self.match_index = index
                     return self.match_index
                 # No match at this point
-                if timeout < 0 and timeout is not None:
+                if timeout is not None and timeout < 0:
                     raise TIMEOUT ('Timeout exceeded in expect_any().')
                 # Still have time left, so read more data
                 c = self.read_nonblocking (self.maxread, timeout)
@@ -1381,7 +1383,7 @@ def expect_loop(self, searcher, timeout = -1, searchwindowsize = -1):
                 if timeout is not None:
                     timeout = end_time - time.time()
         except EOF, e:
-            self.buffer = ''
+            self.buffer = b''
             self.before = incoming
             self.after = EOF
             index = searcher.eof_index
@@ -1484,7 +1486,7 @@ def sigwinch_passthrough (sig, data):
         # Flush the buffer.
         self.stdout.write (self.buffer)
         self.stdout.flush()
-        self.buffer = ''
+        self.buffer = b''
         mode = tty.tcgetattr(self.STDIN_FILENO)
         tty.setraw(self.STDIN_FILENO)
         try:
@@ -1700,7 +1702,7 @@ def __init__(self, patterns):
         self.eof_index = -1
         self.timeout_index = -1
         self._searches = []
-        for n, s in zip(range(len(patterns)), patterns):
+        for n, s in enumerate(patterns):
             if s is EOF:
                 self.eof_index = n
                 continue
@@ -1721,7 +1723,7 @@ def __str__(self):
         if self.timeout_index >= 0:
             ss.append ((self.timeout_index,'    %d: TIMEOUT' % self.timeout_index))
         ss.sort()
-        ss = zip(*ss)[1]
+        ss = [a[1] for a in ss]
         return '\n'.join(ss)
 
     def search(self, buffer, freshlen, searchwindowsize=None):
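
The `send()` hunk above encodes text before it is written to the child process. A rough Python 3 equivalent, where the diff's `unicode` corresponds to `str`, might look like the following. The helper name `to_bytes` is hypothetical and is not part of pexpect.

```python
def to_bytes(s, encoding="utf-8"):
    """Encode text to bytes; pass bytes through unchanged."""
    if isinstance(s, str):
        return s.encode(encoding)
    return s


# Text is encoded, while data that is already bytes is left alone.
encoded = to_bytes("caf\u00e9")
untouched = to_bytes(b"raw")
```

Keeping the conversion at the boundary lets the read buffer stay as `bytes` throughout, which matches the `b''` changes elsewhere in this file.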
16  IPython/utils/_process_posix.py
@@ -19,17 +19,12 @@
 import subprocess as sp
 import sys
 
-# Third-party
-# We ship our own copy of pexpect (it's a single file) to minimize dependencies
-# for users, but it's only used if we don't find the system copy.
-try:
-    import pexpect
-except ImportError:
-    from IPython.external import pexpect
+from IPython.external import pexpect
 
 # Our own
 from .autoattr import auto_attr
 from ._process_common import getoutput
+from IPython.utils import text
 
 #-----------------------------------------------------------------------------
 # Function definitions
@@ -132,6 +127,9 @@ def system(self, cmd):
         -------
         int : child's exitstatus
         """
+        # Get likely encoding for the output.
+        enc = text.getdefaultencoding()
+        
         pcmd = self._make_cmd(cmd)
         # Patterns to match on the output, for pexpect.  We read input and
         # allow either a short timeout or EOF
@@ -155,7 +153,7 @@ def system(self, cmd):
                 # res is the index of the pattern that caused the match, so we
                 # know whether we've finished (if we matched EOF) or not
                 res_idx = child.expect_list(patterns, self.read_timeout)
-                print(child.before[out_size:], end='')
+                print(child.before[out_size:].decode(enc, 'replace'), end='')
                 flush()
                 if res_idx==EOF_index:
                     break
@@ -171,7 +169,7 @@ def system(self, cmd):
             try:
                 out_size = len(child.before)
                 child.expect_list(patterns, self.terminate_timeout)
-                print(child.before[out_size:], end='')
+                print(child.before[out_size:].decode(enc, 'replace'), end='')
                 sys.stdout.flush()
             except KeyboardInterrupt:
                 # Impatient users tend to type it multiple times
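
The two `print` changes above decode the child's raw output with the `'replace'` error handler, so bytes that are invalid in the chosen encoding become U+FFFD instead of raising `UnicodeDecodeError`. A small Python 3 illustration follows, with made-up sample bytes and `utf-8` assumed as the encoding.

```python
raw = b"caf\xc3\xa9 \xff\n"  # valid UTF-8 for "café", then a stray 0xFF byte

# 'replace' substitutes U+FFFD for undecodable bytes rather than raising.
text = raw.decode("utf-8", "replace")

# Strict decoding of the same bytes fails.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

This matters for output that is sliced as bytes, as `child.before[out_size:]` is, because a slice boundary can fall inside a multi-byte character.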
2  IPython/utils/text.py
@@ -47,7 +47,7 @@ def getdefaultencoding():
     and usually ASCII.
     """
     enc = sys.stdin.encoding
-    if not enc:
+    if not enc or enc=='ascii':
         try:
             # There are reports of getpreferredencoding raising errors
             # in some cases, which may well be fixed, but let's be conservative here.