[Issue #558] Coerce compiled regex patterns type according to spawn encoding #559

Closed
wants to merge 5 commits into from

Conversation

EmersonPrado

This change extends the type coercion already applied to exact (string) matches so that it also covers compiled regex patterns. In short: if the spawn encoding is None, the compiled regex is converted to one with a bytes pattern; otherwise, it is converted to one with a non-bytes (text) pattern.
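
To illustrate the idea, here is a minimal sketch of that coercion for Python 3. It is not the code from this PR: the helper name `coerce_compiled_pattern` and the `fallback_encoding` parameter (used to encode text patterns when the spawn encoding is None) are assumptions made for the example.

```python
import re

def coerce_compiled_pattern(compiled, encoding, fallback_encoding="utf-8"):
    """Return a compiled regex whose pattern type matches the spawn encoding.

    Hypothetical helper, Python 3 only. `fallback_encoding` is an assumption:
    it is the codec used to turn a text pattern into bytes when the spawn
    has no encoding.
    """
    pattern = compiled.pattern
    flags = compiled.flags
    if encoding is None:
        # The spawn buffer holds bytes, so the pattern must be bytes too.
        if isinstance(pattern, bytes):
            return compiled
        # re.UNICODE is implied for str patterns but rejected for bytes ones.
        bytes_flags = flags & ~int(re.UNICODE)
        return re.compile(pattern.encode(fallback_encoding), bytes_flags)
    # The spawn buffer holds text, so the pattern must be str.
    if isinstance(pattern, bytes):
        # re.LOCALE is only valid for bytes patterns.
        str_flags = flags & ~int(re.LOCALE)
        return re.compile(pattern.decode(encoding), str_flags)
    return compiled

# Usage: a text pattern used with a bytes spawn, and the reverse case.
as_bytes = coerce_compiled_pattern(re.compile("ã"), None)
as_text = coerce_compiled_pattern(re.compile(b"a"), "utf-8")
print(type(as_bytes.pattern).__name__, type(as_text.pattern).__name__)
```

The flag masking matters because `re.compile` raises `ValueError` when `re.UNICODE` is passed with a bytes pattern or `re.LOCALE` with a str pattern.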

@EmersonPrado EmersonPrado changed the title [Issue 558] Coerce compiled regex patterns type according to spawn encoding [Issue #558] Coerce compiled regex patterns type according to spawn encoding Mar 4, 2019
@EmersonPrado EmersonPrado force-pushed the coerce-regex-bytes branch 2 times, most recently from 068424c to 5887e61 Compare March 4, 2019 19:34
@EmersonPrado
Author

Some tests of my own:

Test code

import re
from lib.pexpect.pexpect import spawn, EOF, TIMEOUT

# Try each (spawned string, spawn encoding) pair against each pattern type.
for spawn_string, encoding in [("ã", None), ("ã", 'ascii'), ("a", 'ascii'), ("ã", 'utf-8')]:
  for pattern in ['ã', b'a', r'ã']:
    try:
      print('Spawn string: "{}"\tSpawn encoding: "{}"\tMatch pattern: "{}"'
            .format(spawn_string, encoding, pattern))
      spawn_object = spawn('echo "{}"'.format(spawn_string), encoding=encoding)
    except UnicodeEncodeError:
      # The command string cannot be encoded with the spawn encoding.
      print('Spawn string format and encoding mismatch')
      continue
    index = -1
    # Keep expecting until EOF (index 1) or TIMEOUT (index 2) is reached.
    while index < 1:
      try:
        index = spawn_object.expect([re.compile(pattern), EOF, TIMEOUT])
      except UnicodeDecodeError:
        print('Invalid pattern')
        break
      if index == 0:
        print('Match')

Before fix

Python 3

Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Traceback (most recent call last):
  File "<stdin>", line 12, in <module>
  File "<Pexpect path>/pexpect/spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "<Pexpect path>/pexpect/spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "<Pexpect path>/pexpect/expect.py", line 103, in expect_loop
    idx = self.new_data(incoming)
  File "<Pexpect path>/pexpect/expect.py", line 29, in new_data
    index = searcher.search(window, len(data))
  File "<Pexpect path>/pexpect/expect.py", line 293, in search
    match = s.search(buffer, searchstart)
TypeError: cannot use a string pattern on a bytes-like object
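
The TypeError at the bottom of this traceback is raised by the `re` module itself, so it can be reproduced without spawning anything. The snippet below is a minimal sketch, assuming the buffer of a spawn with encoding None holds UTF-8 bytes; the variable names are illustrative only.

```python
import re

# Stand-in for the bytes buffer of a spawn created with encoding=None.
buffer = "ã".encode("utf-8")

# A text (str) pattern cannot be searched against a bytes buffer.
str_pattern = re.compile("ã")
try:
    str_pattern.search(buffer)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# A bytes pattern built from the same text matches as expected.
bytes_pattern = re.compile("ã".encode("utf-8"))
print(bytes_pattern.search(buffer) is not None)
```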

Python 2

Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "a"
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Invalid pattern
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "a"
Invalid pattern
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Invalid pattern
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "a"
Match
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"  # Not nice
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "a"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"  # Not nice

After fix

Python 3

Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "b'a'"
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string format and encoding mismatch
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "b'a'"
Spawn string format and encoding mismatch
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string format and encoding mismatch
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "b'a'"
Match
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "b'a'"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"
Match

Python 2

Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "a"
Spawn string: "ã"	Spawn encoding: "None"	Match pattern: "ã"
Match
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Invalid pattern
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "a"
Invalid pattern
Spawn string: "ã"	Spawn encoding: "ascii"	Match pattern: "ã"
Invalid pattern
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "a"
Match
Spawn string: "a"	Spawn encoding: "ascii"	Match pattern: "ã"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"  # Nice
Match
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "a"
Spawn string: "ã"	Spawn encoding: "utf-8"	Match pattern: "ã"  # Nice
Match

@EmersonPrado EmersonPrado force-pushed the coerce-regex-bytes branch 2 times, most recently from 0e53f6f to ec916dd Compare March 5, 2019 17:24
@EmersonPrado
Author

I messed up this thread too much. Sorry about that. I'll start over with a fresh branch and PR.

@EmersonPrado EmersonPrado deleted the coerce-regex-bytes branch March 5, 2019 17:30
@EmersonPrado EmersonPrado restored the coerce-regex-bytes branch March 5, 2019 17:31