-
-
Notifications
You must be signed in to change notification settings - Fork 31.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_builtin.PtyTests fail on non-ASCII characters if the readline module is loaded #58094
Comments
I've recently come across a strange failure in the tests for the input() $ ./python -E -m test -v test_readline test_builtin
Traceback (most recent call last):
File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1079, in test_input_tty_non_ascii
self.check_input_tty("prompté", b"quux\xe9", "utf-8")
File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1070, in check_input_tty
self.assertEqual(input_result, expected)
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? +
Traceback (most recent call last):
File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1083, in test_input_tty_non_ascii_unicode_errors
self.check_input_tty("prompté", b"quux\xe9", "ascii")
File "/home/nadeem/src/cpython/def/Lib/test/test_builtin.py", line 1070, in check_input_tty
self.assertEqual(input_result, expected)
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? + The failure only manifests itself if the readline module is loaded before The problem seems to be that readline assumes that its input should use This problem doesn't crop up if readline is *not* loaded, because in that I have been able to fix the test failures with the attached patch, which |
Here's another patch that ensures the test always exercises the GNU Ideally we'd want to test both code paths, but I'm not sure how to |
I had exactly the same error on 3.3b1 when running the test suite with randomized order. |
This problem still exists on 3.4. |
New changeset cf0f450b3299 by Nadeem Vawda in branch '3.2': |
New changeset 5c8732049dd5 by Nadeem Vawda in branch '3.3': |
New changeset 9774721bfc32 by Nadeem Vawda in branch 'default': |
New changeset 12223782031f by Nadeem Vawda in branch '2.7': |
This looks dangerous to me. Are you sure readline's behavior doesn't change because of this? |
You're right; it breaks backspacing over multibyte characters. I should |
New changeset e6952acd5a55 by Nadeem Vawda in branch '3.2': |
New changeset 5c7e884b205a by Nadeem Vawda in branch '3.3': |
New changeset 8b8c6abda7e8 by Nadeem Vawda in branch 'default': |
New changeset 5bf91dfb1e34 by Nadeem Vawda in branch '2.7': |
Issue bpo-18230 is another test_builtin failure related to tty tests. |
This happens even when run test_builtin twice.
or
|
I don’t think this affects Python 2. The failing tests were added in revision 421c8e291221, bpo-13342, for 3.2+ only. They invole input() doing text decoding. AFAIK Python 2’s equivalent, raw_input(), does not do text decoding. I suspect we can’t really change how Readline handles text encoding errors, which seems to be what Nadeem was trying to do. I suggest to just fix the tests without changing Readline. As far as I know there is no way to un-register the Readline module once it has been loaded. A quick and dirty workaround might be to skip the test(s) if the Readline has been loaded ("readline" in sys.modules). But a better fix would probably be to run the test in a subprocess, where we can start a new interpreter from scratch and control whether Readline is loaded. Looking closer at the tests, they mention invoking Gnu Readline. But the associated bug fix is in the wrapper code around PyOS_Readline(), which may call Gnu Readline if it is loaded, or may call a simpler internal routine otherwise. So ideally the tests should be repeated with Readline unloaded and loaded. Also, the comment for test_input_tty_non_ascii() implies it is testing UTF-8 encoding. But the data is not valid UTF-8 so it ends up testing the error handling. I think that test should use valid UTF-8 input. |
Here is a possible patch which:
With my patch applied, there are a couple of prompts mixed up in the test log output via stderr, due to bpo-1927. It is not a major problem, but perhaps we should work on fixing that bug first. |
Since people keep coming upon this bug, perhaps we should inhibit push my fix without fixing that other prompt bug (now a feature change I think). Probably have to capture stderr to avoid it coming out in the test output. |
input-readline.v2.patch sets stderr=DEVNULL so that the prompt does not come out in the test log. A disadvantage of this is that if there is a failure, any error messages, stack trace, etc is also lost. To fix this properly, we would probably have to capture stderr using a generalized communicate() loop, like I have done at the bottom of <https://hg.python.org/cpython/annotate/v3.6.0b4/Lib/test/test_readline.py#l231\> (see also bpo-1260171). |
Hi Xavier, I was about to push input-readline.v2.patch, but I thought it might be good to check with you first if this causes problems with Android, based on my experience with bpo-28997. With the patch applied, test.test_builtin.PtyTests.test_input_tty_non_ascii() will call the input() function twice, with and without Readline enabled. It inputs non-ASCII data in into a pseudoterminal, and expects to see it in the function return value. Perhaps we may need to skip the Readline half of the test on Android. |
Thanks for waiting for a run of the patch on Android. if readline and not is_android:
tests.append(True) |
V3 of my patch skips the Readline tests in cases involving non-ASCII bytes when the locale seen by Readline would be ASCII. Readline may translate the non-ASCII bytes to escape sequences, and treat them as special Meta (Alt) key combinations. This behaviour depends on Readline configuration (“set convert-meta off” in /etc/inputrc in my case). It also includes a potential workaround for Android, depending on the resolution of bpo-28997. |
With input-readline.v3.patch, test_builtin runs with success on Android api 21. With pep538_coerce_legacy_c_locale_v4.diff that implements PEP-538 in bpo-28180, and with input-readline.v3.patch modified to have 'readline_encoding = locale.getpreferredencoding()' even when 'is_android' is True, test_builtin runs with success. This means that no specific Android handling would be needed if PEP-538 were to be adopted. The new input-readline.v4.patch is a slight improvement over the previous patch and sets readline_encoding to 'UTF-8' on Android when test_builtin is run with the environment variable LANG set to 'en_US.UTF-8' and in that case the test exercises all the code paths including those with the readline module. This is because locale.getdefaultlocale() returns ('en_US', 'UTF-8') on Android in that case and because both getdefaultlocale() and readline scan the environment to check for the locale. This new patch is only useful if tests on Android are expected to be run also with one of the locale environment variables set to UTF-8 (and PEP-538 is rejected). I have left some comments on Rietveld. |
bpo-31703 has been marked as a duplicate of this bug. |
As well as other 5 bugs. |
This old issue still is not fixed. Martin, Xavier, could you open a PR please? |
As stated in msg285810, bpo-28180 removed the need for an Android workaround. After checking this point on the Android API 24 emulator with the current master branch, I have opened PR 7133 which is Martin's input-readline.v3.patch with the Android workaround removed. The patch is commited with Martin as its author. |
All checks have passed on PR 7133, even on macOs, and the PR is ready to be reviewed :-) |
On 3.11 I don't see a problem on windows, but on mac test_input_tty from test_builtin hangs when test_readline runs first, while test_builtin passes on its own. |
See also bpo-44887. |
On Fedora 35, I still reproduce the initial issue on the main branch of Python: $ ./python -E -m test -v test_readline test_builtin
(...) ====================================================================== Traceback (most recent call last):
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2095, in test_input_tty_non_ascii
self.check_input_tty("prompté", b"quux\xe9", "utf-8")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2086, in check_input_tty
self.assertEqual(input_result, expected)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? + ====================================================================== Traceback (most recent call last):
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2099, in test_input_tty_non_ascii_unicode_errors
self.check_input_tty("prompté", b"quux\xe9", "ascii")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2086, in check_input_tty
self.assertEqual(input_result, expected)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? +
(...) Fedora 35 uses readline 8.1: $ ldd $(./python -c 'import readline; print(readline.__file__)')
...
libreadline.so.8 => /lib64/libreadline.so.8 (0x00007fd00e553000)
...
$ rpm -qf /lib64/libreadline.so.8
readline-8.1-3.fc35.x86_64
$ make pythoninfo|grep readline
readline._READLINE_LIBRARY_VERSION: 8.1
readline._READLINE_RUNTIME_VERSION: 0x801
readline._READLINE_VERSION: 0x801 |
Oh, the test_builtin.test_input_tty_non_ascii() fails just if test_readline is loaded previously: $ ./python -E -m test -m test.test_builtin.PtyTests.test_input_tty_non_ascii -v test_readline test_builtin
== CPython 3.11.0a4+ (heads/main:7f4b69b9076, Jan 17 2022, 12:28:15) [GCC 11.2.1 20211203 (Red Hat 11.2.1-7)]
== Linux-5.15.12-200.fc35.x86_64-x86_64-with-glibc2.34 little-endian
== cwd: /home/vstinner/python/main/build/test_python_49429æ
== CPU count: 8
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 0.48 Run tests sequentially
0:00:00 load avg: 0.48 [1/2] test_readline Ran 0 tests in 0.000s OK ====================================================================== Traceback (most recent call last):
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2095, in test_input_tty_non_ascii
self.check_input_tty("prompté", b"quux\xe9", "utf-8")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2086, in check_input_tty
self.assertEqual(input_result, expected)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? + Ran 1 test in 0.013s FAILED (failures=1) == Tests result: FAILURE == 1 test failed: 1 test run no tests: Total duration: 559 ms In just, just importing readline is enough to make the test fails: $ git diff Lib/test/test_builtin.py
diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py
index 6dc4fa55502..20d3d33d9fb 100644
--- a/Lib/test/test_builtin.py
+++ b/Lib/test/test_builtin.py
@@ -1,3 +1,5 @@
+import readline
+
# Python test set -- built-in functions
import ast $ ./python -E -m test -m test.test_builtin.PtyTests.test_input_tty_non_ascii -v test_builtin
(...) ====================================================================== Traceback (most recent call last):
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2097, in test_input_tty_non_ascii
self.check_input_tty("prompté", b"quux\xe9", "utf-8")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vstinner/python/main/Lib/test/test_builtin.py", line 2088, in check_input_tty
self.assertEqual(input_result, expected)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'quux' != 'quux\udce9'
- quux
+ quux\udce9
? + (...) |
Since nobody managed to fix this issue in 10 years and the test still fails if the readline module is loaded, I wrote python/issues-test-cpython#30631 to skip the test if the readline module is loaded. |
rl-locale.diff changes the readline implementation of the PyOS_Readline() to set LC_CTYPE locale to "C": setlocale(LC_CTYPE, "C"), rather to the user preferred locale: setlocale(LC_CTYPE, ""). IMO it's a bad idea. Python made great progress in Unicode support, readline has a good Unicode support, and in most cases, it just works like a charm. This change looks a hack just to get these 2 specific tests to pass, but it breaks any other usage of readline. rl-test.diff skips the test if the readline module can be imported, and it always import the readline module. It's different than that python/issues-test-cpython#30631 which only checks if the readline module is currently imported: my change doesn't import readline in test_builtin. "input-readline*.patch" patches and #51382 spawn a fresh Python process to make sure that the readline is not imported or to import readline explicitly. They are better than my fix, but they are more complicated. It seems likle these changes also fix test_input_tty_non_ascii() but I don't understand how. |
I marked bpo-41034 "test_builtin: PtyTests fail when run twice" as a duplicate of this issue. Moreover, I tested manually: my change python/issues-test-cpython#30631 fix the "./python -m test -R 3:3 test_builtin" command. |
With my change, the two following commands now pass successfully:
|
Ok, the initial issue is now fixed: the test pass. If someone wants to write test input() with non-ASCII input and readline, I suggest to open a new issue and add the test in the test_readline module instead. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: