test_pty: test_basic() fails randomly on Travis CI #75341

vstinner · 2017-08-09T10:52:14Z

BPO	31158
Nosy	@vstinner, @xdegaye, @vadmium, @stratakis, @diekmann
PRs	bpo-31158: test_pty now handles partial writes #3037 bpo-31158: Fix test_pty.test_basic() #3802 bpo-31158: Fix nondeterministic read in test_pty #3808 [3.6] bpo-31158: Fix nondeterministic read in test_pty (GH-3808) #3852 [2.7] bpo-31158: Fix nondeterministic read in test_pty (#3808) #3853

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2017-10-02.09:59:15.484>
created_at = <Date 2017-08-09.10:52:13.951>
labels = ['3.7', 'tests']
title = 'test_pty: test_basic() fails randomly on Travis CI'
updated_at = <Date 2017-10-02.09:59:15.483>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2017-10-02.09:59:15.483>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2017-10-02.09:59:15.484>
closer = 'vstinner'
components = ['Tests']
creation = <Date 2017-08-09.10:52:13.951>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 31158
keywords = ['patch']
message_count = 21.0
messages = ['299987', '299988', '299991', '299995', '300067', '303199', '303241', '303249', '303252', '303257', '303259', '303261', '303275', '303276', '303286', '303397', '303518', '303519', '303522', '303523', '303524']
nosy_count = 5.0
nosy_names = ['vstinner', 'xdegaye', 'martin.panter', 'cstratak', 'Cornelius Diekmann']
pr_nums = ['3037', '3802', '3808', '3852', '3853']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue31158'
versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']

vstinner · 2017-08-09T10:52:14Z

https://travis-ci.org/python/cpython/jobs/262606831

0:13:26 load avg: 134.81 [256/403] test_pty
test test_pty failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_pty.py", line 109, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'

vstinner · 2017-08-09T11:02:16Z

I don't know why test_pty fails.

#3037 makes sure that all data is written into the master fd using a new write_all() helper function which loops until all data is written: handle os.write() partial write.

vadmium · 2017-08-09T11:55:50Z

Checking for short writes is worthwhile, but in bpo-29070 it looks like Cornelius identified the main problem was short _reads_. See the parts of his patch to do with “_os_read_exactly” and related functions.

stratakis · 2017-08-09T12:24:31Z

I've observed the exact same issue on Fedora at random times.

diekmann · 2017-08-10T09:28:59Z

I observed the same issue, but the problem occurs mainly when reading data. In my proposed patch in bpo-29070, I use the existing pty._writen() to make sure all data is written. As Martin mentioned, reading is a problem. My patch proposes _os_timeout_read(), _os_readline(), _os_read_exactly(), _os_read_exhaust_exactly() to give fine-grained, deterministic control. Cheeky advertisement: Anybody cares to review bpo-29070 or cherry pick the pty._writen() and _os_read* parts? :)

vstinner · 2017-09-28T09:06:41Z

Failure on x86 Gentoo Refleaks 3.x too:

http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%203.x/builds/104/steps/test/logs/stdio

======================================================================
FAIL: test_basic (test.test_pty.PtyTest)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/test_pty.py", line 105, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'

vstinner · 2017-09-28T14:28:02Z

Oh, in fact, it's not that hard to reproduce the failure on Linux:

haypo@selma$ ./python -m test test_pty -m test_basic -F 
Run tests sequentially
0:00:00 load avg: 2.17 [  1] test_pty
0:00:00 load avg: 2.17 [  2] test_pty
0:00:00 load avg: 2.17 [  3] test_pty
(...)
0:00:00 load avg: 2.17 [ 15] test_pty
0:00:00 load avg: 2.17 [ 16] test_pty
test test_pty failed -- Traceback (most recent call last):
  File "/home/haypo/prog/python/master/Lib/test/test_pty.py", line 105, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'

test_pty failed
15 tests OK.

1 test failed:
test_pty

Total duration: 486 ms
Tests result: FAILURE

diekmann · 2017-09-28T14:50:10Z

Regarding PR 3802:

Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?
Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

I also fixed these issues in one of my (unmerged) pull requests (bpo bpo-29070). Unfortunately, the patch I propose does way too many things at the same time and I don't have time currently to untangle it :( Feel free to cherry pick the _os_readline function:
https://github.com/python/cpython/pull/2932/files#diff-6c897ae50dce4ffa73bde779206a2059R60
Don't hesitate to put your name on it, if you take the time to review and cherry pick :)

Cheers!

vstinner · 2017-09-28T15:00:16Z

Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?

This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.

I prefer minimal change, since I don't know well the pty module.

Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).

So I think that len(s2) < len(expected) is correct.

diekmann · 2017-09-28T15:47:44Z

> Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?

This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.

I prefer minimal change, since I don't know well the pty module.

I like the idea of minimal change, too :D

Yet, I think your patch does not solve the core problem of read/write being nondeterministic. In theory, a pty is similar to a pipe (with termios processing in the middle). So any os.write to a pty fd is nondeterministic and may put less bytes into the pty buffer than given to the write call (see the return value of os.write, which test_pty.py does not check). Multiple writes are buffered by the kernel, until the buffer is full. So the kernel already accumulates the chunked writing for us. Usually, this works fine.

Similarly, a os.read may also be nondeterministic, depending on how many bytes are ready in the pty buffer. This may have nothing to do with the chunked writing of the test_pty.py module because the kernel is doing the read/write syscalls and handling the pty buffer.

Here is a PoC: I checked out your code and stressed my system with it. I have 2 physical and 2 virtual cores and started 8 instances of the test to stress my kernel with scheduling, locking of kernel buffers (the pty buffer), and making read/write more nondeterministic.

./python -b -m test -j 8 test_pty -m test_basic -F

Here is what I got:
0:00:13 load avg: 3.25 [119/1] test_pty failed
test test_pty failed -- Traceback (most recent call last):
  File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
    normalize_output(s1))
AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

> Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).

So I think that len(s2) < len(expected) is correct.

Warning, obscure corner cases ahead.
In theory, given read is completely nondeterministic and we are on one of the strange systems which need normalize_output, the following could happen:
We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function.

diekmann · 2017-09-28T16:02:17Z

Oh, one more thing, since it is unclear whether read or write is causing the error. I replaced
os.write(slave_fd, TEST_STRING_1)
by
pty._writen(slave_fd, TEST_STRING_1)
which makes sure all bytes are written.
After some time, when the load gets high and the machine gets noisy, I get the same error:

0:01:09 load avg: 6.17 [597/1] test_pty failed
test test_pty failed -- Traceback (most recent call last):
  File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
    normalize_output(s1))
AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

So the main problem is a nondeterministic read. But I'm only testing on Linux. Other systems with different kernel may behave differently. And the documentation allows a kernel to make both short reads and writes.

diekmann · 2017-09-28T16:38:12Z

I also added a pull request which I cannot get to fail on my system. What do you think, haypo?

Sorry for the noise!

vstinner · 2017-09-28T20:18:31Z

Cornelius:Here is a PoC: I checked out your code and stressed my system with it. (...) AssertionError: ..."

Did you get this failure with my PR 3802?

Cornelius: "We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function."

Right. I can just modify my condition from "len(s2) < len(expected)" to "b'\n' not in s2".

Cornelius: "I also added a pull request which I cannot get to fail on my system. What do you think, haypo?"

Your code calls read(1) in a loop until it gets the newline character b'\n'.

Is it better to os.read(1024) in a loop until the output string contains b'\n'?

I have no strong preference.

vstinner · 2017-09-28T20:19:22Z

Hum, let me clarify: my main goal is to get a stable CI. I don't really care of the exact implementation of one specific unit test in test_pty ;-)

diekmann · 2017-09-28T22:27:35Z

Yes, I get the AssertionError with the latest version of PR 3802. From the high load avg of my system, you can see that the error occurs very rarely and that I need to stress my system to trigger it. With PR 3802, the error occurs way less frequently than without it. So we are definitely moving into the right direction. With PR 3808, I could not trigger any error (on my system).

Changing the conditions to "b'\n' not in s2" should work. But we actually mean to read a line, so this should be better reflected in the code. I prefer calling a readline() function directly, which is almost self-documenting code.

Your code calls read(1) in a loop until it gets the newline character b'\n'.
Is it better to os.read(1024) in a loop until the output string contains b'\n'?
Behavior may be different if there are multiple short lines in the buffer (which should not be the case in the unit test, but this may be a problem if somebody copies the code and uses it somewhere else).
pitrou in the discussion of PR 3808 suggests the ultimate answer to the question: Just use an existing readline function from the library :-)
I added this to PR 3808.

Personal thought: I care about good code in the unit tests because people might look at this as reference how to use a module and copy&paste from it. I want the tests to be deterministic, which---as long as the tests pass---implies a stable CI ;-)

vadmium · 2017-09-30T07:03:11Z

I prefer Cornelius’s current proposal (rev 4f8137b) because it fixes both sites, rather than just patching the immediate problem site.

I don’t think read(1) is a big problem, just less efficient. But if you prefer to do larger reads, that should be fine too. You could even use os.fdopen(...).readline(), which would use BufferedReader. It is not documented, but BufferedReader should do large reads and return the line without waiting to fill its internal buffer.

FWIW I think it would be okay to remove the “chunked output” test. It isn’t exercising the “pty” module any more than the “Writing to slave_fd” test above it. All we need to do is verify that the “master_open” function returns a PTY master and that “slave_open” returns the corresponding slave terminal file descriptor. Checking “isatty” and sending one line of data through seems sufficient.

vstinner · 2017-10-02T09:39:58Z

New changeset e6f62f6 by Victor Stinner (Cornelius Diekmann) in branch 'master':
bpo-31158: Fix nondeterministic read in test_pty (bpo-3808)
e6f62f6

vstinner · 2017-10-02T09:48:53Z

I prefer Cornelius’s current proposal (rev 4f8137b)

Ok, fine. I merged his PR 3852. I created backports to 2.7 and 3.6.

vstinner · 2017-10-02T09:57:59Z

New changeset 66fb5ef by Victor Stinner (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-31158: Fix nondeterministic read in test_pty (GH-3808) (GH-3852)
66fb5ef

vstinner · 2017-10-02T09:58:11Z

New changeset 20cbc1d by Victor Stinner in branch '2.7':
bpo-31158: Fix nondeterministic read in test_pty (bpo-3808) (bpo-3853)
20cbc1d

vstinner · 2017-10-02T09:59:15Z

Thanks Cornelius Diekmann for your fix ;-)

vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir labels Aug 9, 2017

vstinner closed this as completed Oct 2, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022

vstinner mentioned this issue Oct 11, 2023

test_pty: test_openpty() timed out after 3h 20min on aarch64 RHEL8 Refleaks 3.x: #110673

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test_pty: test_basic() fails randomly on Travis CI #75341

test_pty: test_basic() fails randomly on Travis CI #75341

vstinner commented Aug 9, 2017

vstinner commented Aug 9, 2017

vstinner commented Aug 9, 2017

vadmium commented Aug 9, 2017

stratakis mannequin commented Aug 9, 2017

diekmann mannequin commented Aug 10, 2017

vstinner commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vstinner commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vadmium commented Sep 30, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

test_pty: test_basic() fails randomly on Travis CI #75341

test_pty: test_basic() fails randomly on Travis CI #75341

Comments

vstinner commented Aug 9, 2017

vstinner commented Aug 9, 2017

vstinner commented Aug 9, 2017

vadmium commented Aug 9, 2017

stratakis mannequin commented Aug 9, 2017

diekmann mannequin commented Aug 10, 2017

vstinner commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vstinner commented Sep 28, 2017

vstinner commented Sep 28, 2017

diekmann mannequin commented Sep 28, 2017

vadmium commented Sep 30, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017

vstinner commented Oct 2, 2017