[wptrunner] Tests which output large amounts of data cause process to hang #13446

jugglinmike · 2018-10-10T00:35:34Z

While collecting results for Safari, the WPT CLI frequently stalls. The stall always occurs while executing WebDriver specification tests, specifically the tests concerning user prompts. That pattern suggested a problem in the browser's WebDriver implementation, but the cause appears to lie in wptrunner itself.

The WebDriver specification tests are written with pytest, and they are designed to output full failure logs. A failing test includes a full Python stack trace, so failures can be quite verbose.

Today, Safari fails many tests concerning user prompts, and because those tests are procedurally generated using pytest's "fixtures" feature, this leads to a very large amount of data being transferred between processes.

I believe that the glut of logging data is the actual cause of the problem. To demonstrate, the same symptoms are exhibited when using any browser to execute the following two test files:

# file: lots-of-failures.py
import pytest

message = 'x' * 1024 * 5

@pytest.mark.parametrize("num", range(100))
def test_fail(num):
    raise Exception(message)

# file: one-more-test.py
def test_pass():
    pass

The first file will cause a large amount of data to be written to the child process's standard output and subsequently transferred to the parent process. wptrunner will report, "Forcibly terminating runner process", attempt to execute the second test file, and hang indefinitely. This matches the behavior observed when executing in-tree WebDriver spec tests with Safari.
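For a sense of scale, the minimum payload of the reproduction can be computed directly. This is only an illustrative sketch: the 64 KiB pipe capacity is an assumption (a common default on Linux; other systems differ), the names are hypothetical, and the real output is larger because each failure also carries a stack trace.

```python
# Values taken from lots-of-failures.py above.
MESSAGE_BYTES = 1024 * 5
FAILING_TESTS = 100

# Assumption: a 64 KiB pipe buffer, a common default on Linux.
ASSUMED_PIPE_CAPACITY = 64 * 1024


def minimum_payload_bytes(message_bytes=MESSAGE_BYTES,
                          failing_tests=FAILING_TESTS):
    """Lower bound on the failure output: exception messages only."""
    return message_bytes * failing_tests


def exceeds_pipe(payload_bytes, capacity=ASSUMED_PIPE_CAPACITY):
    """True when the payload cannot fit in the pipe buffer at once."""
    return payload_bytes > capacity
```

Under these assumptions the exception messages alone are several times larger than the pipe buffer, so the child cannot finish writing unless the parent keeps reading.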

"TestRunner" sub-processes forward their standard output streams to the "TestRunnerManager" process via a Python multiprocessing Queue. When such a process produces a large amount of output (e.g. in failing WebDriver specification tests), the data may be buffered in the underlying operating system pipe. In this state, such a process will not exit naturally:

> Bear in mind that a process that has put items in a queue will wait
> before terminating until all the buffered items are fed by the
> "feeder" thread to the underlying pipe. [1]

Previously, the TestRunnerManager forcibly terminated the sub-process and re-used the message queue, providing it to a new sub-process and waiting for new items to be inserted. However, the queue's behavior is unpredictable in this state. It has been observed to block indefinitely on GNU/Linux and macOS systems [2].

To avoid this behavior, discard the queue and create a new instance for use in subsequent tests.

[1] https://docs.python.org/2/library/multiprocessing.html#all-platforms
[2] web-platform-tests#13446
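The approach described in the commit message can be sketched roughly as follows. This is not the wptrunner code: `RunnerManager`, `queue_factory`, and `process_factory` are hypothetical names chosen for illustration, and the factories are injectable only so the behavior can be exercised without spawning real processes. The one point it is meant to show is that a queue touched by a forcibly terminated process is never handed to the next one.

```python
import multiprocessing


class RunnerManager(object):
    """Hypothetical, simplified stand-in for a manager of runner processes."""

    def __init__(self, target,
                 queue_factory=multiprocessing.Queue,
                 process_factory=multiprocessing.Process):
        self.target = target
        self.queue_factory = queue_factory
        self.process_factory = process_factory
        self.queue = None
        self.process = None

    def start(self):
        # Create a queue only if no usable one is left from a clean run.
        if self.queue is None:
            self.queue = self.queue_factory()
        self.process = self.process_factory(target=self.target,
                                            args=(self.queue,))
        self.process.start()

    def stop(self, timeout=5):
        """Stop the runner; return True if it had to be terminated."""
        self.process.join(timeout)
        forced = self.process.is_alive()
        if forced:
            # The child may be blocked feeding buffered items to the pipe.
            self.process.terminate()
            self.process.join()
            # The queue may now hold a partially written message, so
            # discard it instead of passing it to the next sub-process.
            self.queue = None
        self.process = None
        return forced
```

With the defaults, `start()` launches a real `multiprocessing.Process`. After a forced `stop()`, the next `start()` builds a fresh queue, while a clean exit keeps the existing one.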

jugglinmike · 2018-10-11T20:05:01Z

Resolved via #13447

…eues, a=testonly

Automatic update from web-platform-tests
[wptrunner] Discard corrupted message queues (#13447)
--
wpt-commits: f6bca7b6218f591edc1bcb87c9ab0837ca41970b
wpt-pr: 13447

UltraBlame original commit: 693b05802d989054d34e7ede25ae2f3b0443007d

jugglinmike added the infra and wptrunner labels Oct 10, 2018

This was referenced Oct 10, 2018

[wptrunner] Discard corrupted message queues #13447

Merged

Intermittent timeouts in Safari web-platform-tests/results-collection#615

Closed

foolip assigned jugglinmike Oct 10, 2018

foolip added the priority:roadmap label Oct 10, 2018

jugglinmike closed this as completed Oct 11, 2018