t: Make process handling more robust with IPC::Run #3123

okurz · 2020-05-28T11:22:39Z

Spawned processes need to be tracked properly so they can be cleaned
up. Without that, orphaned processes were sometimes leaked when tests
aborted or crashed. As we already use IPC::Run, we can use
IPC::Run::start in the other places where we currently spawn processes
with fork. Additional benefits are that we can debug process handling
and IPC as described in https://metacpan.org/pod/IPC::Run#Debugging-Tip
and that we can capture the output of the processes.
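
For illustration, a minimal sketch of the pattern described above, not the code from this PR: a child is started through IPC::Run::start, its output is captured into a scalar, and the harness is used to stop it again. The child command, the variable names and the 10 second timeout are assumptions made up for this example.

```perl
use strict;
use warnings;
use IPC::Run ();

# Hypothetical child: prints one line unbuffered, then idles.
my @cmd = ($^X, '-e', '$| = 1; print "ready\n"; sleep 30');

my ($in, $out, $err) = ('', '', '');

# start() returns a harness that tracks the child, so it cannot be
# forgotten the way a bare fork() PID can.
my $h = IPC::Run::start(\@cmd, \$in, \$out, \$err, IPC::Run::timeout(10));

# Pump I/O until the child's output has arrived in $out.
$h->pump until $out =~ /ready/;

# Send TERM, then KILL if the child is still alive after the grace period.
$h->kill_kill(grace => 3);

print "captured: $out";
```

Setting the environment variable IPCRUNDEBUG (see the Debugging-Tip link above) should make IPC::Run log what it does with the child.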

One example of a recent problem that should be fixed by this is
t/05-scheduler-full.t failing to stop completely when individual test
steps fail. This prevents RETRY at the level of tools/retry and the
Makefile from working, because the test never finishes and stays stuck
until the CI aborts the complete test run without further retries.

https://progress.opensuse.org/issues/59043

t/05-scheduler-full.t

kalikiana · 2020-05-28T12:12:09Z

t/lib/OpenQA/Test/Utils.pm

+    my ($h, $forced) = @_;
+    return unless $h;
+    if ($forced) {
+        $h->kill_kill(grace => 3);


Should we handle the exception and print a warning? Otherwise it presumably breaks the test plan in that case.

I don't think it does. This is basically still the same as what we were doing before: we send TERM and then KILL if the parameter was given.
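
To make the question above concrete, here is a rough sketch of what catching the exception and warning could look like. This is not the helper from t/lib/OpenQA/Test/Utils.pm: the name `stop_harness`, the warning text and the whole non-forced branch are assumptions for illustration only.

```perl
use strict;
use warnings;
use IPC::Run ();

# Hypothetical helper, NOT the function touched by this PR.
sub stop_harness {
    my ($h, $forced) = @_;
    return unless $h;
    my $ok = eval {
        if ($forced) {
            # TERM first, KILL after the grace period (as in the diff)
            $h->kill_kill(grace => 3);
        }
        else {
            # assumption: a polite TERM followed by waiting for exit
            $h->signal('TERM');
            $h->finish;
        }
        1;
    };
    # Warn instead of dying so a failed stop does not abort the test script.
    warn "stopping process failed: $@" unless $ok;
    return $ok ? 1 : 0;
}

# Usage with a throwaway child that would otherwise idle for 30 seconds.
my $h = IPC::Run::start([$^X, '-e', 'sleep 30']);
stop_harness($h, 1);
```

Whether a warning is preferable to letting the exception propagate is exactly the judgment call discussed in this review comment.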

okurz · 2020-05-28T13:48:33Z

Test fails with

[13:35:45] t/05-scheduler-full.t ..................... 5/? # Looks like your test exited with 9 just after 5.
[13:35:45] t/05-scheduler-full.t .....................      Dubious, test returned 9 (wstat 2304, 0x900)
All 5 subtests passed

but I could not reproduce this locally
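
As a side note on reading the numbers in that output: on POSIX systems the wait status packs the exit code into the high byte and the terminating signal into the low 7 bits. A small sketch of the decoding, using only the value reported above:

```perl
use strict;
use warnings;

my $wstat = 2304;    # "wstat 2304, 0x900" from the prove output

my $exit_code = $wstat >> 8;              # high byte: exit status
my $signal    = $wstat & 127;             # low 7 bits: signal, 0 if none
my $core      = ($wstat & 128) ? 1 : 0;   # core dump flag

printf "exit=%d signal=%d core=%d\n", $exit_code, $signal, $core;
# prints "exit=9 signal=0 core=0"
```

So the test script itself exited with status 9; the wait status does not say it was terminated by signal 9.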

perlpunk · 2020-05-28T14:04:35Z

Also works for me locally. Maybe run prove with -v to get more info?

okurz · 2020-05-28T14:15:48Z

We do have all artifacts available. In https://app.circleci.com/pipelines/github/os-autoinst/openQA/3049/workflows/51f1299a-215e-47e1-85e1-baaf4dde7d56/jobs/29008 I also reran with ssh, and both that run and my own run within the ssh session succeeded.

codecov · 2020-05-28T15:05:21Z

Codecov Report

Merging #3123 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3123   +/-   ##
=======================================
  Coverage   92.05%   92.05%           
=======================================
  Files         211      211           
  Lines       12932    12932           
=======================================
  Hits        11904    11904           
  Misses       1028     1028

Last update e0f2892...7285b1a.

okurz · 2020-05-28T17:47:09Z

All problems fixed. Somehow I have the feeling the observed problems were temporary issues in CircleCI.

okurz force-pushed the enhance/ipc_run branch from 957dc8f to a4420aa Compare May 28, 2020 11:56

kalikiana reviewed May 28, 2020

okurz force-pushed the enhance/ipc_run branch from a4420aa to 7285b1a Compare May 28, 2020 13:14

okurz marked this pull request as draft May 28, 2020 13:47

okurz marked this pull request as ready for review May 28, 2020 17:46

kalikiana approved these changes May 29, 2020

Martchus approved these changes May 29, 2020

kalikiana merged commit 2508bce into os-autoinst:master May 29, 2020

okurz deleted the enhance/ipc_run branch May 29, 2020 12:43
