Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test.pl's watchdog() always times out and blocks harness while it is timing out on Win32 #13647

Closed
p5pRT opened this issue Mar 7, 2014 · 6 comments

Comments

@p5pRT
Copy link
Collaborator

commented Mar 7, 2014

Migrated from rt.perl.org#121395 (status was 'resolved')

Searchable as RT121395$

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 7, 2014

From @bulk88

Created by @bulk88

On Win32 re/subst.t and other subst*.t tests take 300
seconds while their watchdog process times out every time. The reason is
because the kill() in the END block in the .t file is on cmd.exe's PID,
not the
watchdog perl's PID, killing the cmd.exe leaves watchdog perl process
around, which blocks harness script until the timeout at as high as 300
seconds expires, why harness blocks on the watchdog proc IDK. Probably
something to do with inherited stdio handles or open3() handles but I',
not good at debugging stdio.

Here is the problematic code from test.pl
--------------------------------------------------------
  # On Windows and VMS, try launching a watchdog process
  # using system(1, ...) (see perlport.pod)
  if ($is_mswin || $is_vms) {
  # On Windows, try to get the 'real' PID
  if ($is_mswin) {
  eval { require Win32; };
  if (defined(&Win32​::GetCurrentProcessId)) {
  $pid_to_kill = Win32​::GetCurrentProcessId();
  }
  }

  # If we still have a fake PID, we can't use this method at all
  return if ($pid_to_kill <= 0);

  # Launch watchdog process
  my $watchdog;
  eval {
  local $SIG{'__WARN__'} = sub {
  _diag("Watchdog warning​: $_[0]");
  };
  my $sig = $is_vms ? 'TERM' : 'KILL';
  my $cmd = _create_runperl( prog => "sleep($timeout);" .
  "warn qq/#
$timeout_msg" . '\n/;' .
  "kill($sig,
$pid_to_kill);");
  $watchdog = system(1, $cmd);
  };
  if ($@​ || ($watchdog <= 0)) {
  _diag('Failed to start watchdog');
  _diag($@​) if $@​;
  undef($watchdog);
  return;
  }

  # Add END block to parent to terminate and
  # clean up watchdog process
  eval "END { local \$! = 0; local \$? = 0;
  wait() if kill('KILL', $watchdog); };";
  return;
  }
---------------------------------------------------

The problem is, the system() above, $watchdog contains a PID for
cmd.exe, not perl.exe which is a child process of that cmd.exe
TonyC brought up the question of why system() made a cmd.exe
process when perl is not a shell builtin and should have been directly
launched by the parent perl process (a perl that called watchdog()).

1st solution the above issue is being worked out in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121283. Fixing that, will
cause this ticket to be fixed. I predict 121283 will take very long to
work out. Not for 5.20.

2nd solution is to a process group kill [on Win32] the watchdog, the
cmd.exe goes away, and the perl.exe underneath it goes away, broken in
5.17.2 until blead, still not commited, but patches are waiting in RT.
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121230 then change

  wait() if kill('KILL', $watchdog); };";

  wait() if kill('-KILL', $watchdog); };";
in test.pl.

3rd solution is to not kill the process with perl's kill(), but call a
cmd line tool like "taskkill" on WinXP and not included with OS but an
extra free download from microsoft called "pskill" which will do the
process tree kill that perl can't currently do until those patches in
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121230 are commit.

Perl Info

Flags:
category=core
severity=high

Site configuration information for perl 5.19.9:

Configured by Owner at Wed Feb 12 06:47:30 2014.

Summary of my perl5 (revision 5 version 19 subversion 9) configuration:
Derived from: 633f0fd2ca244ca83cc99b3af3a7d3ac2931850b
Platform:
osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
uname=''
config_args='undef'
hint=recommended, useposix=true, d_sigaction=undef
useithreads=define, usemultiplicity=define
use64bitint=undef, use64bitall=undef, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cl', ccflags ='-nologo -GF -W3 -Od -MD -Zi -DDEBUGGING -DWIN32
-D_CONSOLE -DNO_STRICT -DPERL_TEXTMODE_SCRIPTS
-DPERL_HASH_FUNC_ONE_AT_A_TIME -DPERL_IMPLICIT_CONTEXT
-DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T',
optimize='-Od -MD -Zi -DDEBUGGING',
cppflags='-DWIN32'
ccversion='13.10.6030', gccversion='', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='link', ldflags ='-nologo -nodefaultlib -debug
-libpath:"c:\perl519\lib\CORE" -machine:x86'
libpth="C:\Program Files\Microsoft Visual Studio .NET 2003\VC7\lib"
libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib
odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib
odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl519.lib
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug
-libpath:"c:\perl519\lib\CORE" -machine:x86'

Locally applied patches:
uncommitted-changes


@INC for perl 5.19.9:
C:/perl519/site/lib
C:/perl519/lib
.


Environment for perl 5.19.9:
HOME (unset)
LANG (unset)
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=C:\perl519\bin;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\IDE;C:\Program Files\Microsoft Visual Studio .NET
2003\VC7\BIN;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\Tools;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\Tools\bin\prerelease;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\wbem;
PERL_BADLANG (unset)
SHELL (unset)



@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 24, 2014

From @bulk88

Patch attached. Since this should go in for 5.20, I picked the easiest path, which is use process group kill to kill the cmd.exe and the perl.exe underneath that, since that is now committed under #121230 (121230 still has a unresolved smoker problem, but watchdog doesnt block now with this patch, I hope it wont block with the smoker). The patch saves about 900 seconds/15 minutes.


Files=2406, Tests=691402, 3263 wallclock secs (125.98 usr + 4.88 sys = 130.86 C
PU)
Result​: FAIL


All tests successful.
Files=2407, Tests=691474, 2357 wallclock secs (123.92 usr + 4.53 sys = 128.45 C
PU)
Result​: PASS


Ignore the fail, it was to prevent a hang so I get a accurate before time count.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 24, 2014

From @bulk88

0001-121395-fix-on-Win32-test.pl-watchdog-always-blocks-f.patch
From fb9c5a1a0fcfb4a2365d1e1e33b5e738d5ff9302 Mon Sep 17 00:00:00 2001
From: bulk88 <bulk88@hotmail.com>
Date: Sun, 23 Mar 2014 21:01:15 -0400
Subject: [PATCH] #121395 fix, on Win32 test.pl watchdog always blocks for
 entire timeout

See Perl RT #121395 for details.
---
 pod/perldelta.pod |    7 +++++++
 t/test.pl         |   10 ++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index 5906e78..358aea3 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -309,6 +309,13 @@ L</Modules and Pragmata> section.
 
 XXX
 
+=item Win32
+
+About 15 minutes of idle sleeping was removed from running C<make test> due to
+a bug in which the timeout monitor used for tests could not be cancelled once
+the test completes, and the full timeout period elapsed before running the next
+test file.
+
 =back
 
 =head1 Internal Changes
diff --git a/t/test.pl b/t/test.pl
index 15282ca..30db88c 100644
--- a/t/test.pl
+++ b/t/test.pl
@@ -1558,8 +1558,14 @@ sub watchdog ($;$)
 
             # Add END block to parent to terminate and
             #   clean up watchdog process
-            eval "END { local \$! = 0; local \$? = 0;
-                        wait() if kill('KILL', $watchdog); };";
+            # Win32 watchdog is launched by cmd.exe shell, so use process group
+            # kill, otherwise the watchdog is never killed and harness waits
+            # every time for the timeout, #121395
+            eval( $is_mswin ?
+            "END { local \$! = 0; local \$? = 0;
+                        wait() if kill('-KILL', $watchdog); };"
+            : "END { local \$! = 0; local \$? = 0;
+                        wait() if kill('KILL', $watchdog); };");
             return;
         }
 
-- 
1.7.9.msysgit.0

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 24, 2014

From @steve-m-hay

Thanks, patch applied as 18ae2ab.

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 24, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 24, 2014

@steve-m-hay - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.