[PATCH] speed up miniperl @INC searching, buildcustomize #13566

Closed
p5pRT opened this issue Jan 30, 2014 · 44 comments

Comments

@p5pRT

commented Jan 30, 2014

Migrated from rt.perl.org#121119 (status was 'resolved')

Searchable as RT121119$

@p5pRT

commented Jan 30, 2014

From @bulk88

Created by @bulk88

See attached patch. On Win32, each dir that is searched and fails results
in 10 I/O sys calls, which I measured at 17 ms of wall time. With 16
buildcustomize dirs that don't match, that is an extra 272 ms of wall time
spent on the not-found dirs during each @INC search. Also, as the build
progresses, more and more things will be found on the 1st try in /lib,
and I presume the buildcustomize dirs become totally unused once XS modules
start being built, since everything needed for XS building (which is a PP
process) will be in /lib and found on the 1st try.

regular build
-----------------------------------------------------------------
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.manifest -outputresource:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del ..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
timethis 1: 1656 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning: too few iterations for a reliable count)

C:\p519\src\win32>perl -MBenchmark -e" timethis(1, 'system(\'nmake\')')"
-----------------------------------------------------------------
build with this patch
-----------------------------------------------------------------
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.manifest -outputresource:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del ..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
timethis 1: 1137 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning: too few iterations for a reliable count)

C:\p519\srcnew\win32>perl -MBenchmark -e" timethis(1, 'system(\'nmake\')')"
-----------------------------------------------------------------

(1656 - 1137) / 60 = 8.65 mins

(the following number is a flaky statistic)
(1656 - 1137) s / 0.272 s = 1908.08824 runs of miniperl???
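
The arithmetic above can be re-run as a small script. This is only a sketch that plugs in the figures quoted in this report; the variable names are made up for illustration. Note that the last quotient is arguably closer to a count of slow @INC searches than to a count of miniperl runs, since the 272 ms is a per-search cost.

```perl
use strict;
use warnings;

# Figures quoted in the report above.
my $ms_per_failed_dir = 17;      # ~10 I/O syscalls per directory that misses
my $failed_dirs       = 16;      # buildcustomize dirs searched before lib
my $before_secs       = 1656;    # nmake wallclock, unpatched
my $after_secs        = 1137;    # nmake wallclock, patched

my $ms_per_search    = $ms_per_failed_dir * $failed_dirs;
my $saved_secs       = $before_secs - $after_secs;
my $saved_mins       = $saved_secs / 60;
my $implied_searches = $saved_secs / ($ms_per_search / 1000);

printf "overhead per \@INC search: %d ms\n", $ms_per_search;
printf "build time saved: %d s (%.2f min)\n", $saved_secs, $saved_mins;
printf "implied slow searches: %.0f\n", $implied_searches;
```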

I've attached a procmon log of miniperl uselessly searching a large number
of @INC dirs every time a use/require is done. The log is of the "before".

I left the original comment in place for historical reasons. The comment
might be from something that happened in a Module::Build module back
when M::B was in core.

Perl Info

Flags:
           category=core
           severity=low

Site configuration information for perl 5.19.7:

Configured by Owner at Thu Nov 28 02:32:44 2013.

Summary of my perl5 (revision 5 version 19 subversion 7) configuration:
         Derived from: 8f47723e28b75530b743941cdd8b07f849ec48e2
         Ancestor: 1061065f7a09399eefb50e9a035502621722bcc0
         Platform:
           osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
           uname=''
           config_args='undef'
           hint=recommended, useposix=true, d_sigaction=undef
           useithreads=define, usemultiplicity=define
           useperlio=define, d_sfio=undef, uselargefiles=define,
usesocks=undef
           use64bitint=undef, use64bitall=undef, uselongdouble=undef
           usemymalloc=n, bincompat5005=undef
         Compiler:
           cc='cl', ccflags ='-nologo -GF -W3 -O1 -MD -Zi -DNDEBUG -G7 -GL
-DWIN32 -D_CONSOLE -DNO_STRICT  -DPERL_TEXTMODE_SCRIPTS
-DPERL_HASH_FUNC_ONE_AT_A_TIME -DPERL_IMPLICIT_CONTEXT
-DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T',
           optimize='-O1 -MD -Zi -DNDEBUG -G7 -GL',
           cppflags='-DWIN32'
           ccversion='13.10.6030', gccversion='', gccosandvers=''
           intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
           d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8
           ivtype='long', ivsize=4, nvtype='double', nvsize=8,
Off_t='__int64',
lseeksize=8
           alignbytes=8, prototype=define
         Linker and Libraries:
           ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf
-ltcg  -libpath:"c:\perl519\lib\CORE"  -machine:x86'
           libpth="C:\Program Files\Microsoft Visual Studio .NET
2003\VC7\lib"
           libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib
comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib  version.lib
odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
           perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib
winspool.lib  comdlg32.lib advapi32.lib shell32.lib ole32.lib
oleaut32.lib  netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib
version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib
           libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl519.lib
           gnulibc_version=''
         Dynamic Linking:
           dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
           cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug
-opt:ref,icf -ltcg  -libpath:"c:\perl519\lib\CORE"  -machine:x86'

Locally applied patches:
           uncommitted-changes
           8f47723e28b75530b743941cdd8b07f849ec48e2


@INC for perl 5.19.7:
           C:/perl519/site/lib
           C:/perl519/lib
           .


Environment for perl 5.19.7:
           HOME (unset)
           LANG (unset)
           LANGUAGE (unset)
           LD_LIBRARY_PATH (unset)
           LOGDIR (unset)
           PATH=C:\perl519\bin;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\IDE;C:\Program Files\Microsoft Visual Studio .NET
2003\VC7\BIN;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\Tools;C:\Program Files\Microsoft Visual Studio .NET
2003\Common7\Tools\bin\prerelease;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\wbem;
           PERL_BADLANG (unset)
           SHELL (unset)









@p5pRT

commented Jan 30, 2014

From @bulk88

0001-speed-up-miniperl-INC-searching-buildcustomize.patch
From 8af300d3dcbc3e6955a9f2b79fa73cbb323d6c1d Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Thu, 30 Jan 2014 02:53:09 -0500
Subject: [PATCH] speed up miniperl @INC searching, buildcustomize

most modules are pre-installed in /lib. The buildcustomize modules are
more rarely used than warnings and strict for example. Previously /lib
was the last dir searched with ~1.5 dozen dirs uselessly searched first.
This commit reduced my build time by ~8-9 mins.
---
 write_buildcustomize.pl |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..5309988 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -48,13 +48,18 @@ push @toolchain, 'ext/VMS-Filespec/lib' if $^O eq 'VMS';
 unshift @INC, @toolchain;
 require File::Spec::Functions;
 
+# former comment
+#
 # lib must be last, as the toolchain modules write themselves into it
 # as they build, and it's important that @INC order ensures that the partially
 # written files are always masked by the complete versions.
 
 my $inc = join ",\n        ",
     map { "q\0$_\0" }
-    (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+# putting lib first shaves a couple minutes off the build time since the most
+# common modules like warnings and strict are in lib, and as extensions are
+# built the chances of the module being found in lib increases
+    (map {File::Spec::Functions::rel2abs($_)} 'lib', @toolchain);
 
 open my $fh, '>', $file
     or die "Can't open $file: $!";
-- 
1.7.9.msysgit.0
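
To see why the order matters for the number of probes, here is a minimal sketch (not part of the patch) that counts how many directories are tried before a file is found. The helper `dirs_probed` and the temporary `tool1` .. `tool16` directories are hypothetical stand-ins for the toolchain dirs; real `require` also probes for a `.pmc` file in each directory, which this sketch ignores.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: number of directories tried before $file is found,
# or an empty list / undef if it is not found anywhere.
sub dirs_probed {
    my ($file, @dirs) = @_;
    my $probed = 0;
    for my $dir (@dirs) {
        $probed++;
        return $probed if -f File::Spec->catfile($dir, $file);
    }
    return;
}

# Build 16 empty stand-in toolchain dirs plus a lib dir in a scratch area.
my $root      = tempdir(CLEANUP => 1);
my @toolchain = map { File::Spec->catdir($root, "tool$_") } 1 .. 16;
my $lib       = File::Spec->catdir($root, 'lib');
for my $dir (@toolchain, $lib) {
    mkdir($dir) or die "Can't mkdir $dir: $!";
}

# A stand-in strict.pm that exists only in lib.
my $module = File::Spec->catfile($lib, 'strict.pm');
open my $fh, '>', $module or die "Can't open $module: $!";
print {$fh} "1;\n";
close $fh or die "Can't close $module: $!";

my $lib_last  = dirs_probed('strict.pm', @toolchain, $lib);
my $lib_first = dirs_probed('strict.pm', $lib, @toolchain);
print "lib last: $lib_last dirs probed; lib first: $lib_first\n";
```

With lib last, every one of the stand-in toolchain dirs is probed and misses before the file is found; with lib first, the first probe hits.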

@p5pRT

commented Jan 30, 2014

From @bulk88

This was rejected from the P5P ML due to attachment size (250KB limit IIRC). Bumping.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Jan 30, 2014

From @iabyn

On Thu, Jan 30, 2014 at 12:13:16AM -0800, bulk88 via RT wrote:

See attached patch. Each dir being searched that fails on Win32 results
in 10 I/O sys calls which I counted at 17 ms wall time, times 16 build
customize dirs that didn't match is an extra 272 ms of wall time dealing
with the not found dirs during an @INC search. Also as the build
progresses, more and more things will be found on the 1st try in /lib,
and I presume buildcustomize dirs become totally unused once XS module
start being built since everything needed for XS building (which is a PP
process) will be in /lib and found on 1st try.

[snip; this cuts the 'make' time by: ]

(1656 - 1137) / 60 = 8.65 mins

[snip]

I left the original comment in place for historical reasons. The comment
might be from something that happened in a Module::Build module back
when M::B was in core.

For the convenience of others, here is the original patch:

diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..5309988 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -48,13 +48,18 @@ push @toolchain, 'ext/VMS-Filespec/lib' if $^O eq 'VMS';
 unshift @INC, @toolchain;
 require File::Spec::Functions;
 
+# former comment
+#
 # lib must be last, as the toolchain modules write themselves into it
 # as they build, and it's important that @INC order ensures that the partially
 # written files are always masked by the complete versions.
 
 my $inc = join ",\n        ",
     map { "q\0$_\0" }
-    (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+# putting lib first shaves a couple minutes off the build time since the most
+# common modules like warnings and strict are in lib, and as extensions are
+# built the chances of the module being found in lib increases
+    (map {File::Spec::Functions::rel2abs($_)} 'lib', @toolchain);


Before such a patch went in, I think we'd have to be sure that the reasons stated in the code for needing to put lib last don't still apply. Although I'm not very familiar with lib/buildcustomize.pl, it seems to me that the reason still stands. Especially on a parallel make, I could see one instance of miniperl running while another instance is copying files from ext/Foo to lib, thus allowing it to see empty or truncated files; or even if the installation of files into lib/ is atomic, a mixture of files from the same distro, some in lib, some still under ext/ or whatever, might be seen, and might be harmful.

So this doesn't seem safe to me.

--
More than any other time in history, mankind faces a crossroads. One path
leads to despair and utter hopelessness. The other, to total extinction.
Let us pray we have the wisdom to choose correctly.
  -- Woody Allen

@p5pRT

commented Jan 30, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

commented Jan 30, 2014

From @nwc10

On Thu, Jan 30, 2014 at 11​:25​:28AM +0000, Dave Mitchell wrote​:

Before such a patch went in, I think we'd have to be sure that the reasons
stated in the code for needing to put lib last don't still apply.
Although I'm not very familiar with lib/buildcustomize.pl, it seems to me
that the reason still stands. Especially on a parallel make, I could
see one instance of miniperl running while another instance is copying
files from ext/Foo to lib, thus allowing it to see empty or truncated
files; or even if the installation of files into lib/ is atomic, a mixture
of files from the same distro, some in lib, some still under ext/ or
whatever, might be seen, and might be harmful.

So this doesn't seem safe to me.

Thanks for ensuring that the patch gets to the list.

No, specifically it's completely unsafe for a parallel make.

You end up with race conditions where the build fails because one process
loads a partially written module from lib/ and aborts, because it happens
just as another process is copying that file there.
(The rest of the time you don't have a problem, because the file is either
not yet in lib, so it is loaded from the original dist/... etc, or it's fully
copied to lib, so it is loaded from there.)

Which is exactly what the comment in the file tries to explain.
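
The failure mode described above can be reproduced in isolation. The following is a minimal single-process sketch, not the real build: `Demo::Partial` is a made-up module name, and the "copy in progress" is simulated by writing a truncated file into a scratch directory that plays the role of lib.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Demo::Partial is a made-up module used only for this illustration.
my $complete = "package Demo::Partial;\nsub answer { return 42 }\n1;\n";
my $partial  = "package Demo::Partial;\nsub answer { ret";   # copy cut short

my $dir = tempdir(CLEANUP => 1);                 # stands in for lib/
mkdir(File::Spec->catdir($dir, 'Demo')) or die "Can't mkdir: $!";
my $path = File::Spec->catfile($dir, 'Demo', 'Partial.pm');

sub write_module {
    my ($content) = @_;
    open my $fh, '>', $path or die "Can't open $path: $!";
    print {$fh} $content;
    close $fh or die "Can't close $path: $!";
}

# Returns 1 if the module loads from $dir, 0 if require dies.
sub loads_ok {
    local @INC = ($dir);
    delete $INC{'Demo/Partial.pm'};              # forget any earlier attempt
    return eval { require Demo::Partial; 1 } ? 1 : 0;
}

write_module($partial);                          # reader arrives mid-copy
my $during_copy = loads_ok();
write_module($complete);                         # copy has finished
my $after_copy = loads_ok();
print "during copy: $during_copy, after copy: $after_copy\n";
```

With lib last in @INC, a reader in the first situation would have found the complete copy under dist/ or cpan/ before ever looking in lib.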

However, as Win32 doesn't have parallel makes *yet*, and this seems to be a
big speed hit on Win32, I think that it would be reasonable to do the
proposed re-ordering on Win32, with a comment that it needs to be rethought
once anyone starts getting parallel makes working on Win32.

Nicholas Clark
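
A sketch of what the suggested Win32-only reordering could look like. This illustrates the suggestion only; it is not the change that was eventually committed, and `inc_order` is a hypothetical helper that does not exist in write_buildcustomize.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the search order for the generated @INC.
sub inc_order {
    my ($osname, @toolchain) = @_;
    # lib goes first only where parallel makes are not in use (Win32, per
    # the discussion above). Elsewhere lib stays last, so that partially
    # written files in lib are masked by the complete toolchain copies.
    # This needs rethinking once parallel makes work on Win32.
    return $osname eq 'MSWin32'
        ? ('lib', @toolchain)
        : (@toolchain, 'lib');
}

my @toolchain = ('cpan/AutoLoader/lib', 'dist/Carp/lib');   # shortened list
print join("\n", inc_order($^O, @toolchain)), "\n";
```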

@p5pRT

commented Jan 30, 2014

From @iabyn

On Thu, Jan 30, 2014 at 12:13:16AM -0800, bulk88 via RT wrote:

See attached patch. Each dir being searched that fails on Win32 results
in 10 I/O sys calls which I counted at 17 ms wall time, times 16 build
customize dirs that didn't match is an extra 272 ms of wall time dealing
with the not found dirs during an @INC search.

That does sound extraordinarily slow. On my (admittedly fast) Linux laptop,
the following code takes approx 0.9 microsec per -f test​:

  my @path = (
      "home/davem/perl5/git/bleed-davem/cpan/AutoLoader/lib",
      "home/davem/perl5/git/bleed-davem/dist/Carp/lib",
      "home/davem/perl5/git/bleed-davem/dist/PathTools",
      "home/davem/perl5/git/bleed-davem/dist/PathTools/lib",
      "home/davem/perl5/git/bleed-davem/dist/ExtUtils-Command/lib",
      "home/davem/perl5/git/bleed-davem/dist/ExtUtils-Install/lib",
      "home/davem/perl5/git/bleed-davem/cpan/ExtUtils-MakeMaker/lib",
      "home/davem/perl5/git/bleed-davem/dist/ExtUtils-Manifest/lib",
      "home/davem/perl5/git/bleed-davem/cpan/File-Path/lib",
      "home/davem/perl5/git/bleed-davem/ext/re",
      "home/davem/perl5/git/bleed-davem/dist/Term-ReadLine/lib",
      "home/davem/perl5/git/bleed-davem/dist/Exporter/lib",
      "home/davem/perl5/git/bleed-davem/ext/File-Find/lib",
      "home/davem/perl5/git/bleed-davem/cpan/Text-Tabs/lib",
      "home/davem/perl5/git/bleed-davem/dist/constant/lib",
      "home/davem/perl5/git/bleed-davem/lib",
  );

  my $i = 0;
  for my $file ("aaa" .. "zzz") {
      for (@path) {
          $i++;
          next unless -f "$_/NoSuchFile-$file.pmc" or -f "$_/NoSuchFile-$file.pm";
      }
  }
  print "i=$i\n";

Perhaps this implies that there's something sub-optimal in the way perl
does its @INC scanning under Win32?
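
The snippet above does not show how the time was taken. One possible way to get a per-test figure is sketched below, assuming Time::HiRes is available. `time_missing_probes` is a hypothetical helper, and the directory list is simply whatever the running perl has in @INC rather than the build tree used above.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: run -f against names that should not exist and
# return the number of tests made and the elapsed wall time in seconds.
sub time_missing_probes {
    my ($dirs, $names) = @_;
    my $tests = 0;
    my $start = Time::HiRes::time();
    for my $name (@$names) {
        for my $dir (@$dirs) {
            # Mirror the two probes require makes per directory.
            for my $ext ('pmc', 'pm') {
                $tests++;
                warn "unexpected hit: $dir/$name.$ext\n"
                    if -f "$dir/$name.$ext";
            }
        }
    }
    my $elapsed = Time::HiRes::time() - $start;
    return ($tests, $elapsed);
}

my @dirs  = grep { !ref($_) && -d $_ } @INC;    # skip @INC hooks
my @names = map { "NoSuchFile-$_" } 'aaa' .. 'azz';
my ($tests, $elapsed) = time_missing_probes(\@dirs, \@names);
printf "%d tests in %.4f s (%.2f microsec per test)\n",
    $tests, $elapsed, $tests ? 1e6 * $elapsed / $tests : 0;
```

Results will vary a lot with the filesystem cache and with any filter drivers (virus scanners and the like) sitting in the I/O path.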

--
Fire extinguisher (n) a device for holding open fire doors.

@p5pRT

commented Jan 30, 2014

From @bulk88

On Thu Jan 30 05​:30​:31 2014, davem wrote​:

Perhaps this implies that there's something sub-optimal in the way perl
does its @INC scanning under Win32?

Go to the RT ticket through the web interface and look at the bad_inc_procmon.txt attachment; it contains strace-like data with wall time on the very left. The calls are sniffed by a filter driver that gets first dibs on all I/O packets coming from user mode (the NT kernel uses a packetized, or transactional, async I/O model).

A failed @INC test, per dir, looks like this after hand-trimming the right side of each line (look at the raw data if you want to see a dump of the args to the syscalls). QueryDirectory is related to enumerating a dir. It requires a dir handle, hence the CreateFile on C:\p519\src\cpan\Test-Harness. As for "QueryOpen", I'm not sure what C function generates it, or whether it's called internally by the FS driver, but it fetches this struct: http://msdn.microsoft.com/en-us/library/windows/hardware/ff545822%28v=vs.85%29.aspx (from later research, it is GetFileAttributesA).


9:08:36.3922806 PM miniperl.exe 1208 CreateFile C:\p519\src\cpan\Test-Harness\DynaLoader.pmc NAME NOT FOUND
9:08:36.3937813 PM miniperl.exe 1208 CreateFile C:\p519\src\cpan\Test-Harness SUCCESS
9:08:36.3949223 PM miniperl.exe 1208 QueryDirectory C:\p519\src\cpan\Test-Harness\DynaLoader.pmc NO SUCH FILE
9:08:36.3960337 PM miniperl.exe 1208 CloseFile C:\p519\src\cpan\Test-Harness SUCCESS
9:08:36.3976316 PM miniperl.exe 1208 QueryOpen C:\p519\src\cpan\Test-Harness\DynaLoader.pmc NAME NOT FOUND
9:08:36.3992307 PM miniperl.exe 1208 CreateFile C:\p519\src\cpan\Test-Harness\DynaLoader.pm NAME NOT FOUND
9:08:36.4007414 PM miniperl.exe 1208 CreateFile C:\p519\src\cpan\Test-Harness SUCCESS
9:08:36.4019127 PM miniperl.exe 1208 QueryDirectory C:\p519\src\cpan\Test-Harness\DynaLoader.pm NO SUCH FILE
9:08:36.4030441 PM miniperl.exe 1208 CloseFile C:\p519\src\cpan\Test-Harness SUCCESS
9:08:36.4046092 PM miniperl.exe 1208 QueryOpen C:\p519\src\cpan\Test-Harness\DynaLoader.pm NAME NOT FOUND
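
The time spanned by this excerpt can be worked out from the timestamps. The sketch below copies the ten start times from the lines above and measures from the first event to the last; it does not include the duration of the final call or any gap before the next directory is tried, so it is a lower bound for the per-directory cost rather than a re-measurement of it. `to_ticks` is a helper made up for this example.

```perl
use strict;
use warnings;

# Start times of the ten events logged for one missed directory.
my @stamps = qw(
    9:08:36.3922806 9:08:36.3937813 9:08:36.3949223 9:08:36.3960337
    9:08:36.3976316 9:08:36.3992307 9:08:36.4007414 9:08:36.4019127
    9:08:36.4030441 9:08:36.4046092
);

# Convert H:MM:SS.fffffff to integer 100 ns ticks to avoid float rounding.
sub to_ticks {
    my ($stamp) = @_;
    my ($h, $m, $s, $frac) = $stamp =~ /^(\d+):(\d\d):(\d\d)\.(\d{7})$/
        or die "unparseable timestamp: $stamp";
    return (($h * 60 + $m) * 60 + $s) * 10_000_000 + $frac;
}

my @ticks      = map { to_ticks($_) } @stamps;
my $span_ticks = $ticks[-1] - $ticks[0];

# 10_000 ticks of 100 ns make one millisecond.
printf "%d events, first to last: %.4f ms\n",
    scalar @stamps, $span_ticks / 10_000;
```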


By setting a BP at the function that enters kernel mode and running "miniperl -Ilib -e "system('pause'); require FooLoader;"", I'll try to figure out what the syscalls were. The first one, from win32_stat:

+ path 0x00a0b660 "C:\perl519\src\cpan\AutoLoader\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtCreateFile@​44() + 0xc
  kernel32.dll!_CreateFileW@​28() + 0x1b6
  kernel32.dll!_CreateFileA@​28() + 0x2b
  miniperl.exe!win32_stat(const char * path=0x00100080, stat * sbuf=0x0012fe50) Line 1499 + 0xe C
  miniperl.exe!S_doopen_pm(sv * name=0x00070023) Line 3630 + 0x2a C
  miniperl.exe!Perl_pp_require() Line 4012 C
  miniperl.exe!Perl_runops_standard() Line 42 + 0x3 C
  miniperl.exe!S_run_body(long oldscope=0x00000001) Line 2448 C
  miniperl.exe!perl_run(interpreter * my_perl=0x0012ffc0) Line 2362 + 0x8 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x00912478, char * * env=0x00912c90) Line 118 + 0x5 C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0b660 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_ZwOpenFile@​24() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x179
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x0012f810, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0b660, _stati64 * sbuf=0x0012fd48) Line 1510 + 0xe C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3630 + 0x43 C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0b660 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_ZwQueryDirectoryFile@​44() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x245
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x0012f810, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0b660, _stati64 * sbuf=0x0012fd48) Line 1510 + 0xe C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3630 + 0x43 C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0b660 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtClose@​4() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x497
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x0012f810, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0b660, _stati64 * sbuf=0x0012fd48) Line 1510 + 0xe C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3630 + 0x43 C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0b660 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtQueryAttributesFile@​8() + 0xc
  kernel32.dll!_GetFileAttributesW@​4() + 0x67
  kernel32.dll!_GetFileAttributesA@​4() + 0x1d
  miniperl.exe!win32_stat(const char * path=0x00a0b660, _stati64 * sbuf=0x0012fd48) Line 1521 + 0xa C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3630 + 0x43 C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23
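
Putting the five traces above next to the procmon excerpt gives a per-file picture of what win32_stat does for a missing file. The table below is a summary sketch built only from the frames shown in this comment; the assumption that the `.pm` probe repeats the same five calls comes from the procmon log, and the count of 16 missed dirs is the figure from the original report.

```perl
use strict;
use warnings;

# Calls made for one missing file, in the order of the traces above:
# [ Win32 API called (directly or via msvcr71 _stati64), NT syscall ]
my @per_file = (
    [ 'CreateFileA',        'NtCreateFile'          ],
    [ 'FindFirstFileA',     'ZwOpenFile'            ],
    [ 'FindFirstFileA',     'ZwQueryDirectoryFile'  ],
    [ 'FindFirstFileA',     'NtClose'               ],
    [ 'GetFileAttributesA', 'NtQueryAttributesFile' ],
);
my @extensions  = ('pmc', 'pm');    # require probes both per directory
my $dirs_missed = 16;               # buildcustomize dirs ahead of lib

my $per_dir     = scalar(@per_file) * scalar(@extensions);
my $per_require = $per_dir * $dirs_missed;

printf "%-20s %s\n", @$_ for @per_file;
print "syscalls per missed dir: $per_dir\n";
print "syscalls per require before lib is reached: $per_require\n";
```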


*************************now switch to checking .pm, not .pmc anymore****************************************************

+ path 0x00a0a388 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pm" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtCreateFile@​44() + 0xc
  kernel32.dll!_CreateFileW@​28() + 0x1b6
  kernel32.dll!_CreateFileA@​28() + 0x2b
  miniperl.exe!win32_stat(const char * path=0x00a0a388, _stati64 * sbuf=0x0012fce8) Line 1499 + 0x16 C
  miniperl.exe!S_check_type_and_open(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3594 + 0xd C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3633 + 0xd C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0a388 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pm" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_ZwOpenFile@​24() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x179
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x7c927784, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0a388, _stati64 * sbuf=0x0012fce8) Line 1510 + 0xe C
  miniperl.exe!S_check_type_and_open(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3594 + 0xd C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3633 + 0xd C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0a388 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pm" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_ZwQueryDirectoryFile@​44() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x245
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x7c927784, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0a388, _stati64 * sbuf=0x0012fce8) Line 1510 + 0xe C
  miniperl.exe!S_check_type_and_open(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3594 + 0xd C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3633 + 0xd C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0a388 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pm" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtClose@​4() + 0xc
  kernel32.dll!_FindFirstFileExW@​24() + 0x497
  kernel32.dll!_FindFirstFileA@​8() + 0x3e
  msvcr71.dll!_stati64(const char * name=0x7c927784, _stati64 * buf=0x00000000) Line 184 C
  miniperl.exe!win32_stat(const char * path=0x00a0a388, _stati64 * sbuf=0x0012fce8) Line 1510 + 0xe C
  miniperl.exe!S_check_type_and_open(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3594 + 0xd C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3633 + 0xd C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


+ path 0x00a0a388 "C​:\perl519\src\cpan\AutoLoader\lib/FooLoader.pm" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtQueryAttributesFile@​8() + 0xc
  kernel32.dll!_GetFileAttributesW@​4() + 0x67
  kernel32.dll!_GetFileAttributesA@​4() + 0x1d
  miniperl.exe!win32_stat(const char * path=0x00a0a388, _stati64 * sbuf=0x0012fce8) Line 1521 + 0xa C
  miniperl.exe!S_check_type_and_open(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3594 + 0xd C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3633 + 0xd C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


*****************now on a different dir*************************************************
+ path 0x00a049e8 "C​:\perl519\src\dist\Carp\lib/FooLoader.pmc" const char *


  ntdll.dll!_KiFastSystemCall@​0() + 0x2
  ntdll.dll!_NtCreateFile@​44() + 0xc
  kernel32.dll!_CreateFileW@​28() + 0x1b6
  kernel32.dll!_CreateFileA@​28() + 0x2b
  miniperl.exe!win32_stat(const char * path=0x00a049e8, _stati64 * sbuf=0x0012fd48) Line 1499 + 0x16 C
  miniperl.exe!S_doopen_pm(interpreter * my_perl=0x009e3fb0, sv * name=0x009e7b30) Line 3630 + 0x43 C
  miniperl.exe!Perl_pp_require(interpreter * my_perl=0x009e3fb0) Line 4011 + 0xd C
  miniperl.exe!Perl_runops_standard(interpreter * my_perl=0x009e3fb0) Line 42 + 0xa C
  miniperl.exe!S_run_body(interpreter * my_perl=0x009e3fb0, long oldscope=0x00000001) Line 2446 + 0xd C
  miniperl.exe!perl_run(interpreter * my_perl=0x009e3fb0) Line 2365 C
  miniperl.exe!main(int argc=0x00000004, char * * argv=0x009e2478, char * * env=0x009e2c90) Line 118 + 0xb C
  miniperl.exe!mainCRTStartup() Line 398 + 0xe C
  kernel32.dll!_BaseProcessStart@​4() + 0x23


and the loop continues. Note that because these are all "SomethingFileMoreSomethingA" function calls, each A call does an ASCII to UTF-16 conversion before forwarding to its W counterpart. There goes a little bit of CPU. Maybe win32_stat should do the UTF-16 conversion once and afterwards use only the kernel32 W functions and wide-character C library calls, which don't convert.
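To make the trace above easier to follow, here is a minimal Perl sketch of the probe order it shows: for each @INC entry the .pmc path is checked first, then the .pm path. `probe_paths` is a hypothetical helper written for illustration, not code from pp_ctl.c, and it models only the order of the paths, not the individual system calls behind each stat.

```perl
use strict;
use warnings;

# Hypothetical helper: return, in order, the paths that a failing
# `require` would probe for $file across the given @INC directories.
sub probe_paths {
    my ($file, @inc) = @_;
    my @paths;
    for my $dir (@inc) {
        # A *.pm request is preceded by a check for the matching *.pmc.
        push @paths, "$dir/${file}c" if $file =~ /\.pm\z/;
        push @paths, "$dir/$file";
    }
    return @paths;
}

my @inc = ('cpan/AutoLoader/lib', 'dist/Carp/lib', 'lib');  # illustrative subset
print "$_\n" for probe_paths('FooLoader.pm', @inc);
```

Every directory that does not contain the module costs two failed probes, which is why the position of lib in @INC matters for modules that live there.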

On Thu Jan 30 03​:50​:16 2014, nicholas wrote​:

Thanks for ensuring that the patch gets to the list.

No, specifically it's completely unsafe for a parallel make.

You end up with race conditions where the build fails because one process
loads from lib/ a partially written module and aborts, because it happens
just as another process is copying that file there.
(And the rest of the time you don't have a problem because the file is either
not in lib, so loaded from the original dist/... etc, or it's fully copied to
lib, so loaded from there)

Which is exactly what the comment in the file tries to explain.

make_ext has no provisions for parallel building. No select(), no open3. Can you explain?

However, as Win32 doesn't have parallel makes *yet*, and this seems to be a
big speed hit on Win32, I think that it would be reasonable to do the
proposed re-ordering on Win32, with a comment that it needs to be rethought
once anyone starts getting parallel makes working on Win32.

What would that process be?

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

Collaborator Author

commented Jan 31, 2014

From @iabyn

On Thu, Jan 30, 2014 at 11​:49​:41AM -0800, bulk88 via RT wrote​:

On Thu Jan 30 05​:30​:31 2014, davem wrote​:

Perhaps this implies that there's something sub-optimal in the way perl
does its @INC scanning under Win32?

Go to the RT ticket through the web interface and look at the
bad_inc_procmon.txt attachment; it contains strace-like data with wall
time on the very left. The calls are sniffed by a filter driver that
gets first dibs on all I/O packets (since the NT kernel uses a packetized
(or transactional) async I/O model) coming from user mode.
[snip lots of trace output]
and the loop continues. Note that by calling all the
"SomethingFileMoreSomethingA" function calls, each A call does an ASCII
to UTF16 conversion. There goes a lil bit of CPU. Maybe win32_stat
should do a utf16 conversion and use only kernel32W() and wclib() calls
afterwards which don't convert.

Well, what's not clear to me is whether it's the windows system calls that
have the big overhead, or perl doing stuff like UTF16 conversions. I would
suspect the former.

Other than that, I'm not really in a position to contribute further to this
thread. I know almost zero about Windows APIs, so I have no way of knowing
whether perl is using those APIs inefficiently, or whether Windows is just
slow in this area. Or even whether you just have very slow hardware.

On Thu Jan 30 03​:50​:16 2014, nicholas wrote​:

Thanks for ensuring that the patch gets to the list.

No, specifically it's completely unsafe for a parallel make.

You end up with race conditions where the build fails because one process
loads from lib/ a partially written module and aborts, because it happens
just as another process is copying that file there.
(And the rest of the time you don't have a problem because the file is either
not in lib, so loaded from the original dist/... etc, or it's fully copied to
lib, so loaded from there)

Which is exactly what the comment in the file tries to explain.

make_ext has no provisions for parallel building. No select(), no open3. Can you explain?

On UNIX, it is normal to run make with -j N, which will do a parallel
build, with make building up to N targets in parallel. So all the
parallelism is handled by make, not make_ext.

--
If life gives you lemons, you'll probably develop a citric acid allergy.

@p5pRT

Collaborator Author

commented Feb 3, 2014

From @bulk88

According to TonyC, "[23:18] <@TonyC> bulk88: WRT #121119 - I wonder if setting ${^WIN32_SLOPPY_STAT} in buildcustomize.pl would make a noticeable difference". Note to self: investigate.
--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

Collaborator Author

commented Feb 3, 2014

From @Hugmeir

On Thu, Jan 30, 2014 at 12​:49 PM, Nicholas Clark <nick@​ccl4.org> wrote​:

On Thu, Jan 30, 2014 at 11​:25​:28AM +0000, Dave Mitchell wrote​:

On Thu, Jan 30, 2014 at 12​:13​:16AM -0800, bulk88 via RT wrote​:

See attached patch. Each dir being searched that fails on Win32 results
in 10 I/O sys calls, which I counted at 17 ms wall time; times 16
buildcustomize dirs that didn't match, that is an extra 272 ms of wall time
dealing with the not-found dirs during an @INC search. Also, as the build
progresses, more and more things will be found on the 1st try in /lib,
and I presume the buildcustomize dirs become totally unused once XS modules
start being built, since everything needed for XS building (which is a PP
process) will be in /lib and found on the 1st try.

[snip; this cuts the 'make' time by​: ]

(1656 - 1137) / 60 = 8.65 mins
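
The figures quoted above can be checked with a few lines of Perl. This is only a restatement of the arithmetic in the report; the variable names are illustrative.

```perl
use strict;
use warnings;

my $ms_per_failed_dir = 17;    # wall time measured per non-matching @INC dir
my $failed_dirs       = 16;    # buildcustomize dirs searched before a hit
my $before_s          = 1656;  # wallclock seconds, regular build
my $after_s           = 1137;  # wallclock seconds, build with the patch

# Extra wall time spent per @INC search on the not-found dirs.
printf "%d ms per search\n", $ms_per_failed_dir * $failed_dirs;    # 272 ms

# Total build time saved, in minutes.
printf "%.2f minutes saved\n", ($before_s - $after_s) / 60;        # 8.65
```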

[snip]

I left the original comment in place for historical reasons. The comment
might be from something that happened in a Module::Build module back
when M::B was in core.

For the convenience of others, here is the original patch​:

diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..5309988 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -48,13 +48,18 @@ push @toolchain, 'ext/VMS-Filespec/lib' if $^O eq 'VMS';
unshift @INC, @toolchain;
require File::Spec::Functions;

+# former comment
+#
# lib must be last, as the toolchain modules write themselves into it
# as they build, and it's important that @INC order ensures that the partially
# written files are always masked by the complete versions.

my $inc = join ",\n ",
map { "q\0$_\0" }
- (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+# putting lib first shaves a couple minutes off the build time since the most
+# common modules like warnings and strict are in lib, and as extensions are
+# built the chances of the module being found in lib increases
+ (map {File::Spec::Functions::rel2abs($_)} 'lib', @toolchain);

Before such a patch went in, I think we'd have to be sure that the reasons
stated in the code for needing to put lib last don't still apply.
Although I'm not very familiar with lib/buildcustomize.pl, it seems to me
that the reason still stands. Especially on a parallel make, I could
see one instance of miniperl running while another instance is copying
files from ext/Foo to lib, thus allowing it to see empty or truncated
files; or even if the installation of files into lib/ is atomic, a mixture
of files from the same distro, some in lib, some still under ext/ or
whatever, might be seen, and might be harmful.

So this doesn't seem safe to me.

Thanks for ensuring that the patch gets to the list.

No, specifically it's completely unsafe for a parallel make.

You end up with race conditions where the build fails because one process
loads from lib/ a partially written module and aborts, because it
happens just as another process is copying that file there.
(And the rest of the time you don't have a problem because the file is either
not in lib, so loaded from the original dist/... etc, or it's fully copied to
lib, so loaded from there)

Which is exactly what the comment in the file tries to explain.

However, as Win32 doesn't have parallel makes *yet*,

Well, on a technicality, it does​:

http​://perl5.git.perl.org/perl.git/shortlog/refs/heads/hugmeir/cross-compile-win32

Although parallel make on that branch isn't quite functional yet either,
because of some interaction with create_perllibst_h.pl that I haven't
investigated yet, which causes make to abort if run in parallel before
perllib.o is compiled.

I'll be the first to admit that the branch is of dubious value, but if it's
possible not to break it further, that'd be swell.

@p5pRT

Collaborator Author

commented Feb 3, 2014

From @Hugmeir

On Mon, Feb 3, 2014 at 5​:54 AM, Brian Fraser <fraserbn@​gmail.com> wrote​:

On Thu, Jan 30, 2014 at 12​:49 PM, Nicholas Clark <nick@​ccl4.org> wrote​:

On Thu, Jan 30, 2014 at 11​:25​:28AM +0000, Dave Mitchell wrote​:

On Thu, Jan 30, 2014 at 12​:13​:16AM -0800, bulk88 via RT wrote​:

See attached patch. Each dir being searched that fails on Win32 results
in 10 I/O sys calls, which I counted at 17 ms wall time; times 16
buildcustomize dirs that didn't match, that is an extra 272 ms of wall time
dealing with the not-found dirs during an @INC search. Also, as the build
progresses, more and more things will be found on the 1st try in /lib,
and I presume the buildcustomize dirs become totally unused once XS modules
start being built, since everything needed for XS building (which is a PP
process) will be in /lib and found on the 1st try.

[snip; this cuts the 'make' time by​: ]

(1656 - 1137) / 60 = 8.65 mins

[snip]

I left the original comment in place for historical reasons. The comment
might be from something that happened in a Module::Build module back
when M::B was in core.

For the convenience of others, here is the original patch​:

diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..5309988 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -48,13 +48,18 @@ push @toolchain, 'ext/VMS-Filespec/lib' if $^O eq 'VMS';
unshift @INC, @toolchain;
require File::Spec::Functions;

+# former comment
+#
# lib must be last, as the toolchain modules write themselves into it
# as they build, and it's important that @INC order ensures that the partially
# written files are always masked by the complete versions.

my $inc = join ",\n ",
map { "q\0$_\0" }
- (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+# putting lib first shaves a couple minutes off the build time since the most
+# common modules like warnings and strict are in lib, and as extensions are
+# built the chances of the module being found in lib increases
+ (map {File::Spec::Functions::rel2abs($_)} 'lib', @toolchain);

Before such a patch went in, I think we'd have to be sure that the reasons
stated in the code for needing to put lib last don't still apply.
Although I'm not very familiar with lib/buildcustomize.pl, it seems to me
that the reason still stands. Especially on a parallel make, I could
see one instance of miniperl running while another instance is copying
files from ext/Foo to lib, thus allowing it to see empty or truncated
files; or even if the installation of files into lib/ is atomic, a mixture
of files from the same distro, some in lib, some still under ext/ or
whatever, might be seen, and might be harmful.

So this doesn't seem safe to me.

Thanks for ensuring that the patch gets to the list.

No, specifically it's completely unsafe for a parallel make.

You end up with race conditions where the build fails because one process
loads from lib/ a partially written module and aborts, because it
happens just as another process is copying that file there.
(And the rest of the time you don't have a problem because the file is either
not in lib, so loaded from the original dist/... etc, or it's fully copied to
lib, so loaded from there)

Which is exactly what the comment in the file tries to explain.

However, as Win32 doesn't have parallel makes *yet*,

Well, on a technicality, it does​:

http​://perl5.git.perl.org/perl.git/shortlog/refs/heads/hugmeir/cross-compile-win32

Although parallel make on that branch isn't quite functional yet either,
because of some interaction with create_perllibst_h.pl that I haven't
investigated yet, which causes make to abort if run in parallel before
perllib.o is compiled.

I'll be the first to admit that the branch is of dubious value, but if
it's possible not to break it further, that'd be swell.

Actually, on further thought, making the change only take effect for $^O eq
'MSWin32' would work -- at that point during cross-compilation, miniperl is
not yet pretending to be the target system, so it wouldn't kick in.
So never mind that part of my comment; the change would only affect native builds.
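
The Win32-only variant discussed here could look roughly like the following sketch. It is not the committed change: `inc_order` is a hypothetical helper, the OS name is passed in as a parameter purely for illustration (the real script would test `$^O` directly), and the toolchain list is a shortened, illustrative subset.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the @INC order written to buildcustomize.pl.
# On native Win32 builds 'lib' goes first for speed; elsewhere it stays
# last so partially copied files in lib/ are masked during parallel makes.
sub inc_order {
    my ($os, @toolchain) = @_;
    return $os eq 'MSWin32' ? ('lib', @toolchain) : (@toolchain, 'lib');
}

my @toolchain = ('cpan/AutoLoader/lib', 'dist/Carp/lib');  # illustrative subset
print join(', ', inc_order($^O, @toolchain)), "\n";
```

Keeping the default order for every other value of `$^O` leaves parallel makes on *nix, and the cross-compilation branch, untouched.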

@p5pRT

Collaborator Author

commented Feb 3, 2014

From @tonycoz

On Fri Jan 31 03​:31​:05 2014, davem wrote​:

On Thu, Jan 30, 2014 at 11​:49​:41AM -0800, bulk88 via RT wrote​:

On Thu Jan 30 05​:30​:31 2014, davem wrote​:

Perhaps this implies that there's something sub-optimal in the way perl
does its @INC scanning under Win32?

Go to the RT ticket through the web interface and look at the
bad_inc_procmon.txt attachment; it contains strace-like data with wall
time on the very left. The calls are sniffed by a filter driver that
gets first dibs on all I/O packets (since the NT kernel uses a packetized
(or transactional) async I/O model) coming from user mode.
[snip lots of trace output]
and the loop continues. Note that by calling all the
"SomethingFileMoreSomethingA" function calls, each A call does an ASCII
to UTF16 conversion. There goes a lil bit of CPU. Maybe win32_stat
should do a utf16 conversion and use only kernel32W() and wclib() calls
afterwards which don't convert.

Well, what's not clear to me is whether it's the windows system calls that
have the big overhead, or perl doing stuff like UTF16 conversions. I would
suspect the former.

win32_stat() does a fair bit to try and emulate a POSIX stat() - including checking for the number of links and making sure file attributes are properly updated for hard links. Unfortunately, setting ${^WIN32_SLOPPY_STAT} to 1 in buildcustomize.pl didn't make much of a difference in a rough build benchmark (from 712 to 697 seconds, which is probably just noise, considering I used the machine as my desktop).
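
For readers unfamiliar with the variable, the experiment amounts to something like the line below near the top of the generated buildcustomize.pl. This is a sketch of the idea, not the exact change that was benchmarked; the `$^O` guard is an assumption added here so that the snippet is harmless on other platforms.

```perl
# Sketch only: ask Win32 perl not to open the file during stat(), skipping
# the link-count and hard-link attribute handling described above.
BEGIN { ${^WIN32_SLOPPY_STAT} = 1 if $^O eq 'MSWin32' }
```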

Other than that, I'm not really in a position to contribute further to this
thread. I know almost zero about Windows APIs, so I have no way of knowing
whether perl is using those APIs inefficiently, or whether Windows is just
slow in this area. Or even whether you just have very slow hardware.

I'm not sure why perl builds on Win32 are so slow - I get very fast builds in a VM on the same hardware (e.g. an nmake on Win32 took 712 seconds, while a Configure + non-parallel make on an Ubuntu VM took 335 seconds).

On UNIX, it is normal to run make with -j N, which will do a parallel
build, with make building up to N targets in parallel. So all the
parallelism is handled by make, not make_ext.

dmake (the tool, not our dmake makefile) supports parallel builds, but from discussing this on IRC, it's not practical since MSVC will attempt to lock the common PDB file and abort if it fails to do so.

Maybe a flag to skip checking for .pmc files would help, but I wouldn't expect require's .pmc checks to be a noticeable timesink for typical usage.

Tony

@p5pRT

Collaborator Author

commented Feb 3, 2014

From @nwc10

On Sun, Feb 02, 2014 at 09​:30​:49PM -0800, Tony Cook via RT wrote​:

On Fri Jan 31 03​:31​:05 2014, davem wrote​:

Well, what's not clear to me is whether it's the windows system calls that
have the big overhead, or perl doing stuff like UTF16 conversions. I would
suspect the former.

win32_stat() does a fair bit to try and emulate a POSIX stat() - including checking for the number of links and making sure file attributes are properly updated for hard links. Unfortunately, setting ${^WIN32_SLOPPY_STAT} to 1 in buildcustomize.pl didn't make much of a difference in a rough build benchmark (from 712 to 697 seconds, which is probably just noise, considering I used the machine as my desktop).

Sigh. And for most of this we don't need all of that.

The documented behaviour of _ (which I can't find!) is that it re-uses the
last stat buffer. Is the implementation such that this "last stat buffer" is
only the one used by PP filetest code? ie, if it *isn't*, then it seems
viable to replace many internal uses of stat in the C code with something
that isn't doing the full emulation on Win32.

I'm not sure what the cut off should be - looks like the require code in
S_check_type_and_open() is using stat to check the file permissions to
figure out the file type. I know that at least one place wants to know file
length. What is cheap to work out on Win32? And what needs lots of
emulation?

On UNIX, it is normal to run make with -j N, which will do a parallel
build, with make building up to N targets in parallel. So all the
parallelism is handled by make, not make_ext.

Yes, the extension build approach is quite different on *nix from Win32 and
VMS (and has been since both ports were added). make_ext.pl mostly merged
the behaviour of the *three* existing extension building tools, but retained
their calling conventions and uses. (There is only so much one can refactor
at once)

So on *nix, the (generated) Makefile contains a target for each extension to
be built, and all of those targets are dependencies of "all". There are a few
pattern rules about how to build those targets (which use make_ext.pl), but
all the parallelism is handled by make.

dmake (the tool, not our dmake makefile) supports parallel builds, but from discussing this on IRC, it's not practical since MSVC will attempt to lock the common PDB file and abort if it fails to do so.

Maybe a flag to skip checking for .pmc files would help, but I wouldn't expect require's .pmc checks to be a noticeable timesink for typical usage.

I was going to suggest this. IIRC on the Win32 build there's a complete set
of objects compiled for miniperl, with "bootstrapping" C compiler flags, and
a complete second set compiled for the installed perl, with user chosen flags.

(Unlike *nix and VMS, where most objects are re-used for both)

If I have that correct, I'd suggest adding -DPERL_DISABLE_PMC to the
flags for building miniperl on Win32. It's a simple change, doesn't affect
any installed code, and it should speed things up a bit.

Nicholas Clark

@p5pRT

Collaborator Author

commented Feb 4, 2014

From perl5-porters@perl.org

Nicholas Clark wrote​:

The documented behaviour of _ (which I can't find!)

perlfunc I think.

is that it re-uses the
last stat buffer. Is the implementation such that this "last stat buffer" is
only the one used by PP filetest code? ie, if it *isn't*, then it seems
viable to replace many internal uses of stat in the C code with something
that isn't doing the full emulation on Win32.

PL_laststatval and PL_defgv-as-handle seem to be used only in pp functions
(and my_(l)stat, which is only usable from a pp function).

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @tonycoz

On Mon Feb 03 01​:41​:14 2014, nicholas wrote​:

dmake (the tool, not our dmake makefile) supports parallel builds,
but from discussing this on IRC, it's not practical since MSVC will
attempt to lock the common PDB file and abort if it fails to do so.

Maybe a flag to skip checking for .pmc files would help, but I
wouldn't expect require's .pmc checks to be a noticeable timesink for
typical usage.

I was going to suggest this. IIRC on the Win32 build there's a complete set
of objects compiled for miniperl, with "bootstrapping" C compiler flags, and
a complete second set compiled for the installed perl, with user chosen flags.

(Unlike *nix and VMS, where most objects are re-used for both)

If I have that correct, I'd suggest adding -DPERL_DISABLE_PMC to the
flags for building miniperl on Win32. It's a simple change, doesn't affect
any installed code, and it should speed things up a bit.

This made a significant difference. I did 3 runs for a baseline, with build durations of 718, 735 and 727 seconds[1].

I added -DPERL_DISABLE_PMC to the $(MINICORE_OBJ) build command and did another three runs with durations of 643, 698 and 675 seconds.

Taking the median of each, that's a 7% reduction in build time.
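
The 7% figure follows from the six timings above; a short Perl check, with a hypothetical `median` helper, is:

```perl
use strict;
use warnings;

# Median of a list of numbers (middle element, or mean of the two middle ones).
sub median {
    my @s   = sort { $a <=> $b } @_;
    my $mid = int(@s / 2);
    return @s % 2 ? $s[$mid] : ($s[$mid - 1] + $s[$mid]) / 2;
}

my $baseline = median(718, 735, 727);    # 727 seconds
my $no_pmc   = median(643, 698, 675);    # 675 seconds

# Relative reduction in build time, as a percentage of the baseline.
printf "%.1f%% reduction\n", 100 * ($baseline - $no_pmc) / $baseline;
```

With only three runs per configuration and a spread of tens of seconds within each set, the result is indicative rather than precise.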

Tony

[1] this was on my normal desktop, which didn't have a lot of CPU usage, but I guess there was a lot of noise anyway.

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @bulk88

On Mon Feb 03 19​:02​:17 2014, tonyc wrote​:

This made a significant difference, I did 3 runs for a baseline, with
build durations of 718, 735 and 727 seconds[1].

I added -DPERL_DISABLE_PMC to the $(MINICORE_OBJ) build command and
did another three runs with durations of 643, 698 and 675 seconds.

Taking the median of each that's a 7% reduction in build time.

Tony

[1] this was on my normal desktop, which didn't have a lot of CPU
usage, but I guess there was a lot of noise anyway.

Since this ticket is heading in different directions, I'll list the 3 ideas raised in it so far.

1. disabling pmc in miniperl can be done for all OSes, there is also the PERL_IS_MINIPERL macro, so no need to -D it

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

The reason I/O is slow on the machine I used is probably a combination of 2 things. First, a horrible HP RAID controller; I've never figured out why it's slow. Sequential read speed averages 49 MBps and random access time is 8 ms, even though it's full of 15K SCSI drives. Second, I'm using WOW64 (a 32-bit Windows process on a 64-bit OS; C stack params and passed struct *s are copied and extended/truncated by WOW64 before the 64-bit syscall, which probably adds a little overhead). My time improvements in absolute time saved (minutes) will be the largest of any porter because of these 2 negative factors.

Random thoughts on the code: both S_check_type_and_open and S_doopen_pm do SvPV on their SV. In 1 place in pp_require (doopen_pm's caller), SvPVX is done just before doopen_pm, which is a multiple get-magic issue. I know that with no .pmc, 1 sub replaces the other. The whole sv_newmortal and SvSetSV_nosteal sequence is confusing. Why not just cat the "c" onto the incoming SV, then SvCUR it off? The "if (PerlLIO_stat(SvPV_nolen_const(pmcsv)" can become SvPVX, since sv_catpvn guarantees a PV. In .pmc mode, if there is a .pmc, 2 stats are done on it.


  if (!IS_SAFE_PATHNAME(p, len, "require"))
      return NULL;


This is done twice on every path between the 2 funcs. I'm not touching the mess that is called pp_require.
--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @demerphq

On 4 February 2014 15​:34, bulk88 via RT <perlbug-followup@​perl.org> wrote​:

On Mon Feb 03 19​:02​:17 2014, tonyc wrote​:

This made a significant difference, I did 3 runs for a baseline, with
build durations of 718, 735 and 727 seconds[1].

I added -DPERL_DISABLE_PMC to the $(MINICORE_OBJ) build command and
did another three runs with durations of 643, 698 and 675 seconds.

Taking the median of each that's a 7% reduction in build time.

Tony

[1] this was on my normal desktop, which didn't have a lot of CPU
usage, but I guess there was a lot of noise anyway.

Since this ticket is heading in different directions, I'll list the 3 ideas raised in it so far.

1. disabling pmc in miniperl can be done for all OSes, there is also the PERL_IS_MINIPERL macro, so no need to -D it

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

The reason I/O is slow on the machine I used is probably a combination of 2 things.

Is there any chance it could also be that you have a virus scanner
running on your perl build dir?

I seem to recall that when I worked under windows the security types
at work insisted all directories were under virus scanner, so building
perl became nightmarishly slow as the virus scanner rescanned every
source file during the build.

Just a thought.

Yves

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @bulk88

On Mon Feb 03 23​:59​:37 2014, demerphq wrote​:

Is there any chance it could also be that you have a virus scanner
running on your perl build dir?

I seem to recall that when I worked under windows the security types
at work insisted all directories were under virus scanner, so building
perl became nightmarishly slow as the virus scanner rescanned every
source file during the build.

Just a thought.

Yves

It is called TortoiseGit, but it isn't active (no TGitCache.exe process running) ATM. I checked again just now during a make clean: 12-18 ms per @INC dir, with no TGitCache process. I also tried a non-git bleed build dir, with no change in ms per @INC dir. It's still in the 10-20 ms range per dir.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @druud62

On 2014-02-03 10​:40, Nicholas Clark wrote​:

The documented behaviour of _ (which I can't find!)

Partly 'perldoc stat', mostly 'perldoc -f -X'.

is that it re-uses the last stat buffer.

See also how 'lstat' plays a role, and how -B resets it, etc.

--
Ruud

@p5pRT

Collaborator Author

commented Feb 4, 2014

From @tonycoz

On Mon Feb 03 23​:34​:31 2014, bulk88 wrote​:

1. disabling pmc in miniperl can be done for all OSes, there is also
the PERL_IS_MINIPERL macro, so no need to -D it

Not trivially, platforms which use Makefile.SH use the same pp_ctl$(O) for both miniperl and the final perl.

We'd need to do what's done with op.c/opmini.c and perl.c/perlmini.c, and I don't think it's worth it.

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

The reason I/O is slow on the machine I used is probably a combination
of 2 things. The first is a horrible HP RAID controller; I've never
figured out why it's slow. Sequential read speed averages 49 MBps and
random access time is 8 ms, even though it's full of 15K SCSI drives.
The second is that I'm using WOW64 (a 32-bit Windows process on a
64-bit OS: C stack params and passed struct *s are copied and
extended/truncated by WOW64 before the 64-bit syscall, which probably
adds a little overhead). My time improvements in absolute time saved
(minutes) will be the largest of any porter because of these 2
negative factors.

I'm building on fairly modern, if pedestrian hardware - first generation Core i7, 12GB RAM (which was mostly free), SATA spinning rust drive.

I was building 64-bit binaries on a 64-bit OS.

I have another more idle machine (also 64-bit) which I'll probably use for further testing, but that has a much older CPU (Athlon 5200+).[1]

Random thoughts on the code: both S_check_type_and_open and
S_doopen_pm do SvPV on their SV. In 1 place in pp_require (doopen_pm's
caller), SvPVX is done just before doopen_pm, so there is a multiple
magic gets issue. I know that with no .pmc, 1 sub replaces the other.
The whole sv_newmortal and SvSetSV_nosteal is confusing. Why not just
cat the "c" onto the incoming SV, then SvCUR it off? The "if
(PerlLIO_stat(SvPV_nolen_const(pmcsv)" can become SvPVX, since
sv_catpvn guarantees a PV. In .pmc mode, if there is a .pmc, 2 stats
are done on it.

I considered whether, for non-sloppy stat, we should just shortcut to win32_stat() failing if the CreateFileA() fails. But that won't work for directories.

In most cases there isn't a .pmc, so the second stat isn't done.

-------------------------
if (!IS_SAFE_PATHNAME(p, len, "require"))
return NULL;
-------------------------
This is done twice on every path between the 2 funcs. I'm not touching
the mess that is called pp_require.

I considered the performance impact of the duplicate check for this when I added it, but I thought (and still think) that a memchr() against what should be a short string is going to be insignificant against any I/O we do.
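
For a sense of scale, the kind of check being discussed can be sketched in a few lines of C. This is only an illustration of an embedded-NUL test built on memchr(); the helper name path_has_no_embedded_nul is made up here, and it is not the core's IS_SAFE_PATHNAME macro, which also takes an operation name (as in the snippet quoted above) and may do more than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an embedded-NUL pathname check.
 * Returns 1 when the first len bytes of p contain no NUL byte,
 * 0 when a NUL is found inside the counted length.
 * One memchr() pass over a short string, with no I/O at all. */
static int path_has_no_embedded_nul(const char *p, size_t len)
{
    return memchr(p, '\0', len) == NULL;
}
```

Calling it twice per path costs two short memory scans, which is the comparison being made against the stat and open calls that follow.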

Tony

[1] That said, the modern machine has a "windows performance index" of 5.9 while the older one is 5.5. The (more) modern machine is let down by the spinning rust.

@p5pRT

commented Feb 10, 2014

From @bulk88

TLDR: the 3 optimizations combined gave me a 1-(839/1582) = 47% decrease in build time using the faster of the 2 baselines (if I use the slower baseline, the improvement is even higher).

I used a git-cleaned WD for each run. Each run is a make all. I did some benchmarking of the different proposed optimizations on my "slow" machine. Since I need to put it back into normal duty and each make is very long, I could only do 1 run of each permutation (in 2 cases I thought they were PMC_DISABLE runs, but when I checked git afterwards it turned out PMC_DISABLE was on another blead WD). Notice that the numbers do not add up when you compare each optimization individually against the multi-optimization runs and the baseline.

OS caching might be playing a role here, given that /lib first gives the most improvement alone. A WAG says that the Windows kernel keeps the directory last used by the process (or by any process) open until another directory is touched by a syscall. So after the first syscall to touch that dir, all following stats, dir entry enumerations and attempted opens on that directory do not touch the FS driver or disk driver. When perl (or any process) touches another dir, the last-used directory structure is tossed and the FS and disk drivers are called. Note there might be multiple caches at work in the Windows kernel/driver stacks, and IDK where they are and how they specifically act. As I said, I can't afford to do more runs to get an average.

Regarding the strange numbers: 1582 (best baseline) - 1047 (worst /lib first + sloppy stat) = 535 seconds, but 1157 (DISABLE_PMC alone) - 839 (/lib first + sloppy stat + DISABLE_PMC) = 318 seconds. 1 - (318/535) = 40% less gain.
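
The arithmetic above can be redone with two small helpers. This is just a sketch for checking the figures; the function names are made up for illustration, and the only inputs are the wallclock seconds from the runs in this post.

```c
/* Seconds saved going from one wallclock time to another. */
static int seconds_saved(int before, int after)
{
    return before - after;
}

/* Fractional drop from `from` to `to`: 0.40 means 40% less.
 * Used both for build time (1582 -> 839) and for the gain
 * from /lib first + sloppy stat (535 -> 318). */
static double fractional_drop(double from, double to)
{
    return 1.0 - to / from;
}
```

seconds_saved(1582, 1047) is 535 and seconds_saved(1157, 839) is 318; fractional_drop(535, 318) comes to roughly 0.41 and fractional_drop(1582, 839) to roughly 0.47. With a single run per permutation these are indicative only.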

baseline, no optimizations


Extracting find2perl (with variable substitutions)
  ..\miniperl.exe -I..\lib ..\x2p\s2p.PL
Extracting s2p (with variable substitutions)
Linking s2p to psed.
  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm31FA.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1727 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)

  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm84.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1582 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


/lib optimization and sloppy stat


  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm3098.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1047 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)

C​:\p519\srcnew\win32>
  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm32F3.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1030 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


no /lib optimization, sloppy stat only


  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1470 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


/lib optimization only, no sloppy stat


  C​:\p519\srcnew\miniperl.exe "-I..\..\lib" "-I..\..\lib" -MExtUtils​::Comm
and -e chmod -- 755 ..\..\lib\auto\threads\shared\shared.dll
  ..\miniperl.exe -I..\lib ..\x2p\find2perl.PL
Extracting find2perl (with variable substitutions)
  ..\miniperl.exe -I..\lib ..\x2p\s2p.PL
Extracting s2p (with variable substitutions)
Linking s2p to psed.
  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm3276.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1162 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


no /lib optimization, no sloppy stat, DISABLE_PMC only


  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm122.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 1157 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


/lib optimization, sloppy stat, DISABLE_PMC (all 3 on)


Linking s2p to psed.
  link -subsystem​:console -out​:..\x2p\a2p.exe @​C​:\DOCUME1\ADMINI1\LOCALS
~1\Temp\nm1A1.tmp
Generating code
Finished generating code
  if exist ..\x2p\a2p.exe.manifest mt -nologo -manifest ..\x2p\a2p.exe.man
ifest -outputresource​:..\x2p\a2p.exe;1 && if exist ..\x2p\a2p.exe.manifest del
..\x2p\a2p.exe.manifest
Everything is up to date. 'nmake test' to run test suite.
  one​: 839 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)
  (warning​: too few iterations for a reliable count)


patches used for testing


win32/win32.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/win32/win32.h b/win32/win32.h
index 3d1655a..f5f0187 100644
--- a/win32/win32.h
+++ b/win32/win32.h
@@ -20,6 +20,7 @@
  * level in full perl
  */
 #  define WIN32_NO_SOCKETS
+#  define PERL_DISABLE_PMC
 #endif
 
 #ifdef WIN32_NO_SOCKETS

----------------------------------------------------------------------------------------
 write_buildcustomize.pl | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..181a00b 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -54,7 +54,7 @@ require File::Spec::Functions;
 
 my $inc = join ",\n        ",
     map { "q\0$_\0" }
-    (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+    (map {File::Spec::Functions::rel2abs($_)} 'lib', @toolchain);
 
 open my $fh, '>', $file
     or die "Can't open $file: $!";
@@ -74,6 +74,8 @@ print $fh <<"EOT" or $error = "Can't print to $file: $!";
 # We are miniperl, building extensions
 # Replace the first entry of \@INC ("lib") with the list of
 # directories we need.
+
+\${^WIN32_SLOPPY_STAT} = 1;
 splice(\@INC, 0, 1, $inc);
 \$^O = '$osname';
 EOT


-- 

bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Feb 28, 2014

From @nwc10

On Tue, Feb 04, 2014 at 02​:35​:23PM -0800, Tony Cook via RT wrote​:

On Mon Feb 03 23​:34​:31 2014, bulk88 wrote​:

1. disabling pmc in miniperl can be done for all OSes, there is also
the PERL_IS_MINIPERL macro, so no need to -D it

Not trivially, platforms which use Makefile.SH use the same pp_ctl$(O) for both miniperl and the final perl.

We'd need to do what's done with op.c/opmini.c and perl.c/perlmini.c, and I don't think it's worth it.

I'm not convinced that it's worth it either, given that

1) unlike Win32, it's not trivial to do this due to the re-use of pp_ctl$(o)
2) there doesn't seem to be any noticeable speed hit on *nix

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

These seem reasonable, but I don't even know how to do the first, so don't
know how easy it is. The latter *is* easy, but it's 90% testing, which I can't
do either.

On the subject of testing (that I can't) as George's smoker is unhappy again,
could someone on Win32 try out the branch smoke-me/nicholas/fake-pm_to_blib
and verify that it still builds on Win32? (I think that it should work)
It avoids the entire use of ExtUtils​::MakeMaker and make for building about
80 of the "simple" pure-perl modules, so it should speed things up.
Based on the experience of disabling PMCs, I hope that on Win32 it's quite a
bit faster.

I considered whether, for non-sloppy stat, we should just shortcut to win32_stat() failing if the CreateFileA() fails. But that won't work for directories.

The other thought that I had is that the whole investigation with require
showed that a lot of the slowness on Win32 is caused by doing emulation work
that gets thrown away.

Because the internals originated on Unix they assume that all file
metadata comes from a stat() call, hence you can't get any metadata without
a stat() call, hence there's no real need to think of anything "smaller" than
a stat() call, so just make a stat() call then pick out the part you need.

That's clearly not the case on Win32, where emulating parts of the data
stat() returns is possible, but expensive. So the result is that the
"portable" C code happily makes a stat() call, based on the assumptions
above, the stat() emulation code works hard, because it doesn't know the
context, and then the caller throws the work away.

I think it would be useful if someone would audit all the internal uses of
stat(), to see what fields are actually being used. From this, it's probably
going to be obvious that the PerlLIO_stat()/PerlLIO_fstat()/PerlLIO_lstat()
macros should be augmented, maybe with 2 more calls to directly get just
length, and "mode", or maybe something else (like a sloppy-stat(), which
is documented to be guaranteed to only fill in the cheap subset of the stat
structure, and does just that on Win32)
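
One possible shape for such a narrower call, as a rough POSIX-only sketch. The names cheap_stat and cheap_stat_path are hypothetical; nothing like them exists in the core as far as this thread shows. Here the function simply wraps stat() and exposes only the file type and length. The idea would be that a Win32 implementation could fill the same fields without the expensive emulation.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical "cheap subset" of stat data: only the file type
 * and the length are promised to the caller. */
struct cheap_stat {
    int       is_dir;      /* 1 if the path names a directory */
    int       is_regular;  /* 1 if the path names a regular file */
    long long size;        /* length in bytes */
};

/* Returns 0 and fills *out on success, -1 if the path cannot be
 * stat'ed. On POSIX this is a plain stat() underneath; callers
 * simply never see the fields they would not use. */
static int cheap_stat_path(const char *path, struct cheap_stat *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->is_dir     = S_ISDIR(st.st_mode) ? 1 : 0;
    out->is_regular = S_ISREG(st.st_mode) ? 1 : 0;
    out->size       = (long long)st.st_size;
    return 0;
}
```

Which fields belong in the struct is exactly what the proposed audit of internal stat() callers would have to decide.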

This feels like it might make a small general speedup for Win32.

Nicholas Clark

@p5pRT

commented Feb 28, 2014

From @demerphq

On 28 February 2014 16​:47, Nicholas Clark <nick@​ccl4.org> wrote​:

On Tue, Feb 04, 2014 at 02​:35​:23PM -0800, Tony Cook via RT wrote​:

On Mon Feb 03 23​:34​:31 2014, bulk88 wrote​:

1. disabling pmc in miniperl can be done for all OSes, there is also
the PERL_IS_MINIPERL macro, so no need to -D it

Not trivially, platforms which use Makefile.SH use the same pp_ctl$(O) for both miniperl and the final perl.

We'd need to do what's done with op.c/opmini.c and perl.c/perlmini.c, and I don't think it's worth it.

I'm not convinced that it's worth it either, given that

1) unlike Win32, it's not trivial to do this due to the re-use of pp_ctl$(o)
2) there doesn't seem to be any noticeable speed hit on *nix

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

These seem reasonable, but I don't even know how to do the first, so don't
know how easy it is.

${^WIN32_SLOPPY_STAT} = 1 if $^O eq 'MSWin32';

Yves

@p5pRT

commented Feb 28, 2014

From @steve-m-hay

On 28 February 2014 15​:47, Nicholas Clark <nick@​ccl4.org> wrote​:

On Tue, Feb 04, 2014 at 02​:35​:23PM -0800, Tony Cook via RT wrote​:

On Mon Feb 03 23​:34​:31 2014, bulk88 wrote​:

1. disabling pmc in miniperl can be done for all OSes, there is also
the PERL_IS_MINIPERL macro, so no need to -D it

Not trivially, platforms which use Makefile.SH use the same pp_ctl$(O) for both miniperl and the final perl.

We'd need to do what's done with op.c/opmini.c and perl.c/perlmini.c, and I don't think it's worth it.

I'm not convinced that it's worth it either, given that

1) unlike Win32, it's not trivial to do this due to the re-use of pp_ctl$(o)
2) there doesn't seem to be any noticeable speed hit on *nix

2. ${^WIN32_SLOPPY_STAT} for Win32 miniperl only

3. /lib first on Win32 miniperl only

These seem reasonable, but I don't even know how to do the first, so don't
know how easy it is. The latter *is* easy, but it's 90% testing, which I can't
do either.

On the subject of testing (that I can't) as George's smoker is unhappy again,
could someone on Win32 try out the branch smoke-me/nicholas/fake-pm_to_blib
and verify that it still builds on Win32? (I think that it should work)
It avoids the entire use of ExtUtils​::MakeMaker and make for building about
80 of the "simple" pure-perl modules, so it should speed things up.
Based on the experience of disabling PMCs, I hope that on Win32 it's quite a
bit faster.

Yes, that builds and tests ok for me. And the build is visibly quicker
too. I didn't compare timings, but I didn't need to to see that it's
quicker -- always the best kind of speed-up! :-)

@p5pRT

commented Feb 28, 2014

From @nwc10

On Fri, Feb 28, 2014 at 06​:05​:18PM +0000, Steve Hay wrote​:

On 28 February 2014 15​:47, Nicholas Clark <nick@​ccl4.org> wrote​:

On the subject of testing (that I can't) as George's smoker is unhappy again,
could someone on Win32 try out the branch smoke-me/nicholas/fake-pm_to_blib
and verify that it still builds on Win32? (I think that it should work)
It avoids the entire use of ExtUtils​::MakeMaker and make for building about
80 of the "simple" pure-perl modules, so it should speed things up.
Based on the experience of disabling PMCs, I hope that on Win32 it's quite a
bit faster.

Yes, that builds and tests ok for me. And the build is visibly quicker
too. I didn't compare timings, but I didn't need to to see that it's
quicker -- always the best kind of speed-up! :-)

Cool. Thanks for testing.

Builds OK on VMS too. I have no idea if it was faster as I was out.

I'll merge it tomorrow when I'm competent again.

Nicholas Clark

@p5pRT

commented Feb 28, 2014

From @craigberry

On Fri, Feb 28, 2014 at 5​:29 PM, Nicholas Clark <nick@​ccl4.org> wrote​:

Builds OK on VMS too. I have no idea if it was faster as I was out.

There is a smoke running now that should report its time and can be
compared to a recent smoke of blead.

@p5pRT

commented Feb 28, 2014

From @nwc10

On Fri, Feb 28, 2014 at 05​:36​:07PM -0600, Craig A. Berry wrote​:

On Fri, Feb 28, 2014 at 5​:29 PM, Nicholas Clark <nick@​ccl4.org> wrote​:

Builds OK on VMS too. I have no idea if it was faster as I was out.

There is a smoke running now that should report its time and can be
compared to a recent smoke of blead.

Thanks. On HP's system I got these test failures​:

Failed 25 tests out of 2164, 98.84% okay.
  ../cpan/ExtUtils-MakeMaker/t/basic.t
  ../cpan/ExtUtils-MakeMaker/t/fixin.t
  ../cpan/Module-Build/t/PL_files.t
  ../cpan/Module-Build/t/basic.t
  ../cpan/Module-Build/t/compat.t
  ../cpan/Module-Build/t/debug.t
  ../cpan/Module-Build/t/destinations.t
  ../cpan/Module-Build/t/extend.t
  ../cpan/Module-Build/t/install_extra_target.t
  ../cpan/Module-Build/t/manifypods.t
  ../cpan/Module-Build/t/manifypods_with_utf8.t
  ../cpan/Module-Build/t/perl_mb_opt.t
  ../cpan/Module-Build/t/properties/share_dir.t
  ../cpan/Module-Build/t/script_dist.t
  ../cpan/Module-Build/t/test_file_exts.t
  ../cpan/Module-Build/t/test_types.t
  ../cpan/Module-Build/t/use_tap_harness.t
  ../cpan/Test-Harness/t/source_handler.t
  ../cpan/Test-Harness/t/testargs.t
  ../ext/B/t/showlex.t
  ../ext/Devel-Peek/t/Peek.t
  ../ext/Pod-Html/t/cache.t
  re/charset.t
  re/fold_grind.t
  run/locale.t

which I think are 2 more than last time I noted the results (August 2013)
I've not tried to expand any.

(re/*, run/*, Devel​::Peek failures are new(ish). Previous autodie failure is
gone. Most of these failures "solve themselves" when Module​::Build leaves the
core)

Nicholas Clark

@p5pRT

commented Mar 1, 2014

From @bulk88

To get this ticket moving: shall I make a patch with all 3 optimizations above, all applied only on Win32? Given the discussion above, AFAIK the consensus is that none of the 3 optimizations can be done on anything but Win32. Are there any final objections to this proposal that would prevent the future patch from being applied?

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Mar 1, 2014

From @bulk88

On Fri Feb 28 18​:53​:40 2014, bulk88 wrote​:

To get this ticket moving. So I will make a patch with all 3
optimizations above, and all 3 optimizations are done only on Win32?
With all the discussion above, AFAIK the consensus is none of the 3
optimizations can be done on anything but Win32. Any final objections
to the proposal in this post that would prevent the future patch from
being applied?

Since I had time, I made the patch. It passed a harness run for me (me = Win32 platform only).

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Mar 1, 2014

From @bulk88

0001-perl-121119-speed-up-miniperl-require-on-Win32.patch
From 4521b236a01d70c7acc6c1e8bb70a114081ad59c Mon Sep 17 00:00:00 2001
From: Daniel Dragan <bulk88@hotmail.com>
Date: Sat, 1 Mar 2014 06:20:16 -0500
Subject: [PATCH] [perl #121119] speed up miniperl require on Win32

These 3 optimizations reduce the number of, usually failing, I/O calls
for each "require" for miniperl only. None are appropriate except for
Win32. See #121119 for details.
---
 win32/win32.h           | 3 +++
 write_buildcustomize.pl | 6 +++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/win32/win32.h b/win32/win32.h
index 3d1655a..b2cc94d 100644
--- a/win32/win32.h
+++ b/win32/win32.h
@@ -13,6 +13,7 @@
 #  define _WIN32_WINNT 0x0500     /* needed for CreateHardlink() etc. */
 #endif
 
+/* Win32 only optimizations for faster building */
 #ifdef PERL_IS_MINIPERL
 /* this macro will remove Winsock only on miniperl, PERL_IMPLICIT_SYS and
  * makedef.pl create dependencies that will keep Winsock linked in even with
@@ -20,6 +21,8 @@
  * level in full perl
  */
 #  define WIN32_NO_SOCKETS
+/* less I/O calls during each require */
+#  define PERL_DISABLE_PMC
 #endif
 
 #ifdef WIN32_NO_SOCKETS
diff --git a/write_buildcustomize.pl b/write_buildcustomize.pl
index 64bf4ce..cf429a9 100644
--- a/write_buildcustomize.pl
+++ b/write_buildcustomize.pl
@@ -54,7 +54,10 @@ require File::Spec::Functions;
 
 my $inc = join ",\n        ",
     map { "q\0$_\0" }
-    (map {File::Spec::Functions::rel2abs($_)} @toolchain, 'lib');
+    (map {File::Spec::Functions::rel2abs($_)} (
+# faster build on the non-parallel Win32 build process
+        $^O eq 'MSWin32' ? ('lib', @toolchain ) : (@toolchain, 'lib')
+    ));
 
 open my $fh, '>', $file
     or die "Can't open $file: $!";
@@ -74,6 +77,7 @@ print $fh <<"EOT" or $error = "Can't print to $file: $!";
 # We are miniperl, building extensions
 # Replace the first entry of \@INC ("lib") with the list of
 # directories we need.
+${\($^O eq 'MSWin32' ? '${^WIN32_SLOPPY_STAT} = 1;':'')}
 splice(\@INC, 0, 1, $inc);
 \$^O = '$osname';
 EOT
-- 
1.8.0.msysgit.0

@p5pRT

commented Mar 19, 2014

From @bulk88

On Sat Mar 01 03​:22​:06 2014, bulk88 wrote​:

On Fri Feb 28 18​:53​:40 2014, bulk88 wrote​:

To get this ticket moving. So I will make a patch with all 3
optimizations above, and all 3 optimizations are done only on Win32?
With all the discussion above, AFAIK the consensus is none of the 3
optimizations can be done on anything but Win32. Any final objections
to the proposal in this post that would prevent the future patch from
being applied?

Since I had time. I made the patch. Passed a harness run for me
(me=Win32 platform only).

Bump.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Mar 25, 2014

From zefram@fysh.org

Comment copied from the p5p thread on 5.20 blockers​:

I'd be happier about it if we actually resolved the commented reason for
lib being last. It shouldn't be hard to arrange for new files in lib
to be created under temporary names and atomically renamed into place.
This may be a bit much to do for 5.20, though, especially with it having
not been implemented yet. I'd lean towards punting to 5.21.

-zefram

@p5pRT

commented Mar 25, 2014

From @tonycoz

On Tue, Mar 25, 2014 at 09​:06​:58AM +0000, Zefram wrote​:

Comment copied from the p5p thread on 5.20 blockers​:

I'd be happier about it if we actually resolved the commented reason for
lib being last. It shouldn't be hard to arrange for new files in lib
to be created under temporary names and atomically renamed into place.
This may be a bit much to do for 5.20, though, especially with it having
not been implemented yet. I'd lean towards punting to 5.21.

While I think rename into place is a good idea, if Win32 does ever get
parallel builds, lib will still have to go last, so an update of a
module (on a remake) doesn't fail because another process is reading
it.
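
The rename-into-place idea can be sketched as below. This is a minimal illustration in portable C, not what the build actually does; the helper name install_file_atomically and its arguments are made up for the example. Note the platform caveat: ISO C leaves rename() onto an existing file implementation-defined. POSIX replaces the destination atomically, while the Windows C runtime's rename() fails if the destination already exists, so a Win32 version would need a different call.

```c
#include <stdio.h>

/* Hypothetical helper: write the new contents under a temporary
 * name, then rename the finished file to its final name, so a
 * reader never sees a half-written file under final_path.
 * Returns 0 on success, -1 on failure (the temp file is removed). */
static int install_file_atomically(const char *tmp_path,
                                   const char *final_path,
                                   const char *data, size_t len)
{
    FILE *fh = fopen(tmp_path, "wb");

    if (fh == NULL)
        return -1;
    if (fwrite(data, 1, len, fh) != len) {
        fclose(fh);
        remove(tmp_path);
        return -1;
    }
    if (fclose(fh) != 0) {          /* flush errors show up here */
        remove(tmp_path);
        return -1;
    }
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

For the rename to be atomic the temporary name has to live on the same filesystem as the final one, typically in the same directory. It also does nothing for the case above where another process already has the old file open on Win32.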

Tony

@p5pRT

commented Mar 26, 2014

From @bulk88

On Tue Mar 25 02​:07​:22 2014, zefram@​fysh.org wrote​:

Comment copied from the p5p thread on 5.20 blockers​:

I'd be happier about it if we actually resolved the commented reason for
lib being last. It shouldn't be hard to arrange for new files in lib
to be created under temporary names and atomically renamed into place.
This may be a bit much to do for 5.20, though, especially with it having
not been implemented yet. I'd lean towards punting to 5.21.

-zefram

related https://rt-archive.perl.org/perl5/Ticket/Display.html?id=82194 ?

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Mar 26, 2014

From zefram@fysh.org

bulk88 via RT wrote​:

related https://rt-archive.perl.org/perl5/Ticket/Display.html?id=82194 ?

Yes, related.

-zefram

@p5pRT

commented Apr 1, 2014

From @bulk88

I'll add another trace of a very slow @INC search with miniperl. It took 5.2 seconds to do "C:\p519\srckhw\miniperl.exe "-I..\..\lib" "-I..\..\lib" -MExtUtils::Command -e chmod -- 755 blib\man3". The actual IO done on C:\p519\srckhw\dist\ExtUtils-ParseXS\blib\man3 took 8 ms. The remaining 5.2 seconds minus 8 ms was spent reading (most of the time), compiling, and executing modules.

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Apr 1, 2014

@p5pRT

commented Apr 4, 2014

From @bulk88

On Mon Mar 31 17​:08​:58 2014, bulk88 wrote​:

Ill add another very slow @​INC with miniperl trace. 5.2 seconds to do
"C​:\p519\srckhw\miniperl.exe "-I..\..\lib" "-I..\..\lib"
-MExtUtils​::Command -e chmod -- 755 blib\man3". The actual IO done on
C​:\p519\srckhw\dist\ExtUtils-ParseXS\blib\man3 took 8 ms. 5.2 seconds-
8ms was spent reading (most of the time), compiling, and executing
modules.

Is anyone going to review and commit this?

--
bulk88 ~ bulk88 at hotmail.com

@p5pRT

commented Apr 8, 2014

From @steve-m-hay

Thanks, applied as 8ce7a7e.

@p5pRT

commented Apr 8, 2014

@steve-m-hay - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this Apr 8, 2014
@p5pRT

commented Apr 8, 2014

From @steve-m-hay

On 25 March 2014 09​:06, Zefram <zefram@​fysh.org> wrote​:

Comment copied from the p5p thread on 5.20 blockers​:

I'd be happier about it if we actually resolved the commented reason for
lib being last. It shouldn't be hard to arrange for new files in lib
to be created under temporary names and atomically renamed into place.
This may be a bit much to do for 5.20, though, especially with it having
not been implemented yet. I'd lean towards punting to 5.21.

-zefram

Zefram, bulk88's patch is now committed (8ce7a7e). As discussed here​:

http​://www.nntp.perl.org/group/perl.perl5.porters/2014/04/msg214294.html

please can you raise a new ticket for the atomic rename idea above if
you think that is worth pursuing.
