Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition installing modules to lib in parallel builds #11032

Closed
p5pRT opened this issue Jan 13, 2011 · 6 comments
Closed

Race condition installing modules to lib in parallel builds #11032

p5pRT opened this issue Jan 13, 2011 · 6 comments

Comments

@p5pRT
Copy link

@p5pRT p5pRT commented Jan 13, 2011

Migrated from rt.perl.org#82194 (status was 'resolved')

Searchable as RT82194$

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 13, 2011

From @tonycoz

Created by @tonycoz

Rarely, when performing a parallel build of perl there is a race
condition between copying a module into lib/ and another process using
that module.

In smoke logs this is seen as​:

File/Path.pm did not return a true value at ../../lib/AutoSplit.pm line 6.
BEGIN failed--compilation aborted at ../../lib/AutoSplit.pm line 6.
Compilation failed in require at ../../lib/ExtUtils/Install.pm line 6.
BEGIN failed--compilation aborted at ../../lib/ExtUtils/Install.pm line 6.
Compilation failed in require.
BEGIN failed--compilation aborted.
make[1]​: *** [pm_to_blib] Error 255
Unsuccessful make(ext/FileCache)​: code=512 at make_ext.pl line 494.
make​: *** [ext/FileCache/pm_to_blib] Error 25
make​: *** Waiting for unfinished jobs....

File/Path.pm did not return a true value at ../../lib/AutoSplit.pm line 6.
BEGIN failed--compilation aborted at ../../lib/AutoSplit.pm line 6.
Compilation failed in require at ../../lib/ExtUtils/Install.pm line 6.
BEGIN failed--compilation aborted at ../../lib/ExtUtils/Install.pm line 6.
Compilation failed in require.
BEGIN failed--compilation aborted.
make[1]​: *** [pm_to_blib] Error 255
Unsuccessful make(cpan/File-Temp)​: code=512 at make_ext.pl line 494.

ExtUtils/Install.pm did not return a true value.
BEGIN failed--compilation aborted.
make[1]​: *** [pm_to_blib] Error 255
Unsuccessful make(cpan/ExtUtils-Manifest)​: code=512 at make_ext.pl line 494.
make​: *** [cpan/ExtUtils-Manifest/pm_to_blib] Error 25
make​: *** Waiting for unfinished jobs....
Unable to make anything but miniperl in this configuration

File/Spec.pm did not return a true value at ../../lib/File/Path.pm line 8.
BEGIN failed--compilation aborted at ../../lib/File/Path.pm line 8.
Compilation failed in require at ../../lib/AutoSplit.pm line 6.
BEGIN failed--compilation aborted at ../../lib/AutoSplit.pm line 6.
Compilation failed in require at ../../lib/ExtUtils/Install.pm line 6.
BEGIN failed--compilation aborted at ../../lib/ExtUtils/Install.pm line 6.
Compilation failed in require.
BEGIN failed--compilation aborted.

I ran a loop​:

  while git clean -dxf && ./Configure -des -Dusedevel && make -j6 ; do true ; done

On the 490th run​:

make[1]​: Entering directory `/home/tony/dev/perl/git/perl/cpan/ExtUtils-MakeMake
r'
ExtUtils/Command.pm did not return a true value.
BEGIN failed--compilation aborted.
make[1]​: *** [blib/man3/.exists] Error 255
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/dist/ExtUtils-ParseXS'
make config PERL_CORE=1 LIBPERL_A=libperl.a failed, continuing anyway...
Making all in dist/ExtUtils-ParseXS
make all PERL_CORE=1 LIBPERL_A=libperl.a
make[1]​: Entering directory `/home/tony/dev/perl/git/perl/dist/ExtUtils-ParseXS'
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/cpan/ExtUtils-Constant'
ExtUtils/Command.pm did not return a true value.
BEGIN failed--compilation aborted.
./miniperl -Ilib make_ext.pl ext/FileCache/pm_to_blib MAKE=make LIBPERL_A=libperl.a
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/cpan/ExtUtils-Manifest'
Making all in cpan/ExtUtils-Manifest
make all PERL_CORE=1 LIBPERL_A=libperl.a
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/dist/ExtUtils-Install'
make[1]​: *** [blib/man3/.exists] Error 255
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/dist/ExtUtils-ParseXS'
Unsuccessful make(dist/ExtUtils-ParseXS)​: code=512 at make_ext.pl line 494.
cp lib/ExtUtils/Command.pm ../../lib/ExtUtils/Command.pm
  Making FileCache (all)

Creating Makefile.PL in ext/FileCache for FileCache

Running Makefile.PL in ext/FileCache
../../miniperl Makefile.PL INSTALLDIRS=perl INSTALLMAN1DIR=none INSTALLMAN3DIR=none PERL_CORE=1 LIBPERL_A=libperl.a
make​: *** [dist/ExtUtils-ParseXS/pm_to_blib] Error 25
make​: *** Waiting for unfinished jobs....
make[1]​: Entering directory `/home/tony/dev/perl/git/perl/cpan/ExtUtils-Manifest'
cp lib/ExtUtils/MM_OS2.pm ../../lib/ExtUtils/MM_OS2.pm
cp lib/ExtUtils/MakeMaker.pm ../../lib/ExtUtils/MakeMaker.pm
cp lib/ExtUtils/MM_VOS.pm ../../lib/ExtUtils/MM_VOS.pm
cp lib/ExtUtils/MM_Unix.pm ../../lib/ExtUtils/MM_Unix.pm
cp lib/ExtUtils/Mksymlists.pm ../../lib/ExtUtils/Mksymlists.pm
<snip>

ExtUtils​::Install​::pm_to_blib reports the copy after performing the
copy, so it's possible here that lib/ExtUtils/Command.pm has been
created, but has only been partially copied at the point where the
other processes try to load it.

To test this, I modified ExtUtils/Install.pm​:

--- a/dist/ExtUtils-Install/lib/ExtUtils/Install.pm
+++ b/dist/ExtUtils-Install/lib/ExtUtils/Install.pm
@​@​ -1219,8 +1219,9 @​@​ sub pm_to_blib {
  run_filter($pm_filter, $from, $to);
  print "$pm_filter <$from >$to\n";
  } else {
+ print STDERR "-cp $from $to\n";
  _copy( $from, $to );
- print "cp $from $to\n";
+ print STDERR "+cp $from $to\n";
  }
  my($mode,$atime,$mtime) = (stat $from)[2,8,9];
  utime($atime,$mtime+$Is_VMS,$to);

After 209 builds I get​:

+cp lib/File/Spec/Epoc.pm ../../lib/File/Spec/Epoc.pm
-cp lib/File/Spec/Cygwin.pm ../../lib/File/Spec/Cygwin.pm
+cp lib/File/Spec/Cygwin.pm ../../lib/File/Spec/Cygwin.pm
-cp lib/File/Spec.pm ../../lib/File/Spec.pm
Processing extracted/DBinaryProperties.txt
File/Spec.pm did not return a true value at ../../lib/ExtUtils/ParseXS.pm line 7
.
BEGIN failed--compilation aborted at ../../lib/ExtUtils/ParseXS.pm line 7.
Compilation failed in require at ../../lib/ExtUtils/xsubpp line 4.
BEGIN failed--compilation aborted at ../../lib/ExtUtils/xsubpp line 4.
make[1]​: *** [Zlib.c] Error 255
make[1]​: Leaving directory `/home/tony/dev/perl/git/perl/cpan/Compress-Raw-Zlib'
+cp lib/File/Spec.pm ../../lib/File/Spec.pm
-cp Cwd.pm ../../lib/Cwd.pm
+cp Cwd.pm ../../lib/Cwd.pm

As a further test I applied the following change locally​:

--- a/dist/ExtUtils-Install/lib/ExtUtils/Install.pm
+++ b/dist/ExtUtils-Install/lib/ExtUtils/Install.pm
@​@​ -535,8 +535,11 @​@​ sub _copy {
  printf "copy(%s,%s)\n", $from, $to;
  }
  if (!$dry_run) {
- File​::Copy​::copy($from,$to)
- or Carp​::croak( _estr "ERROR​: Cannot copy '$from' to '$to'​: $!" );
+ my $work = "$to.work";
+ File​::Copy​::copy($from,$work)
+ or Carp​::croak( _estr "ERROR​: Cannot copy '$from' to '$work'​: $!" )
+ rename($work, $to)
+ or Carp​::croak( _estr "ERROR​: Cannot rename '$work' to '$to'​: $!" );
  }
}

I ran my build loop again, and killed it after 2043 successful builds.

I don't expect the change above is safe for blead​:

- the extra . in filename may be a problem for some systems, or the
  extra file length

- ExtUtils​::Install​::_copy() is also used for installion to the final
  location, the change above may cause permissions problems if the
  file already exists

- some (non-CORE) modules may have filenames where the .work cause a
  conflict.

For a real solution​:

a) some mechanism to make a non-conflicting filename

b) some mechanism to retain ownership and permissions on the target file

or

c) perhaps use copy and rename only for building the perl core

or

d) ignore the problem, it only occurs rarely

Perl Info

Flags:
    category=core
    severity=low

Site configuration information for perl 5.13.6:

Configured by tony at Sat Oct 23 11:02:59 EST 2010.

Summary of my perl5 (revision 5 version 13 subversion 6) configuration:
  Derived from: 4a46fe99d0b8e1b85f7c17c5c73894e520adbf9e
  Platform:
    osname=linux, osvers=2.6.26-2-amd64, archname=x86_64-linux
    uname='linux mars 2.6.26-2-amd64 #1 smp thu sep 16 15:56:38 utc 2010 x86_64 gnulinux '
    config_args='-des -Dusedevel'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.3.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -lgdbm -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'

Locally applied patches:
    


@INC for perl 5.13.6:
    lib
    /usr/local/lib/perl5/site_perl/5.13.6/x86_64-linux
    /usr/local/lib/perl5/site_perl/5.13.6
    /usr/local/lib/perl5/5.13.6/x86_64-linux
    /usr/local/lib/perl5/5.13.6
    .


Environment for perl 5.13.6:
    HOME=/home/tony
    LANG=en_AU.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/tony/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 17, 2011

From @nwc10

On Thu, Jan 13, 2011 at 01​:22​:40AM -0800, Tony Cook wrote​:

For a real solution​:

a) some mechanism to make a non-conflicting filename

b) some mechanism to retain ownership and permissions on the target file

or

c) perhaps use copy and rename only for building the perl core

or

d) ignore the problem, it only occurs rarely

I think that actually the correct solution is

e) Patch ExtUtils​::MakeMaker not to add ../../lib to miniperl's @​INC when
  building under PERL_CORE

[given that all the relevant modules are now added to @​INC by make_ext.pl
using $ENV{PERL5LIB}]

because I think that the race condition is that ../../lib is first in @​INC,
so a partially copied module is getting found if it happens that a second
process is loading *that* module at just the same time as the first process
is copying it into lib.

But I'm not sure what the current state of patching ExtUtils​::MakeMaker is.

Nicholas Clark

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Jan 17, 2011

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 8, 2014

From @nwc10

On Mon, Jan 17, 2011 at 10​:49​:43AM +0000, Nicholas Clark wrote​:

On Thu, Jan 13, 2011 at 01​:22​:40AM -0800, Tony Cook wrote​:

For a real solution​:

a) some mechanism to make a non-conflicting filename

b) some mechanism to retain ownership and permissions on the target file

or

c) perhaps use copy and rename only for building the perl core

or

d) ignore the problem, it only occurs rarely

I think that actually the correct solution is

e) Patch ExtUtils​::MakeMaker not to add ../../lib to miniperl's @​INC when
building under PERL_CORE

[given that all the relevant modules are now added to @​INC by make_ext.pl
using $ENV{PERL5LIB}]

because I think that the race condition is that ../../lib is first in @​INC,
so a partially copied module is getting found if it happens that a second
process is loading *that* module at just the same time as the first process
is copying it into lib.

Isn't this bug fixed by this change? (if so, my fault for not closing it)

commit 5e4c4c9
Author​: Nicholas Clark <nick@​ccl4.org>
Date​: Mon Feb 14 10​:14​:18 2011 +0000

  Use a buildcustomize.pl to set @​INC in miniperl when building extensions.
 
  With the build tools now shipped in various subdirectories of cpan/ and dist/
  we need to add several paths to @​INC when invoking MakeMaker (etc) to build
  extensions.
 
  The previous approach of using $ENV{PERL5LIB} was fragile, because​:
  a​: It was hitting the length limit for %ENV variables on VMS
  b​: It was running the risk of race conditions in a parallel build -
  ExtUtils​::Makemaker "knows" to add -I../..lib, which puts lib at the *front*
  of @​INC, but if one parallel process happens to copy a module into lib/
  whilst another is searching for it, the second may get a partial read
  c​: Overwriting $ENV{PERL5LIB} breaks any system where any of the installed
  build tools are actually implemented in Perl, if they are relying on
  $ENV{PERL5LIB} for setup
 
  This approach
 
  a​: Doesn't have %ENV length limits
  b​: Ensures that lib/ is last, so copy targets are always shadowing copy
  sources
  c​: Only affects miniperl, and doesn't touch $ENV{PERL5LIB}
 
  Approaches that turned out to have fatal flaws​:
 
  1​: Using $ENV{PERL5OPT} with a module fails because ExtUtils​::MakeMaker
  searches for the build perl without setting lib, and treats the error
  caused by a failed -M as "not a valid perl 5 binary"
  2​: Refactoring ExtUtils​::MakeMaker to *not* use -I for lib, and instead rely
  on $ENV{PERL5LIB} [which includes "../../lib"] fails because​:
  some extensions have subdirectories, and on these EU​::MM correctly uses
  -I../../../lib, where as $ENV{PERL5LIB} only has space for relative paths,
  and only with two levels.
 
  This approach actually takes advantage of ExtUtils​::MakeMaker setting an -I
  option correct for the depth of directory being built.

Nicholas Clark

@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 8, 2014

From @tonycoz

On Tue Apr 08 03​:09​:34 2014, nicholas wrote​:

Isn't this bug fixed by this change? (if so, my fault for not closing
it)

Yes, I think so.

I knew about the commit (or at least the changed in behaviour) when I last looked at this ticket and it didn't occur to me either.

Thanks, closing.

Tony

@p5pRT p5pRT closed this as completed Apr 8, 2014
@p5pRT
Copy link
Author

@p5pRT p5pRT commented Apr 8, 2014

@tonycoz - Status changed from 'open' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant