Strange behavior when forking in BEGIN #15254

Open
p5pRT opened this issue Mar 28, 2016 · 14 comments

p5pRT commented Mar 28, 2016

Migrated from rt.perl.org#127794 (status was 'open')

Searchable as RT127794$

p5pRT commented Mar 28, 2016

From @exodist

Created by @exodist

This script should output "About to fork" and "Hi" 10 times each. Instead it
prints "About to fork" 10 times, but "Hi" only once.

#!/usr/bin/env perl

  BEGIN {
      my $start = $$;
      for ( 1 .. 10 ) {
          my $pid = fork;
          if ($pid) {
              print "About to fork\n";
              waitpid($pid, 0);
          }
          else {
              last;
          }
      }
      exit 0 if $$ == $start;
  }

  print "Hi\n";

Here is where it gets even MORE interesting: add these 3 lines to the end
of the script and it prints "Hi" twice:

  __END__

  print "Hi\n";

But wait, there's more! Add those 3 lines again, so it looks like this:

  ...
  print "Hi\n";

  __END__

  print "Hi\n";

  __END__

  print "Hi\n";

And bam, it prints "Hi\n" all 10 times.

This looks like a filehandle fork bug to me, but I don't really know much
about these things.

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.20.3:

Configured by exodist at Sat Mar  5 16:04:14 PST 2016.

Summary of my perl5 (revision 5 version 20 subversion 3) configuration:

  Platform:
    osname=linux, osvers=4.4.3-1-arch, archname=x86_64-linux-thread-multi
    uname='linux abydos 4.4.3-1-arch #1 smp preempt fri feb 26 15:09:29 cet
2016 x86_64 gnulinux '
    config_args='-de -Dprefix=/home/exodist/perl5/perlbrew/perls/main
-Dusethreads -Aeval:scriptdir=/home/exodist/perl5/perlbrew/perls/main/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe
-fstack-protector -I/usr/local/include'
    ccversion='', gccversion='5.3.0', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib
/usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include-fixed /usr/lib
/lib/../lib /usr/lib/../lib /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -lnm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
-lgdbm_compat
    perllibs=-lpthread -lnsl -lnm -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.23.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.23'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib
-fstack-protector'



@INC for perl 5.20.3:
    /home/exodist/.perlbrew/libs/main@exodist
/lib/perl5/x86_64-linux-thread-multi
    /home/exodist/.perlbrew/libs/main@exodist/lib/perl5

/home/exodist/perl5/perlbrew/perls/main/lib/site_perl/5.20.3/x86_64-linux-thread-multi
    /home/exodist/perl5/perlbrew/perls/main/lib/site_perl/5.20.3

/home/exodist/perl5/perlbrew/perls/main/lib/5.20.3/x86_64-linux-thread-multi
    /home/exodist/perl5/perlbrew/perls/main/lib/5.20.3
    .


Environment for perl 5.20.3:
    HOME=/home/exodist
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/exodist/.perlbrew/libs/main@exodist
/bin:/home/exodist/perl5/perlbrew/bin:/home/exodist/perl5/perlbrew/perls/main/bin:/home/exodist/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
    PERL5LIB=/home/exodist/.perlbrew/libs/main@exodist/lib/perl5
    PERLBREW_BASHRC_VERSION=0.73
    PERLBREW_HOME=/home/exodist/.perlbrew
    PERLBREW_LIB=exodist
    PERLBREW_MANPATH=/home/exodist/.perlbrew/libs/main@exodist
/man:/home/exodist/perl5/perlbrew/perls/main/man
    PERLBREW_PATH=/home/exodist/.perlbrew/libs/main@exodist
/bin:/home/exodist/perl5/perlbrew/bin:/home/exodist/perl5/perlbrew/perls/main/bin
    PERLBREW_PERL=main
    PERLBREW_ROOT=/home/exodist/perl5/perlbrew
    PERLBREW_VERSION=0.74
    PERL_BADLANG (unset)
    PERL_LOCAL_LIB_ROOT=/home/exodist/.perlbrew/libs/main@exodist
    PERL_MB_OPT=--install_base /home/exodist/.perlbrew/libs/main@exodist
    PERL_MM_OPT=INSTALL_BASE=/home/exodist/.perlbrew/libs/main@exodist
    SHELL=/usr/bin/zsh

p5pRT commented Mar 28, 2016

From @mauke

Inherited filehandles share positions. BEGIN blocks run at parse time,
before the rest of the code has been read. The first child process will
read the rest of the code, parse it, print "Hi", and exit.

All other processes will continue parsing from where the last child left
off (because they share their read position in the source file).

It's nonintuitive but I'm not sure if this is even a bug.

--
Lukas Mai <plokinom@​gmail.com>
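
The claim that inherited filehandles share positions can be checked in isolation. The following is a minimal sketch, assuming a POSIX system; the helper name `parent_offset_after_child_read` and the temp-file contents are made up for illustration. It uses `sysread`/`sysseek` so that PerlIO buffering does not get in the way.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: returns the parent's file offset after a forked
# child has read $nbytes from the handle it inherited.
sub parent_offset_after_child_read {
    my ($nbytes) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);
    syswrite($fh, "0123456789") // die "write failed: $!";
    sysseek($fh, 0, SEEK_SET) or die "seek failed: $!";

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: an unbuffered read advances the offset shared with the parent.
        sysread($fh, my $buf, $nbytes);
        POSIX::_exit(0);    # skip END blocks and destructors in the child
    }

    waitpid($pid, 0);

    # The parent never read anything itself; ask where its offset is now.
    return sysseek($fh, 0, SEEK_CUR) + 0;
}

print "parent offset: ", parent_offset_after_child_read(4), "\n";
```

The parent should report an offset of 4 even though only the child read from the file, which is the same effect that lets the first child "use up" the rest of the script in the report above.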

p5pRT commented Mar 28, 2016

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Mar 28, 2016

From @exodist

How do the two __END__ sections factor into this? Why does adding 1 allow 2
to say 'hi', then adding a third let them all work?

p5pRT commented Mar 28, 2016

From arodland@cpan.org

On Sun Mar 27 23:45:15 2016, exodist7@gmail.com wrote:

> How do the two __END__ sections factor into this? Why does adding 1 allow 2
> to say 'hi', then adding a third let them all work?

Simplest cases here -- other behavior is possible due to races, but not seen because the script is so small and runs so quickly.

With no __END__​:

1) The parent reads to the closing brace of the BEGIN, compiles and begins executing the BEGIN, and forks with the script fd at the position after the BEGIN (PERL_FLUSHALL_FOR_CHILD ensures that the children see this

2) The first child to run reads the remainder of the script, compiles and runs it, and exits.

3) The remaining children see an fd that's already at EOF and exit because there's no more script.

With one __END__​:

1) As above

2) The first child to run reads up to the __END__, compiles, runs, and exits. On exit the handle is flushed, leaving the fd at the position after the __END__ token.

3) The next child to run reads the remainder of script *after* the __END__, compiles and runs it, and exits.

4) The remaining children see EOF and do nothing.

With two __END__​:

1) and 2) as above.

3) As above, except that the next child to run miscomputes its position in the file (basically, adding its last known position to the amount of buffer it's consumed, except it's mistaken about the last known position), and the flush on exit happens to leave the fd positioned just after the __END__ token for the next process. If you vary the length of the content before and after the __END__ you can provoke different behavior, often syntax errors, as they start reading your code mid-line.

4) The rest of the children proceed as 3), all reading the same content and seeking the fd back to the same point.

p5pRT commented Mar 28, 2016

From arodland@cpan.org

On Mon Mar 28 00:13:39 2016, arodland wrote:

> 1) The parent reads to the closing brace of the BEGIN, compiles and
> begins executing the BEGIN, and forks with the script fd at the
> position after the BEGIN (PERL_FLUSHALL_FOR_CHILD ensures that the
> children see this

I left this thought unfinished: PERL_FLUSHALL_FOR_CHILD ensures that the children see this even though PerlIO is using buffered I/O. "Flushing" an input handle seeks the underlying fd to PerlIO's logical read position and discards the unread buffer, so that the next low-level read will pick up where PerlIO left off.
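
The gap between PerlIO's logical read position and the underlying fd position can be observed from Perl itself. This is a rough sketch, assuming a POSIX system and the default buffered layers; the helper name `positions_after_one_line` is invented here. It only shows the two positions diverging after a buffered read; it does not call or reproduce PERL_FLUSHALL_FOR_CHILD.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical helper: returns (logical position, fd position) after a
# single buffered readline on a small three-line file.
sub positions_after_one_line {
    my ($out, $path) = tempfile(UNLINK => 1);
    print {$out} "line one\n", "line two\n", "line three\n";
    close($out) or die "close failed: $!";

    open(my $in, '<', $path) or die "open failed: $!";
    my $first = <$in>;    # PerlIO fills its buffer, reading past this line

    my $logical = tell($in);                       # PerlIO's logical position
    my $fd_pos  = sysseek($in, 0, SEEK_CUR) + 0;   # kernel offset of the fd
    close($in);

    return ($logical, $fd_pos);
}

my ($logical, $fd_pos) = positions_after_one_line();
print "logical=$logical fd=$fd_pos\n";
```

The logical position sits just after the first line, while the fd offset is further along because the buffer was filled in one larger read. The flush described above is what pulls the fd offset back to the logical position before the fork.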

p5pRT commented Dec 16, 2017

From zefram@fysh.org

This kind of strangeness is only what should be expected from forking
in a BEGIN block. This ticket should be closed.

-zefram

p5pRT commented Dec 17, 2017

From @hvds

On Sat, 16 Dec 2017 00:40:06 -0800, zefram@fysh.org wrote:

> This kind of strangeness is only what should be expected from forking
> in a BEGIN block. This ticket should be closed.

What who should expect - someone intimately familiar with perl internals, or someone who uses perl and (on a good day) may have read `perldoc -f fork`?

The preceding discussion suggests to me that an attempt to fork() at BEGIN time should yield at least a warning (and maybe a fatal error) to reflect the fact perl will not be able to honour a reasonable non-expert's expectations of what that means.

Hugo

p5pRT commented Dec 17, 2017

From @eserte

On Sun, 17 Dec 2017 02:10:02 -0800, hv wrote:

> On Sat, 16 Dec 2017 00:40:06 -0800, zefram@fysh.org wrote:
>
> > This kind of strangeness is only what should be expected from forking
> > in a BEGIN block. This ticket should be closed.
>
> What who should expect - someone intimately familiar with perl
> internals, or someone who uses perl and (on a good day) may have read
> `perldoc -f fork`?
>
> The preceding discussion suggests to me that an attempt to fork() at
> BEGIN time should yield at least a warning (and maybe a fatal error)
> to reflect the fact perl will not be able to honour a reasonable
> non-expert's expectations of what that means.

Probably there are legitimate uses of fork() in a BEGIN block --- I would expect that a fork+exec here is harmless.

Regards,
  Slaven
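
As an illustration of the fork+exec case mentioned above, here is a small sketch. It is an assumption-laden example rather than a guarantee of harmlessness: the child replaces itself with a fresh perl via `exec` before control returns to the parser, so it never reads the remainder of this file. The variable name `$child_status` is made up for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

our $child_status;

BEGIN {
    require POSIX;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace the process image immediately. It never goes back
        # to parsing this script, so it does not consume the shared source.
        exec { $^X } $^X, '-e', 'print "child ran\n"'
            or POSIX::_exit(127);    # only reached if exec itself fails
    }

    # Parent: wait for the child, then let compilation continue as usual.
    waitpid($pid, 0);
    $child_status = $?;
}

print "parent continues, child status $child_status\n";
```

Only the parent goes on to compile and run the code after the BEGIN block, which is why this pattern avoids the confusion shown in the original report.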

p5pRT commented Dec 18, 2017

From @exodist

I have code that forks in BEGIN and works around this issue, so I would oppose
making fork in BEGIN fatal. That said, I would be fine with a warning which
I could disable in the cases where I really need to.

I would still greatly prefer some kind of fix, though I understand
that is probably not gonna happen.

-Chad

p5pRT commented Dec 18, 2017

From @xsawyerx

Could you expand on your use-case, Chad?

p5pRT commented Dec 18, 2017

From @exodist

Yes. The real-world code that forks in BEGIN is Test2::Harness, AKA yath.

To summarize:
* Yath script starts
* In a BEGIN block (a use statement, to be precise)
  * Yath figures out what tests need to be run, and what modules should be
    preloaded
  * Yath preloads modules
  * Yath forks, in the BEGIN block, for each test to be run (process
    management hand-waving here)
  * In the child process, still in the BEGIN, yath uses a source filter
    (MUST BE COMPILE TIME) to replace the rest of the yath script with the
    test to actually be run
* BEGIN time ends, and run-time runs the test file

Basically Yath is a preloading Test Harness. It preloads some modules, then
forks for each test. Problem is that a LOT of tests depend on being the
bottom of the stack. A stack trace should not find anything beneath the
test. Yath accomplishes this by preloading in a BEGIN block then using
magic to essentially swap out the rest of the initial script with the
contents of the test file to be run.

Script​: https://github.com/Test-More/Test2-Harness/blob/master/scripts/yath
Post fork magic line​:
https://github.com/Test-More/Test2-Harness/blob/master/lib/App/Yath/Command/spawn.pm#L77
The encapsulation of the magic​:
https://github.com/exodist/goto-file/blob/master/lib/goto/file.pm

It was when trying to invent all this that I first noticed the fork-in-BEGIN
bug as I reported it. As you can see, I now have a working system that
does what I need, bypassing the issue. Specifically, each child ignores the
rest of the file that would be read (by reading whatever it can and
throwing it all away). The parent process works fine because it also throws
away the rest of the file and then uses the same source filter to re-inject
the rest of the file (easy, as it is 1 line).

Sorry if this is clear as mud. It is hard to simplify it for discussion.

p5pRT commented Dec 19, 2017

From zefram@fysh.org

Chad Granum wrote:

> That said, I would greatly prefer some kind of fix, though I understand
> that is probably not gonna happen.

Indeed. A proper fix would amount to the child process getting a clone
of the open file description for the source file, as opposed to a clone
of the file descriptor referring to the same open file description. It's
difficult to draw the line regarding which open file descriptions should
be cloned​: things opened by the program might want the same treatment
as source, but many things want the default sharing behaviour. But that
doesn't matter, because there's no way to clone open file descriptions.

Failing that, the next best thing would be to detect when a conflict
occurs. You'd want to detect at read time that something else has
performed a read on the same open file description since the last read
you know about. But that's also impossible. For regular files you
could look at the file position, but that's subject to race condition,
and it doesn't apply at all to pipes.

So no fix is going to happen.

-zefram
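
The difference between a file descriptor and an open file description can be shown with ordinary handles. This is a sketch under assumptions: POSIX semantics, a regular file that still has a name, and an invented helper `offsets_after_read`. A dup'ed handle is a second descriptor for the same description, so it shares the offset. Opening the path again creates an unrelated description, which is not a clone and is not possible for pipes or files without a usable name.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical helper: after reading 4 bytes through the original handle,
# returns (offset seen via a dup'ed handle, offset seen via a re-opened one).
sub offsets_after_read {
    my ($orig, $path) = tempfile(UNLINK => 1);
    syswrite($orig, "0123456789") // die "write failed: $!";
    sysseek($orig, 0, SEEK_SET) or die "seek failed: $!";

    open(my $dup, '<&', $orig) or die "dup failed: $!";       # same description
    open(my $new, '<',  $path) or die "re-open failed: $!";   # new description

    sysread($orig, my $buf, 4) // die "read failed: $!";

    # "+ 0" turns sysseek's "0 but true" into a plain 0.
    return map { sysseek($_, 0, SEEK_CUR) + 0 } ($dup, $new);
}

my ($dup_off, $new_off) = offsets_after_read();
print "dup'ed handle offset: $dup_off, re-opened handle offset: $new_off\n";
```

The dup'ed handle follows the original to offset 4, while the re-opened handle stays at 0. A forked child's copy of the source handle behaves like the dup'ed one.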

p5pRT commented Dec 19, 2017

From @exodist

I can live without a fix. I just want to make sure no fatal errors occur
that will block my current code.

It would also be nice to have some XS code that lets you tell and seek on the
current source file's handle/descriptor, as well as XS code to close it,
re-open the file, seek to the right place, and use that instead of the
original. Such low-level access would allow people (like me) who want to
fork in BEGIN to manage the source file's internal handle directly.

-Chad
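
The tell / close / re-open / seek sequence described above can be sketched with an ordinary filehandle. This is only an analogy: it does not touch perl's internal source handle (the thread indicates no such access exists), and the helper name `parent_offset_with_private_child_handle` is hypothetical. It assumes a POSIX system and a regular file that can be re-opened by name.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: the parent parks its handle at $offset and forks.
# The child re-opens the file by name and reads through its own handle.
# Returns the parent's offset afterwards.
sub parent_offset_with_private_child_handle {
    my ($offset) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);
    syswrite($fh, "0123456789") // die "write failed: $!";
    sysseek($fh, $offset, SEEK_SET) or die "seek failed: $!";

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        my $pos = sysseek($fh, 0, SEEK_CUR);             # "tell"
        close($fh);                                      # drop the shared handle
        open(my $own, '<', $path)     or POSIX::_exit(1);   # re-open by name
        sysseek($own, $pos, SEEK_SET) or POSIX::_exit(1);   # seek to same place
        sysread($own, my $buf, 3);                       # moves only the child's offset
        POSIX::_exit(0);
    }

    waitpid($pid, 0);
    die "child failed" if $?;

    return sysseek($fh, 0, SEEK_CUR) + 0;
}

print "parent offset: ", parent_offset_with_private_child_handle(2), "\n";
```

Because the child reads through a separate open file description, the parent's offset stays where it was left, in contrast to the shared-handle behaviour discussed earlier in the thread.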
