can't dup DATA? #369

Closed
p5pRT opened this issue Aug 11, 1999 · 19 comments

Comments

@p5pRT p5pRT commented Aug 11, 1999

Migrated from rt.perl.org#1204 (status was 'resolved')

Searchable as RT1204$

@p5pRT p5pRT commented Aug 11, 1999

From @vlbrown

If I do this

  #!/usr/bin/perl

  open (INPUT,"<&STDIN");
  while (<INPUT>) {
  print;
  }

it works. I can type input, and INPUT is a dup of STDIN. The while loop
is the moral equivalent of
  while (<STDIN>) {

but if I try
  #!/usr/bin/perl

  open (INPUT,"<&DATA");
  while (<INPUT>) {
  print;
  }

  __END__
  apple
  banana

When I run this, nothing is printed. Can I not dup DATA? Is this because it
is a "magic" filehandle (not a "real" filehandle)?

Can Perl be changed to allow dup'ing of the DATA filehandle?

Perl Info


Site configuration information for perl 5.00502:

Configured by vlb at Tue Dec  1 13:04:02 PST 1998.

Summary of my perl5 (5.0 patchlevel 5 subversion 2) configuration:
  Platform:
    osname=solaris, osvers=2.6, archname=sun4-solaris
    uname='sunos jeeves 5.6 generic_105181-03 sun4u sparc sunw,ultra-enterprise '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
  Compiler:
    cc='gcc', optimize='-O', gccversion=2.8.1
    cppflags='-I/usr/local/include'
    ccflags ='-I/usr/local/include'
    stdchar='unsigned char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -ldb -ldl -lm -lc -lcrypt
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'

Locally applied patches:



@INC for perl 5.00502:
    /usr/local/lib/perl5/5.00502/sun4-solaris
    /usr/local/lib/perl5/5.00502
    /usr/local/lib/perl5/site_perl/5.005/sun4-solaris
    /usr/local/lib/perl5/site_perl/5.005
    .


Environment for perl 5.00502:
    HOME=/export/home/vlb
    LANG (unset)
    LD_LIBRARY_PATH=/usr/usr2/oracle/product/8.0.5/lib
    LOGDIR (unset)

    PATH=/usr/local/sbin:/export/home/vlb/bin:/export/home/vlb/nib:/usr/local/bin:/opt/SUNWspro/bin:/usr/ccs/bin:/usr/ucb:/usr/bin:/bin:/etc:/sbin:/usr/sbin:/usr/openwin/bin:/usr/usr2/oracle/product/8.0.5/bin:/usr/dt/bin:/usr/local/genome/bin:/usr/games:.
    PERL_BADLANG (unset)
    SHELL=/usr/bin/tcsh
-----
 //=\   Vicki Brown <vlb@deltagen.com>
 \=//    Journeyman Sourcerer: Scripts & Philtres
  //=\
  \=//     Scientific Programming <> Perl, Unix, Mac
   //=\     A little Web gardening on the weekends
   \=//
    //=\      Deltagen, Inc; 1031 Bing St, San Carlos, CA 94070

@p5pRT p5pRT commented Aug 11, 1999

From [Unknown Contact. See original ticket]

"VB" == Vicki Brown <vlb@​deltagen.com> writes​:

VB> Can Perl be changed to allow dup'ing of the DATA filehandle?

Would you really want to allow that? Since the start of DATA is
not the start of the data. Yes, that could be true for other
filehandles, but

Interesting open(INPUT, "<&3") doesn't work either. (The 3 is from
fileno(DATA)). Something special about the fd?

<chaim>
--
Chaim Frenkel Nonlinear Knowledge, Inc.
chaimf@​pobox.com +1-718-236-0183

@p5pRT p5pRT commented Aug 11, 1999

From @mjdominus

Indeed, if the data after the __END__ tag is long enough, then Vicki's
example does print out most of it, omitting only a bit at the
beginning that was read into the DATA stdio buffer at program start
time.

@p5pRT p5pRT commented Aug 11, 1999

From @vlbrown

At 15​:07 -0400 8/11/99, Chaim Frenkel wrote​:

"VB" == Vicki Brown <vlb@​deltagen.com> writes​:

VB> Can Perl be changed to allow dup'ing of the DATA filehandle?

Would you really want to allow that?

Yes. :-)

Rationale (from the MacPerl mailing list)​:

I enter test data after the __END__ tag at script end and use it to test.
I was wondering if there was any way to map one file handle onto another
for testing. For example, I have a script with 'while (<INPUT>)' where
INPUT is the result of 'open INPUT, $inputfile;' I then have a couple of
spots where the script returns to the file top, etc. It would be handy to
change one line mapping INPUT to DATA during testing, rather than
switching every instance of INPUT to DATA.

I do this too. It's often easier to fake the data after the
__END__
section and get the script working on that, then move on to actually
reading from STDIN, or opening a file or a pipe or whathaveyou.

I was a tad surprised that DATA didn't seem to be duplicatable [sic] this
way, and didn't see any special caveats in the docs (OK, point me to the
part I missed :)... is this related to the inability to refer to the DATA
filehandle in a BEGIN{} block?
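
For the "change one line" use case described above, one option that avoids dup'ing altogether is a typeglob alias. The following is only a sketch under stated assumptions: SRC and the scratch file are hypothetical stand-ins so the example is self-contained; in a script with a data section the alias line would read `*INPUT = *DATA;`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for the test data after __END__.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "apple\n", "banana\n";
close $tmp or die "close: $!";

# SRC plays the role of DATA here (an assumption made for this sketch).
open(SRC, '<', $path) or die "open: $!";

# The one-line switch: INPUT becomes another name for the same handle.
*INPUT = *SRC;

# The rest of the script keeps reading from INPUT, unchanged.
my @lines;
while (my $line = <INPUT>) {
    push @lines, $line;
}
print @lines;
close SRC;
```

Note that an alias is not a duplicate: both names refer to one handle with one buffer and one position, so reading from INPUT also advances the aliased handle.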

Since the start of DATA is
not the start of the data. Yes, that could be true for other
filehandles, but

Well, I figure that if someone told Perl to allow it, they'd tell Perl
how to do it "correctly" :-)

Interesting open(INPUT, "<&3") doesn't work either. (The 3 is from
fileno(DATA)). Something special about the fd?

As usual, I could live with a change to the docs explaining why this cannot
be done :)
-- --
  |\ _,,,---,,_ Vicki Brown <vlb@​cfcl.com>
ZZZzz /,`.-'`' -. ;-;;,_ Journeyman Sourceror​: Scripts & Philtres
  |,4- ) )-,_. ,\ ( `'-' P.O. Box 1269 San Bruno CA 94066
  '---''(_/--' `-'\_) http​://www.cfcl.com/~vlb http​://www.macperl.com

@p5pRT p5pRT commented Aug 11, 1999

From [Unknown Contact. See original ticket]

Graham Barr <gbarr@​pobox.com> writes​:

This is a bug that needs to be fixed because

open(INPUT,"<&" . fileno(DATA)) or die "$!";
print <INPUT>;

__END__
1
2
3

will print nothing and the following does exactly what you want.

seek(DATA,0,1);
open(INPUT,"<&" . fileno(DATA)) or die "$!";
print <INPUT>;

This is a perfectly normal "dup'ing a stdio buffered handle" issue.

open(FOO,__FILE__);
my $first = <FOO>;
while (<FOO>)
{
  last if /^__(DATA|END)__$/;
}
open(INPUT,"<&FOO");

will have the same problem.

The DATA handle has been read so stdio has slurped (say) 8K of data into
its buffer - which is enough to consume moderate sized scripts + data.
Thus the underlying fd is at EOF.

If you fix it for DATA you should/will fix it for all handles ...
It is just a case of dup implying a PerlIO_seek(f,0,1) (with consequent $!
pollution on ttys etc.)
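
The buffering effect described above can be observed from Perl by comparing the handle's logical position with the descriptor's position. This is a minimal sketch, assuming a seekable regular file and a buffered I/O layer; the scratch file and variable names are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical two-line scratch file, small enough to fit in one buffer.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "Line 1\n", "Line 2\n";
close $tmp or die "close: $!";

open(my $fh, '<', $path) or die "open: $!";
my $first = <$fh>;    # buffered read: returns one line, may fetch far more

# Logical position: where the next <$fh> will continue from.
my $logical = tell($fh);

# Descriptor position: sysseek bypasses the buffer and reports the fd offset.
my $physical = sysseek($fh, 0, SEEK_CUR);

print "logical=$logical physical=$physical\n";
close $fh;
```

When the two numbers differ, a plain dup of the descriptor starts reading at the physical offset, which is why a short data section can appear empty through the dup'ed handle.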

--
Nick Ing-Simmons

@p5pRT p5pRT commented Aug 11, 1999

From @gbarr

This is a bug that needs to be fixed because

open(INPUT,"<&" . fileno(DATA)) or die "$!";
print <INPUT>;

__END__
1
2
3

will print nothing and the following does exactly what you want.

seek(DATA,0,1);
open(INPUT,"<&" . fileno(DATA)) or die "$!";
print <INPUT>;

__END__
1
2
3

This is probably a bug that needs fixing.

On Wed, Aug 11, 1999 at 01​:07​:29PM -0700, Vicki Brown wrote​:

Interesting open(INPUT, "<&3") doesn't work either. (The 3 is from
fileno(DATA)). Something special about the fd?

As usual, I could live with a change to the docs explaining why this cannot
be done :)

--
Since you're clearly mad as a mongoose, I'll bid you good-day.
  -- Edmund to Captain Rum : Black Adder II "Potato"

@p5pRT p5pRT commented Aug 11, 1999

From @mjdominus

Barr says​:

This is a bug that needs to be fixed

Patch enclosed. Someone with more experience should look at it and
make sure I didn't commit any terrible errors. Perl 5.5.57 does pass
all the tests, and it does fix Vicki's problem, as well as other
related problems such as​:

  #!/usr/bin/perl
  open F1, '/tmp/vb2' or die;
  print scalar <F1>; # Prints first line from file
  open F2, "<&F1" or die;
  print scalar <F2>; # Fails to print second line from file

Idea of patch: Call `seek' automatically to flush the buffer just
before duplicating the file descriptor.

--- doio.c 1999/06/10 23:11:05 1.1
+++ doio.c 1999/08/11 19:38:46
@@ -243,7 +243,10 @@
             goto say_false;
         }
         if (IoIFP(thatio)) {
-            fd = PerlIO_fileno(IoIFP(thatio));
+            PerlIO *fp = IoIFP(thatio);
+            /* Flush stdio buffer before dup */
+            PerlIO_seek(fp, 0, 1);
+            fd = PerlIO_fileno(fp);
             if (IoTYPE(thatio) == 's')
                 IoTYPE(io) = 's';
         }
--- t/io/dup.t 1999/08/11 19:43:50 1.1
+++ t/io/dup.t 1999/08/11 19:52:47
@@ -2,7 +2,7 @@
 
 # $RCSfile: dup.t,v $$Revision: 1.1 $$Date: 1999/08/11 19:43:50 $
 
-print "1..6\n";
+print "1..7\n";
 
 print "ok 1\n";
 
@@ -37,3 +37,16 @@
 unlink 'Io.dup';
 
 print STDOUT "ok 6\n";
+
+# 7 # 19990811 mjd@plover.com
+my ($out1, $out2) = ("Line 1\n", "Line 2\n");
+open(W, "> Io.dup") || die "Can't open stdout";
+print W $out1, $out2;
+close W;
+open(R1, "< Io.dup") || die "Can't read temp file";
+$in1 = <R1>;
+open(R2, "<&R1") || die "Can't dup";
+$in2 = <R2>;
+print "not " unless $in1 eq $out1 && $in2 eq $out2;
+print "ok 7\n";
+

@p5pRT p5pRT commented Aug 11, 1999

From @vlbrown

At 15​:31 -0400 8/11/99, Mark-Jason Dominus wrote​:

Indeed, if the data after the __END__ tag is long enough, then Vicki's
example does print out most of it, omitting only a bit at the
beginning that was read into the DATA stdio buffer at program start
time.

Gaah! You're right.

I see "long enough" as being 16297 characters in MacPerl (5.004), 1147
under Solaris (5.005_02) and 639 chars on my Redhat / MkLinux PPC system
(5.005_02).

I'm leaning toward bug :)
-- --
  |\ _,,,---,,_ Vicki Brown <vlb@​cfcl.com>
ZZZzz /,`.-'`' -. ;-;;,_ Journeyman Sourceror​: Scripts & Philtres
  |,4- ) )-,_. ,\ ( `'-' P.O. Box 1269 San Bruno CA 94066
  '---''(_/--' `-'\_) http​://www.cfcl.com/~vlb http​://www.macperl.com

@p5pRT p5pRT commented Aug 11, 1999

From @mjdominus

Patch enclosed.

I forgot to do perldelta.

--- pod/perldelta.pod 1999/08/11 19:58:16 1.3
+++ pod/perldelta.pod 1999/08/11 20:02:52
@@ -229,6 +229,14 @@
 buffering mishaps suffered by users unaware of how Perl internally
 handles I/O.
 
+=head2 Buffered data discarded from input filehandle when dup'ed.
+
+C<open(NEW, "E<lt>&OLD")> now discards any data that was previously
+read and buffered in C<OLD>. The next read operation on C<NEW> will
+return the same data as the corresponding operation on C<OLD>.
+Formerly, it would have returned the data from the start of the
+following disk block instead.
+
 =head1 Supported Platforms
 
 =over 4

@p5pRT p5pRT commented Aug 11, 1999

From [Unknown Contact. See original ticket]

Rationale (from the MacPerl mailing list)​:

I enter test data after the __END__ tag at script end and use it to test.
I was wondering if there was any way to map one file handle onto another
for testing. For example, I have a script with 'while (<INPUT>)' where
INPUT is the result of 'open INPUT, $inputfile;' I then have a couple of
spots where the script returns to the file top, etc. It would be handy to
change one line mapping INPUT to DATA during testing, rather than
switching every instance of INPUT to DATA.

I do this too. It's often easier to fake the data after the
__END__
section and get the script working on that, then move on to actually
reading from STDIN, or opening a file or a pipe or whathaveyou.

It's always been annoying that this doesn't work correctly​:

  % perl whateverscript "<&DATA"

on <ARGV> handling.

--tom

@p5pRT p5pRT commented Aug 11, 1999

From @mjdominus

> It's always been annoying that this doesn't work correctly:
>
>   % perl whateverscript "<&DATA"
>
> on <ARGV> handling.

My patch fixes that.

@p5pRT p5pRT commented Aug 12, 1999

From @jhi

The patch doesn't seem to work in Digital UNIX, the new io/dup subtest
fails. On the other hand, the patch seems to do no harm (no other failures).

This is what

print "in1 = '$in1', out1 = '$out1', in2 = '$in2', out2 = '$out2'\n";

outputs after the subtest #7.

in1 = 'Line 1
', out1 = 'Line 1
', in2 = '', out2 = 'Line 2
'

My guess is that calling PerlIO_seek(fp, 0, SEEK_CUR) doesn't flush.
Silly question of the day​: why not

  PerlIO_flush(fp);

instead of the seek()?

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

@p5pRT p5pRT commented Aug 12, 1999

From @mjdominus

> My guess is that calling PerlIO_seek(fp, 0, SEEK_CUR) doesn't flush.

Yeah.

> Silly question of the day: why not
> PerlIO_flush(fp);
> instead of the seek()?

No good reason; I think I had seek on the brain because of Graham's
message. Can you try flush() and see if it works on your side and I
will try it here too and if it works in both places I will amend and
resubmit the patch.

Thanks.

@p5pRT p5pRT commented Aug 12, 1999

From @jhi

Mark-Jason Dominus writes:

> > My guess is that calling PerlIO_seek(fp, 0, SEEK_CUR) doesn't flush.
>
> Yeah.
>
> > Silly question of the day: why not
> > PerlIO_flush(fp);
> > instead of the seek()?
>
> No good reason; I think I had seek on the brain because of Graham's
> message. Can you try flush() and see if it works on your side and I

I tried it already. It works.

> will try it here too and if it works in both places I will amend and
> resubmit the patch.

No need to resubmit the patch; just confirm whether it works for you
and I'll check in my change.

> Thanks.

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

@p5pRT p5pRT commented Aug 12, 1999

From @jhi

Nick Ing-Simmons writes:

> Jarkko Hietaniemi <jhi@iki.fi> writes:
>
> > Silly question of the day: why not
> >
> > PerlIO_flush(fp);
> >
> > instead of the seek()?
>
> Because where PerlIO is stdio fflush() may not do anything useful
> on handles open for read.

But neither does seek(), it seems. Shall we do both? And if
that does not help, sacrifice a chicken and perform the Shamanistic
Ritual #17b?

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

@p5pRT p5pRT commented Aug 12, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

Silly question of the day​: why not

PerlIO_flush(fp);

instead of the seek()?

Because where PerlIO is stdio fflush() may not do anything useful
on handles open for read.

--
Nick Ing-Simmons <nik@​tiuk.ti.com>
Via, but not speaking for​: Texas Instruments Ltd.

@p5pRT p5pRT commented Aug 12, 1999

From @jhi

Nick Ing-Simmons writes:

> My only concern is that some stdio somewhere will complain about
> flush() on a read handle.
>
> > And if
> > that does not help, sacrifice a chicken and perform the Shamanistic
> > Ritual #17b?
>
> Perhaps the correct fix is:
> PerlIO_flush(src);
> PerlIO_seek(dst,PerlIO_tell(src),0);
>
> Although that perhaps should be getpos/setpos to handle REC files
> and/or large files.

Okay, *now* I want an updated patch...

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen

@p5pRT p5pRT commented Aug 12, 1999

From [Unknown Contact. See original ticket]

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

Nick Ing-Simmons writes​:

Jarkko Hietaniemi <jhi@​iki.fi> writes​:

Silly question of the day​: why not

PerlIO_flush(fp);

instead of the seek()?

Because where PerlIO is stdio fflush() may not do anything useful
on handles open for read.

But neither does seek(), it seems.

An understandable optimization - if stdio is not bothered about dups.

Shall we do both?

My only concern is that some stdio somewhere will complain about
flush() on a read handle.

And if
that does not help, sacrifice a chicken and perform the Shamanistic
Ritual #17b?

Perhaps the correct fix is :
  PerlIO_flush(src);
  PerlIO_seek(dst,PerlIO_tell(src),0);

Although that perhaps should be getpos/setpos to handle REC files
and/or large files.
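
At the script level, a rough analogue of the idea above (take the source's logical position, then position the duplicate explicitly) might look like the sketch below. It is not the C-level patch; it assumes a seekable regular file, and the scratch file and variable names are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical scratch file mirroring the two-line test from the patch.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "Line 1\n", "Line 2\n";
close $tmp or die "close: $!";

open(my $src, '<', $path) or die "open: $!";
my $in1 = <$src>;                  # buffered read of the first line

# Remember where the source handle logically is before duplicating it.
my $pos = tell($src);

# Duplicate the descriptor, then move the duplicate to that position
# explicitly instead of relying on what the dup happens to inherit.
open(my $dst, '<&', $src) or die "dup: $!";
seek($dst, $pos, SEEK_SET) or die "seek: $!";

my $in2 = <$dst>;                  # continues where $src left off
print $in1, $in2;
```

Because a dup'ed descriptor shares its file offset with the original, later reads on the source handle can be affected once its buffer runs out, and none of this applies to ttys or pipes, where seeking fails.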

--
Nick Ing-Simmons <nik@​tiuk.ti.com>
Via, but not speaking for​: Texas Instruments Ltd.

@p5pRT p5pRT commented Aug 12, 1999

From @jhi

It seems that PerlIO_seek() doesn't flush in IRIX 6.5, either.

--
$jhi++; # http​://www.iki.fi/jhi/
  # There is this special biologist word we use for 'stable'.
  # It is 'dead'. -- Jack Cohen
