Rakudo cannot read lines from stdin #860

Closed
gerdr opened this Issue Oct 14, 2012 · 7 comments

Contributor

gerdr commented Oct 14, 2012

GDB session showing the issue:

$ gdb ./perl6
[...]
(gdb) break Parrot_io_readline_s
Function "Parrot_io_readline_s" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (Parrot_io_readline_s) pending.
(gdb) run -e '$*IN.get'
Starting program: /devel/rakudo/install/bin/perl6 -e '$*IN.get'
[New Thread 4116.0xf64]
[New Thread 4116.0x102c]

Breakpoint 1, Parrot_io_readline_s (interp=0x80039b18, handle=0x800ee614,
    terminator=0x81245a54) at src/io/api.c:940
940 ASSERT_ARGS(Parrot_io_readline_s)
(gdb) print terminator->strstart
$1 = 0xfe689d48 "1745786907"

"1745786907" is not a legitimate line terminator.

The problem occurs when invoking the PIR directly from Rakudo as well

$ gdb ./perl6
[...]
(gdb) break Parrot_io_readline_s
Function "Parrot_io_readline_s" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (Parrot_io_readline_s) pending.
(gdb) run -e 'pir::print__is(pir::getstdin__P.readline)'
Starting program: /devel/rakudo/install/bin/perl6 -e 'pir::print__is(pir::getstdin__P.readline)'
[New Thread 4956.0xf28]
[New Thread 4956.0xe70]

Breakpoint 1, Parrot_io_readline_s (interp=0x80039ad0, handle=0x800ee5cc,
    terminator=0x812523e4) at src/io/api.c:940
940 ASSERT_ARGS(Parrot_io_readline_s)
(gdb) print terminator->strstart
$1 = 0xfe68946c "cuid_48_1350162742.96173_nfa"

but not when using NQP

$ gdb ./nqp
[...]
(gdb) break Parrot_io_readline_s
Function "Parrot_io_readline_s" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (Parrot_io_readline_s) pending.
(gdb) run -e 'pir::print__is(pir::getstdin__P.readline)'
Starting program: /devel/rakudo/install/bin/nqp -e 'pir::print__is(pir::getstdin__P.readline)'
[New Thread 4020.0x1274]
[New Thread 4020.0xedc]

Breakpoint 1, Parrot_io_readline_s (interp=0x80039ac0, handle=0x800ee5bc,
    terminator=0x800804a8) at src/io/api.c:940
940 ASSERT_ARGS(Parrot_io_readline_s)
(gdb) print terminator->strstart
$1 = 0x6f9e7570 "\n"

or PIR

$ cat readline.pir
.sub test :main
    .local pmc stdin
    .local string line
    stdin = getstdin
    line = stdin.'readline'()
    print line
.end

$ gdb ./parrot
[...]
(gdb) break Parrot_io_readline_s
Function "Parrot_io_readline_s" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (Parrot_io_readline_s) pending.
(gdb) run readline.pir
Starting program: /devel/rakudo/install/bin/parrot readline.pir
[New Thread 1076.0x534]
[New Thread 1076.0xf30]

Breakpoint 1, Parrot_io_readline_s (interp=0x80039ad0, handle=0x800ee5cc,
    terminator=0x800804b8) at src/io/api.c:940
940 ASSERT_ARGS(Parrot_io_readline_s)
(gdb) print terminator->strstart
$1 = 0x6f9e7570 "\n"

While STDIN gets initialized correctly, somehow the record_separator (and possibly other fields) in the Parrot_Handle_attributes structure gets overwritten:

$ gdb ./perl6
[...]
(gdb) break Parrot_api_run_bytecode
Breakpoint 1 at 0x401778
(gdb) break Parrot_io_readline_s
Function "Parrot_io_readline_s" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 2 (Parrot_io_readline_s) pending.
(gdb) run -e 'pir::print__is(pir::getstdin__P.readline)'
Starting program: /devel/rakudo/install/bin/perl6 -e 'pir::print__is(pir::getstdin__P.readline)'
[New Thread 2752.0x125c]
[New Thread 2752.0x9a4]

Breakpoint 1, 0x00401778 in Parrot_api_run_bytecode ()
(gdb) s
Single stepping until exit from function Parrot_api_run_bytecode,
which has no line number information.
Parrot_api_run_bytecode (interp_pmc=0x800e8978, pbc=0x800b8540,
    args=0x800f660c) at src/embed/bytecode.c:150
150 {
(gdb) print ((Parrot_Handle_attributes*)((Parrot_Interp*)interp_pmc->data)->piodata->table[0]->data)
$1 = (Parrot_Handle_attributes *) 0x8009d3d8
(gdb) print ((Parrot_Handle_attributes*)((Parrot_Interp*)interp_pmc->data)->piodata->table[0]->data)->record_separator
$2 = (STRING *) 0x800805a8
(gdb) print ((Parrot_Handle_attributes*)((Parrot_Interp*)interp_pmc->data)->piodata->table[0]->data)->record_separator->strstart
$3 = 0x6f9e7570 "\n"
(gdb) c
Continuing.

Breakpoint 1, Parrot_api_run_bytecode (interp_pmc=0x800e8978, pbc=0x800b8540,
    args=0x800f660c) at src/embed/bytecode.c:151
151 ASSERT_ARGS(Parrot_api_run_bytecode)
(gdb) c
Continuing.

Breakpoint 2, Parrot_io_readline_s (interp=0x80039bc0, handle=0x800ee6bc,
    terminator=0x81250824) at src/io/api.c:940
940 ASSERT_ARGS(Parrot_io_readline_s)
(gdb) print ((Parrot_Handle_attributes*)interp->piodata->table[0]->data)
$4 = (Parrot_Handle_attributes *) 0x8009d3d8
(gdb) print ((Parrot_Handle_attributes*)interp->piodata->table[0]->data)->record_separator
$5 = (STRING *) 0x81250824
(gdb) print ((Parrot_Handle_attributes*)interp->piodata->table[0]->data)->record_separator->strstart
$6 = 0x82fdbe64 "-763724221"

It is not clear if the problem lies with the Rakudo or Parrot codebase as downgrading to Parrot 4.4.0 fixes (or at least hides) the issue.
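
To illustrate why a corrupted record separator stops line reading from working, here is a minimal C sketch. It is not Parrot's implementation: `read_record` is a hypothetical helper invented for this illustration, and it only assumes that a reader scans the input for the separator string and returns everything up to and including the first match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Parrot's API: copy one record from
 * `input` into `out`, ending at (and including) the first occurrence of
 * `separator`. If the separator never occurs, the entire input is
 * returned as a single record. Returns the record length in bytes. */
size_t read_record(const char *input, const char *separator,
                   char *out, size_t out_size)
{
    const char *hit;
    size_t len;

    if (out_size == 0)
        return 0;

    hit = strstr(input, separator);
    len = hit ? (size_t)(hit - input) + strlen(separator) : strlen(input);

    /* Truncate to the caller's buffer, leaving room for the NUL byte. */
    if (len >= out_size)
        len = out_size - 1;

    memcpy(out, input, len);
    out[len] = '\0';
    return len;
}
```

Called with the input "first\nsecond\n", a separator of "\n" yields the record "first\n", while a garbage separator such as "1745786907" never matches and the whole input comes back as one record. That is consistent with a read that does not stop at the newline.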

pmichaud Oct 14, 2012

Member

Could we get a few more details about the environment in which the above tests are being run? When I do the above on my system (Kubuntu 12.04.1, 64-bit) I don't see the record separator changing.

Also, would it be possible to set a GDB watchpoint on ((Parrot_Handle_attributes*)interp->piodata->table[0]->data)->record_separator, and give a backtrace? Then we might be able to find out what is causing the record separator to be changed.

Thanks for the excellent debugging!

Pm

pmichaud Oct 14, 2012

Member

Try the following in NQP on your system and see if the record separator changes unacceptably:

(gdb) run -e 'my $IN := pir::getstdin__P; $IN.encoding("utf8"); pir::print__is($IN.readline);'

Pm

gerdr Oct 14, 2012

Contributor

I'm on Cygwin, which I did not mention before as I (apparently incorrectly?) believed that this bug ($*IN.get not terminating on \n) was well-known and one of the reasons why Rakudo does not use current Parrot.

Playing around some more with the debugger, this appears to be a GC bug: The string header gets recycled while still referenced from the attribute structure.

Perhaps you'll have better luck reproducing the bug by running something more involved. I first stumbled upon it when I ran k-nucleotide.p6.pl from perl6-examples/shootout:

$ perl6 k-nucleotide.p6.pl <k-nucleotide.input


  0     ggt
  0     ggta
  0     ggtatt
  0     ggtattttaatt
  0     ggtattttaatttatagt

is not the expected output.

pmichaud Oct 14, 2012

Member

On Sun, Oct 14, 2012 at 11:00:42AM -0700, Gerhard R. wrote:

> I'm on Cygwin, which I did not mention before as I (apparently incorrectly?)
> believed that this bug ($*IN.get not terminating on \n) was well-known and one
> of the reasons why Rakudo does not use current Parrot.

It's a well-known bug, yes, but it's only manifesting on some systems.
I haven't been able to reproduce it in any of my environments yet.

> Playing around some more with the debugger, this appears to be a GC bug: The
> string header gets recycled while still referenced from the attribute
> structure.

How much memory on that system? That would indicate how often GC is occurring.

AFAICT, in looking at handle.pmc I can't find where the 'record_separator'
and 'encoding' fields ever get marked during GC. (All of the other PMCs I've
looked at containing STRING attributes have explicit mark() vtables and
explicit calls to Parrot_gc_mark_STRING_alive().)

Pm
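
The failure mode suspected in this exchange (a string header recycled because the attribute that references it is never marked) can be sketched with a toy mark-and-sweep pool in C. This is an illustration under stated assumptions, not Parrot's collector: `ToyString`, `ToyHandle`, `toy_alloc`, `toy_mark_handle`, `toy_sweep` and `toy_demo` are all hypothetical names, and the `mark_attrs` flag merely stands in for whether a mark() routine calls something like Parrot_gc_mark_STRING_alive() on the field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4

/* Toy string header: `in_use` says whether the slot is allocated,
 * `live` is the mark bit set during the mark phase. */
typedef struct {
    char text[32];
    int  in_use;
    int  live;
} ToyString;

/* Toy handle holding a reference to a string header, loosely analogous
 * to a record_separator attribute. */
typedef struct {
    ToyString *record_separator;
} ToyHandle;

static ToyString pool[POOL_SIZE];

/* Allocate a header from the pool, reusing the first free slot. */
ToyString *toy_alloc(const char *text)
{
    size_t i;
    for (i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].live = 0;
            strncpy(pool[i].text, text, sizeof pool[i].text - 1);
            pool[i].text[sizeof pool[i].text - 1] = '\0';
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase for a handle. With mark_attrs == 0 the string attribute is
 * skipped, which models the missing mark suspected in the thread. */
void toy_mark_handle(ToyHandle *h, int mark_attrs)
{
    if (mark_attrs && h->record_separator)
        h->record_separator->live = 1;
}

/* Sweep phase: free every unmarked header, then clear all mark bits. */
void toy_sweep(void)
{
    size_t i;
    for (i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].live)
            pool[i].in_use = 0;
        pool[i].live = 0;
    }
}

/* Run one collection, allocate an unrelated string, and return the text
 * that the handle's separator points to afterwards. */
const char *toy_demo(int mark_attrs)
{
    ToyHandle h;

    memset(pool, 0, sizeof pool);
    h.record_separator = toy_alloc("\n");
    toy_mark_handle(&h, mark_attrs);
    toy_sweep();
    toy_alloc("NOT_A_SEPARATOR");   /* may reuse a freed header */
    return h.record_separator->text;
}
```

In this model, `toy_demo(1)` leaves the separator as "\n", while `toy_demo(0)` frees the header during the sweep and the next allocation overwrites it, so the handle ends up pointing at "NOT_A_SEPARATOR". Whether the real bug follows exactly this path is not established by the thread; the sketch only shows why an unmarked but still referenced header is dangerous.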

gerdr Oct 21, 2012

Contributor

Here's some PIR which triggers the bug on my system:

$ cat rs.pir
.sub 'printrs'
    .param pmc handle
    .local string rs

    rs = handle.'record_separator'()
    print "record separator is [[["
    print rs
    print "]]]\n"

.end

.sub 'main' :main
    .local pmc stdin

    stdin = getstdin
    printrs(stdin)

    # all of these are necessary to trigger the bug
    stdin.'encoding'('utf8')
    sweep 1
    stdin.'readline'()

    printrs(stdin)

.end

$ echo NOT_A_SEPARATOR | install/bin/parrot rs.pir
record separator is [[[
]]]
record separator is [[[NOT_A_SEPARATOR
]]]

luben Oct 21, 2012

Could you test 3c558c8?

On my PC it works as expected with the minimal PIR test case provided here.

gerdr Oct 21, 2012

Contributor

I just rebuilt Parrot/NQP/Rakudo. All test cases mentioned here now pass on my machine.

@gerdr gerdr closed this Oct 21, 2012
