pbc_disassemble fails on large PBCs #326

Open
plobsing opened this Issue · 3 comments

4 participants

@plobsing
Collaborator

pbc_disassemble complains about encoding or flat-out segfaults when dealing with large PBC files. Good test cases are perl6.pbc from rakudo, nqp-rx.pbc, etc.

> /home/pitr/parrot-trunk/bin/pbc_disassemble perl6.pbc
zsh: segmentation fault  /home/pitr/parrot-trunk/bin/pbc_disassemble perl6.pbc
> ./pbc_disassemble parrot-nqp.pbc
...
000000000750-000000002197 000707:   index_i_sc_s I8,"\t Invalid character for UTF-8 encoding

current instr.: 'parrot;NQP;Compiler;main' pc 76066 (ext/nqp-rx/src/stage0/NQP-s0.pir:20736)

Originally http://trac.parrot.org/parrot/ticket/1557

@Util
Owner

Here is a partial analysis, tested on r45670:

I don't think that this problem is really about the size of the .pbc file.

This problem occurs with *any* PBC built from PIR containing Unicode literal strings. Only large PIR modules happen to contain Unicode literals right now, hence the appearance that PBC size is an element of the problem.

For example, the following code compiles to .pbc and executes correctly, but fails to disassemble:

$ cat unicode_minimal_crash.pir

.sub _main
    $S0 = 'd'
    $I0 = index unicode:"abc\x{a0}def", $S0
    print "The answer is "
    say $I0
    end
.end

$ ./parrot -o unicode_minimal_crash.pbc unicode_minimal_crash.pir

$ ./parrot unicode_minimal_crash.pbc

The answer is 4

$ ./pbc_disassemble unicode_minimal_crash.pbc

=head1 Constant-table
PMC_CONST(0): 'ParrotInterpreter'
PMC_CONST(1): abc def
PMC_CONST(2): d
PMC_CONST(3): The answer is
PMC_CONST(4): unicode_minimal_crash.pir
PMC_CONST(5):
PMC_CONST(6): _main
=cut
#   Seq_Op_Num- Relative-PC SrcLn#:
# Current Source Filename 'unicode_minimal_crash.pir'
000000000000-000000000000 000003:   set_s_sc S0,"d"
000000000001-000000000003 000004:   index_i_sc_s I0,"abcInvalid character for UTF-8 encoding

@parrot

The problem is that the string is converted to a C string and later sent char by char to the output file, which is an encoding nightmare. To fix the problem, the C string usage must be avoided. In the meantime, a workaround was added in r45831.
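
To illustrate why char-by-char output breaks on this input, here is a minimal Python sketch. It is an analogy only, not Parrot's disassembler code: the helper names `first_bad_offset` and `emit_whole` are hypothetical, and it assumes the string constant is stored as UTF-8. It takes the literal from the minimal test case and shows that handling one byte at a time fails at the first multi-byte character, while handling the string as a whole does not.

```python
# Analogy only: hypothetical helpers, not taken from Parrot's source.
# Assumes the string constant from the test case is held as UTF-8 bytes.

text = "abc\u00a0def"        # the literal unicode:"abc\x{a0}def"
utf8 = text.encode("utf-8")  # U+00A0 becomes the two bytes 0xC2 0xA0


def first_bad_offset(data: bytes):
    """Treat every byte as a complete character, the way a char-by-char
    writer does, and return the offset of the first byte that is not
    valid UTF-8 on its own (None if every byte is)."""
    for i in range(len(data)):
        try:
            data[i:i + 1].decode("utf-8")
        except UnicodeDecodeError:
            return i
    return None


def emit_whole(data: bytes) -> str:
    """Decode the complete byte sequence in one step."""
    return data.decode("utf-8")


if __name__ == "__main__":
    print("byte-wise handling fails at offset:", first_bad_offset(utf8))
    print("whole-string handling gives:", repr(emit_whole(utf8)))
```

The byte-wise path stops right after "abc", which matches where the disassembler output above is cut off by "Invalid character for UTF-8 encoding".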

@coke
Owner

Current failure mode:

$ pbc_disassemble perl6.pbc
Could not load oplib `io_ops'