[docs/compiler_overview.pod] update for new master branch completed
Martin Berends committed Feb 19, 2010
1 parent 3704a2e commit 3b869ce
Showing 1 changed file with 90 additions and 52 deletions.
142 changes: 90 additions & 52 deletions docs/compiler_overview.pod
@@ -33,8 +33,8 @@ Action methods build a Parrot Abstract Syntax Tree (F<Perl6/Actions.pm>)

=item 5.

Parrot extensions provide Perl 6 run time behavior (TODO: describe)
(F<binder/*>, F<ops/*>, F<pmc/*>)
Parrot extensions provide Perl 6 run time behavior (F<ops/perl6.ops>,
F<pmc/*.pmc>, F<binder/*>)

=item 6.

@@ -46,8 +46,8 @@ F<core/*.pm>, F<glue/*.pir>, F<metamodel/*>)
The F<Makefile> (generated from F<build/Makefile.in> by
F<../Configure.pl>) compiles all the parts to form the F<perl6.pbc>
executable and the F<perl6> or F<perl6.exe> "fake executable". We call
it fake because it has only a small stub of code to launch the Parrot
executable, and passes itself as a chunk of bytecode for Parrot to
it fake because it has only a small stub of code to start the Parrot
virtual machine, and passes itself as a chunk of bytecode for Parrot to
execute. The source code of the "fakecutable" is generated as
F<perl6.c> with the stub at the very end. The entire contents of
F<perl6.pbc> are represented as escaped octal characters in one huge
@@ -65,10 +65,10 @@ includes a powerful Perl 6 regex engine. This gives a streamlined
compiler framework on which to build a very functional Perl 6
implementation.

NQP itself is also written in PIR. It is an important part of the Parrot
Compiler Toolkit (PCT) and is installed with Parrot. PCT is a standard
framework to make and use Parrot based languages. The source code of
NQP is in F<../parrot/ext/nqp-rx/> and the resulting compiler is
NQP itself is also written in PIR. It is an important part of the
Parrot Compiler Toolkit (PCT) and is installed with Parrot. PCT is a
standard framework to create and use Parrot based languages. The source
code of NQP is in F<../parrot/ext/nqp-rx/> and the resulting compiler is
F<../parrot_install/bin/parrot-nqp>. Note, NQP only I<builds> the
Rakudo compiler, and does not compile or run user programs.

@@ -82,9 +82,10 @@ added.

The "stage-1" compiler (note: not NQP) compiles all Rakudo's Perl 6 code
again, this time including all the library modules (F<gen/core.pm>), to
make F<perl6.pbc> (note: not in F<gen/>). That F<gen/core.pm> file is
generated by F<build/gen_core_pm.pl> from a list called C<CORE_SOURCES>
in F<Makefile>. Thanks to the staging process, a large and growing
make F<perl6.pbc> which could be called "stage-2" (note: not in
F<gen/>). That F<gen/core.pm> file is generated by
F<build/gen_core_pm.pl> from a list called C<CORE_SOURCES> in
F<Makefile>. Thanks to the staging process, a large and growing
proportion of Rakudo's source code is written in Perl 6.

We can conceivably use the Rakudo compiler to compile itself to PIR and
@@ -156,7 +157,8 @@ For example, here's the parse rule for Rakudo's C<unless> statement

This token says that an C<unless> statement consists of the word
"unless" (captured into C<< $<sym> >>), and then an expression followed
by a block.
by a block. The C<.panic:> is a typical "Awesome" error message and the
syntax is almost exactly the same as in F<STD.pm>, described below.

Remember that for a match, not only must the C<< <sym> >> match the word
C<unless>, the C<< <xblock> >> must also match the C<xblock> token. If
@@ -221,53 +223,89 @@ statement that was just parsed.
The Parrot Compiler Toolkit provides a wide variety of PAST node types
for representing the various components of a HLL program -- for more
details about the available node types, see PDD 26
(L<http://svn.parrot.org/parrot/trunk/docs/pdds/pdd26_ast.pod>).

(
L<http://svn.parrot.org/parrot/trunk/docs/pdds/pdd26_ast.pod> or
L<http://docs.parrot.org/parrot/latest/html/docs/pdds/pdd26_ast.pod.html>
).
The PAST representation is the final stage of processing in Rakudo
itself. The PAST data structure is then passed on to Parrot directly.
Parrot does the remainder of the work translating from PAST to pir and
then to bytecode.
itself, and is given to Parrot directly. Parrot does the remainder of
the work translating from PAST to PIR and then to bytecode.

=head2 5. Parrot extensions

F<Perl6/Compiler.pir> has three C<.loadlib> commands early on, for
C<perl6_group>, C<perl6_ops> and C<math_ops>. All three dynamically
extend Parrot with respectively Rakudo specific PMC's (Poly Morphic
Containers, formerly Parrot Magic Cookies), opcodes, and mathematical
operators. The source is in F<pmc/*>, F<ops/*> and
F<parrot/src/ops/math.ops>.

(F<binder/*>)

(F<ops/*>)

(F<pmc/*>)

--- ng update progress point

Lastly, the F<src/parser/quote_expression.pir> file implements
code to parse the various forms of Perl 6 quoting rules. It's
far easier to write this component using PIR instead of a
regular expression, but otherwise it acts just like any other
rule in the grammar.
Rakudo extends the Parrot virtual machine dynamically (i.e. at run
time), adding 14 dynamic opcodes ("dynops") which are additional virtual
machine code instructions, and 9 dynamic PMCs ("dynpmcs") (PolyMorphic
Container, remember?) which are Parrot's equivalent of class
definitions.

The dynops source is in F<ops/perl6.ops>, which looks like C, apart from
some Perlish syntactic sugar.
The F<../parrot_install/lib/x.y.z-devel/tools/build/ops2c.pl> script
desugars that to F<build/perl6.c> which your C compiler turns into a
library.

For this overview, the opcode names and parameters might give a vague
idea what they're about:

rakudo_dynop_setup()
rebless_subclass(in PMC, in PMC)
find_lex_skip_current(out PMC, in STR)
x_is_uprop(out INT, in STR, in STR, in INT)
get_next_candidate_info(out PMC, out PMC, out PMC)
transform_to_p6opaque(inout PMC)
deobjectref(out PMC, in PMC)
descalarref(out PMC, in PMC)
allocate_signature(out PMC, in INT)
get_signature_size(out INT, in PMC)
set_signature_elem(in PMC, in INT, in STR, in INT, inout PMC,
inout PMC, inout PMC, inout PMC, inout PMC, inout PMC, in STR)
get_signature_elem(in PMC, in INT, out STR, out INT, out PMC, out PMC,
out PMC, out PMC, out PMC, out PMC, out STR)
bind_signature(in PMC)
x_setprophash(in PMC, in PMC)
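One way to read the listing above: each parameter is a direction (C<in>,
C<out> or C<inout>) followed by a Parrot register type (C<PMC>, C<STR> or
C<INT>). The following is only a reading aid written in Python, not part
of Rakudo or Parrot; the function name C<parse_op_signature> is
hypothetical and it assumes signatures are written exactly as in the
listing.

```python
import re


def parse_op_signature(signature):
    """Split 'name(dir TYPE, dir TYPE, ...)' into (name, [(dir, TYPE), ...]).

    Hypothetical helper for reading the opcode listing; it does not
    reflect how ops2c.pl itself parses F<ops/perl6.ops>.
    """
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", signature, re.DOTALL)
    if match is None:
        raise ValueError("not an opcode signature: %r" % signature)
    name, param_text = match.groups()

    params = []
    for chunk in param_text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # an empty parameter list, as in rakudo_dynop_setup()
        direction, register_type = chunk.split()
        params.append((direction, register_type))
    return name, params


if __name__ == "__main__":
    print(parse_op_signature("x_is_uprop(out INT, in STR, in STR, in INT)"))
```

Read this way, C<x_is_uprop> takes two strings and an integer and hands
back an integer result, while C<rakudo_dynop_setup> takes no parameters
at all.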

The dynamic PMCs are in F<pmc/*.pmc>, one file per class. The language
is again almost C, but with other sugary differences this time, for
example definitions like C<group perl6_group> whose purpose will appear
shortly.
The F<../parrot_install/lib/x.y.z-devel/tools/build/pmc2c.pl> script
converts the sugar to something your C compiler understands.

For a rough idea what these classes are for, here are the names:
P6Invocation P6LowLevelSig MutableVAR Perl6Scalar ObjectRef P6role
Perl6MultiSub Perl6Str and P6Opaque.

=head3 Binder

The dynops and the dynpmcs call a utility routine called a signature
binder, via a function pointer called C<bind_signature_func>. A binder
matches parameters passed by callers of subs, methods and other code
blocks, to the lexical names used internally. Parrot has a flexible set
of calling conventions, but the Perl 6 permutations of arity, multiple
dispatch, positional and named parameters, with constraints, defaults,
flattening and slurping need a higher level of operation. The answer
lies in F<binder/bind.c> which is compiled into C<perl6_ops> and
C<perl6_group> libraries. Read
L<http://use.perl.org/~JonathanWorthington/journal/39772> for a more
detailed explanation of the binder.
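
To make the idea of binding concrete, here is a deliberately simplified
sketch in Python. It is not the algorithm in F<binder/bind.c>; the
function name, the dictionary layout of a parameter and the C<kind>
values are all assumptions made for illustration. It only shows the
general shape of the job: walk the signature, take each value from the
positional or named arguments, fall back to a default, check a type
constraint, and store the result under the lexical name.

```python
def bind_signature(params, positional, named):
    """Bind call arguments to lexical names (illustrative sketch only).

    params     -- list of dicts: {'name', 'kind', optional 'default', 'type'}
                  where kind is 'positional', 'named' or 'slurpy'
    positional -- list of positional argument values
    named      -- dict of named argument values
    Returns a dict standing in for the callee's lexical pad.
    """
    lexpad = {}
    remaining = list(positional)   # copies, so the caller's data is untouched
    leftover_named = dict(named)

    for param in params:
        name, kind = param["name"], param["kind"]

        if kind == "slurpy":
            # A slurpy parameter swallows every positional still unbound.
            lexpad[name] = remaining
            remaining = []
            continue

        if kind == "positional" and remaining:
            value = remaining.pop(0)
        elif kind == "named" and name in leftover_named:
            value = leftover_named.pop(name)
        elif "default" in param:
            value = param["default"]
        else:
            raise TypeError("missing argument for parameter %s" % name)

        constraint = param.get("type")
        if constraint is not None and not isinstance(value, constraint):
            raise TypeError("type check failed for parameter %s" % name)

        lexpad[name] = value

    if remaining or leftover_named:
        raise TypeError("too many arguments passed")
    return lexpad


if __name__ == "__main__":
    signature = [
        {"name": "$x", "kind": "positional", "type": int},
        {"name": "$y", "kind": "positional", "default": 10},
        {"name": "@rest", "kind": "slurpy"},
        {"name": "verbose", "kind": "named", "default": False},
    ]
    print(bind_signature(signature, [1, 2, 3, 4], {"verbose": True}))
```

The real binder also has to cope with multiple dispatch, flattening and
the other cases listed above, which this sketch leaves out.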

F<Perl6/Compiler.pir> has three C<.loadlib> commands early on. The
C<perl6_group> loads the 9 PMCs, the C<perl6_ops> does the 14 dynops,
and the C<math_ops> adds over 30 mathematical operators such as C<add>,
C<sub>, C<mul>, C<div>, C<sin>, C<cos>, C<sqrt>, C<log10> etc. (source
in F<parrot/src/ops/math.ops>)

=head2 6. Builtin functions and runtime support

The last component of the compiler are the various builtin
functions and libraries that a Perl 6 program expects to
have available when it is running. These include functions
for the basic operations (C<< infix:<+> >>, C<< prefix:<abs> >>)
as well as common global functions such as C<say> and C<print>.

(F<builtins/*.pir>)

(F<cheats/*>)

(F<core/*.pm>)

(F<glue/*.pir>)
The last component of the compiler is the set of builtin functions and
libraries that a Perl 6 program expects to have available when it is
running. These include functions for the basic operations
(C<< infix:<+> >>, C<< prefix:<abs> >>) as well as common global
functions such as C<say> and C<print>.

(F<metamodel/*>)
The stage-1 compiler compiles all of these and they become part of the
final F<perl6.pbc>. The source code is in F<builtins/*.pir>,
F<cheats/*>, F<core/*.pm>, F<glue/*.pir> and F<metamodel/*>.

=head2 Still to be documented

@@ -282,7 +320,7 @@ maintainer of Rakudo. The other contributors and named in F<CREDITS>.

=head1 COPYRIGHT

Copyright (C) 2007-2009, The Perl Foundation.
Copyright (C) 2007-2010, The Perl Foundation.

=cut
