
First draft of a perl 6 parser in p6 rules and pir.

git-svn-id: http://svn.perl.org/parrot/trunk/languages/perl6@12013 d31e2699-5ff4-0310-a27c-f18f2fbe73fe
0 parents commit 1b6cbf6d51b787556f2b47cd55dbf7b00fde04c0 @pmichaud pmichaud committed Mar 24, 2006
Showing with 510 additions and 0 deletions.
  1. +78 −0 README
  2. +119 −0 lib/grammar.g
  3. +203 −0 lib/parse.pir
  4. +67 −0 p6shell.pir
  5. +43 −0 perl6.pir
78 README
@@ -0,0 +1,78 @@
+=head1 Perl 6 parser
+
+This is a Perl 6 parser, an early version (no version numbers yet).
+
+I know you're probably looking for the Perl 6 compiler, but this
+isn't it, at least not yet. Eventually it probably will be, but
+for the moment it's just the parser. Once we finish up TGE
+(compilers/tge) and code generation, this will probably
+become the Perl 6 compiler.
+
+(If you're in a hurry to write Perl 6 programs, you might try
+looking at PUGS -- http://www.pugscode.org. Or, help us build
+the compiler here -- read on!)
+
+However, even though this is not a complete compiler yet, you can
+still see how Perl 6 programs are parsed, help us create test
+cases, and extend/improve the grammar to cover more of Perl 6.
+Here's how to do it:
+
+=head2 Compiling
+
+The Perl 6 parser lives in the perl6.pbc file. To create this
+file, simply issue the command
+
+ $ make
+
+This directory comes with F<p6shell.pir>, which is a simple Parrot
+script designed to exercise the parser, both on static input files
+and interactively via command line input. To run the parser
+on a Perl 6 input file named "foo.p6" and display the resulting
+parse tree:
+
+ $ parrot p6shell.pir foo.p6
+
+To run the parser interactively, entering single-line statements
+and displaying the parse tree for each:
+
+ $ parrot p6shell.pir
+
+=head2 Files
+
+The "top" file for the parser is F<perl6.pir> which is used to
+create the F<perl6.pbc> file. It initializes the overall
+parsing system and registers the parser as a Parrot "Perl6" compiler.
+
+The other files needed for parsing are in the F<lib/> subdirectory.
+
+The F<lib/grammar.g> file defines the "top-down" grammar used for
+large Perl 6 program structures. It consists of rule statements
+defined in Perl 6 rules syntax and is compiled using the
+C<rulec.pir> "rules compiler" from PGE to produce a F<lib/grammar.pir>
+file with the PIR version of the rules. (For more information on
+Perl 6 rules, see Synopsis 5 and the Parrot Grammar Engine in the
+F<compilers/pge> directory.)
+
+The F<lib/parse.pir> file defines the "bottom-up" parser
+operators, as well as any special-purpose rules needed for
+parsing Perl 6 that are better written directly in PIR instead
+of using the top-down rules syntax or bottom-up operator
+precedence parser.
+
+The PIR files in F<lib/> are then included as part of compiling
+F<perl6.pir> to produce F<perl6.pbc>. One can then parse Perl 6
+source by doing:
+
+ $P0 = compreg "Perl6"
+ $S0 = "...perl 6 source code..."
+ $P1 = $P0($S0)
+ # $P1 holds the parse tree of the Perl 6 source code
+
+
+=head1 AUTHOR
+
+Patrick Michaud (pmichaud@pobox.com) is the author and maintainer.
+Patches and suggestions should be sent to the Perl 6 compiler list
+(perl6-compiler@perl.org).
+
+=cut
119 lib/grammar.g
@@ -0,0 +1,119 @@
+## TITLE
+## The Perl 6 grammar
+##
+## DESCRIPTION
+##
+## These are the rules used to compile Perl 6 programs.
+## This is just a first draft of a grammar for parsing
+## Perl 6 programs; undoubtedly more rules will be added
+## soon. Much of the work is hidden in the <opparse>
+## rule, which is defined in L<Perl6/parse.pir>
+## and handles most expressions using a bottom-up
+## parsing algorithm.
+
+grammar Perl6::Grammar ;
+
+
+## This rule handles whitespace and comments between tokens.
+## XXX: Add pod directive handling
+
+rule ws { [ \# \N+ | \s+ ]* ::: }
+
+
+rule program { ^ <statement_list> <?ws> [ $ | <?syntax_error> ] }
+
+
+rule statement_list { <statement> [ <?statement_end> <statement> ]* }
+
+
+rule statement_end {
+ <after ;> ::
+ | <after \}> :: <before \s*? [ \n | \# ]>
+}
+
+
+## XXX: Eventually this rule will probably be done
+## using a %statement_control hash. Since PGE doesn't
+## support hashes in rules yet, we're just using
+## an alternation for now.
+
+rule statement {
+ <if_statement>
+ | <while_statement>
+ | <expression>
+}
+
+
+## XXX: TODO: Add "elsif" and "else"
+rule if_statement {:w
+ if <expression> <block>
+}
+
+
+rule while_statement {:w
+ while <expression> <block>
+}
+
+
+rule block { <simple_block> | <pointy_sub> }
+
+
+## A <simple_block> is just a statement_list inside of a pair of braces.
+
+rule simple_block {
+ <?ws>:
+ \{
+ <?ws>: <statement_list> <?ws>:
+ [ \} | <?syntax_error> ]
+}
+
+
+## XXX: This isn't the real <pointy_sub> rule -- it doesn't know
+## how to parse arguments yet. It's just here as a placeholder
+## for now.
+
+rule pointy_sub { --\> <simple_block> }
+
+
+## We handle Perl 6 expressions using PGE's operator
+## precedence parser. The tokens for this are
+## defined in L<Perl6/parse.pir>.
+
+rule expression { <opparse> }
+
+
+## The <term> rule gets called from the operator precedence
+## parser whenever it needs a term.
+
+rule term {
+ <sigil> <name>
+ | <block>
+ | <integer>
+ | <number>
+ | <PGE::Text::bracketed: '">
+}
+
+
+## The <listop> rule gets called from the operator precedence
+## parser whenever it's looking for a term. At the moment
+## it primarily grabs bareword terms.
+
+rule listop { <reserved_word> ::: <fail> | <ident> }
+
+rule reserved_word { [ if | unless | while | until | for | loop ] \b }
+
+
+## XXX: These are just placeholder rules for demonstration;
+## they certainly need to be expanded to be more complete.
+
+rule sigil { <[$@%^]> }
+rule integer { \d+ }
+rule number { \d+ \. \d+ }
+
+
+## The <syntax_error> rule generates a simple syntax
+## error message, and displays the line number and context
+## of the error.
+
+rule syntax_error { <?die: Syntax error> }
+
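The lexical rules in lib/grammar.g (ws, reserved_word, and listop) can be approximated outside of PGE to see what they accept. The following is only a rough Python sketch under stated assumptions, not part of the commit and not how PGE evaluates rules: the names WS, RESERVED, IDENT, skip_ws, and match_listop are hypothetical, the IDENT pattern is a simplified stand-in for PGE's built-in <ident>, and the `:::` commit in listop is modelled as an early return.

```python
import re

# Rough analogue of `rule ws`: any run of "#..." comments and whitespace.
# As in the grammar's `\# \N+`, a comment needs at least one character after "#".
WS = re.compile(r"(?:#[^\n]+|\s+)*")

# Analogue of `rule reserved_word`: a keyword followed by a word boundary.
RESERVED = re.compile(r"(?:if|unless|while|until|for|loop)\b")

# Assumption: simplified stand-in for <ident> (letter or underscore, then word chars).
IDENT = re.compile(r"[A-Za-z_]\w*")


def skip_ws(text, pos=0):
    """Return the position just past any whitespace and comments at pos."""
    return WS.match(text, pos).end()


def match_listop(text, pos=0):
    """Return the bareword at pos, or None if it is reserved or absent."""
    if RESERVED.match(text, pos):
        # Mirrors `<reserved_word> ::: <fail>`: once a reserved word is seen,
        # the rule fails without trying the <ident> alternative.
        return None
    m = IDENT.match(text, pos)
    return m.group(0) if m else None
```

Because reserved_word ends in `\b`, a bareword that merely starts with a keyword (such as "format" or "iffy") should still be treated as a listop, while "if" and "for" themselves are rejected.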