merge html_cleanup branch with trunk

git-svn-id: https://svn.parrot.org/parrot/branches/html_cleanup@48139 d31e2699-5ff4-0310-a27c-f18f2fbe73fe
commit ee2ab72d4d2bde9e6e09c0d46e1bd68ffdaed0cb 2 parents 58142f3 + 4b53831
@mikehh mikehh authored
Showing with 34,281 additions and 32,573 deletions.
  1. +1 −2  CREDITS
  2. +47 −5 DEPRECATED.pod
  3. +6 −3 MANIFEST
  4. +7 −0 NEWS
  5. +1 −1  compilers/imcc/pbc.c
  6. +1 −0  compilers/pct/src/PAST/Compiler.pir
  7. +1 −1  config/gen/core_pmcs.pm
  8. +5 −1 config/gen/makefiles/editor.in
  9. +3 −3 config/gen/platform/generic/env.c
  10. +1 −1  docs/book/pir/ch04_variables.pod
  11. +4 −0 docs/project/release_manager_guide.pod
  12. +13 −7 docs/project/support_policy.pod
  13. +11 −5 editor/README.pod
  14. +0 −1  editor/filetype_parrot.vim
  15. +3 −2 examples/languages/squaak/MAINTAINER
  16. +1 −0  examples/languages/squaak/PARROT_REVISION
  17. +3 −49 examples/languages/squaak/README
  18. +25 −30 examples/languages/squaak/doc/tutorial_episode_1.pod
  19. +79 −81 examples/languages/squaak/doc/tutorial_episode_2.pod
  20. +102 −86 examples/languages/squaak/doc/tutorial_episode_3.pod
  21. +87 −90 examples/languages/squaak/doc/tutorial_episode_4.pod
  22. +93 −139 examples/languages/squaak/doc/tutorial_episode_5.pod
  23. +67 −48 examples/languages/squaak/doc/tutorial_episode_6.pod
  24. +120 −168 examples/languages/squaak/doc/tutorial_episode_7.pod
  25. +56 −62 examples/languages/squaak/doc/tutorial_episode_8.pod
  26. +3 −3 examples/languages/squaak/doc/tutorial_episode_9.pod
  27. +61 −35 examples/languages/squaak/setup.pir
  28. +4 −55 examples/languages/squaak/squaak.pir
  29. +404 −0 examples/languages/squaak/src/Squaak/Actions.pm
  30. +12 −0 examples/languages/squaak/src/Squaak/Compiler.pm
  31. +205 −0 examples/languages/squaak/src/Squaak/Grammar.pm
  32. +22 −0 examples/languages/squaak/src/Squaak/Runtime.pm
  33. +0 −99 examples/languages/squaak/src/builtins/say.pir
  34. +0 −446 examples/languages/squaak/src/parser/actions.pm
  35. +0 −235 examples/languages/squaak/src/parser/grammar.pg
  36. +56 −0 examples/languages/squaak/src/squaak.pir
  37. +3,588 −3,379 ext/nqp-rx/src/stage0/HLL-s0.pir
  38. +19,548 −18,433 ext/nqp-rx/src/stage0/NQP-s0.pir
  39. +9,437 −9,046 ext/nqp-rx/src/stage0/P6Regex-s0.pir
  40. +123 −15 ext/nqp-rx/src/stage0/Regex-s0.pir
  41. +8 −7 include/parrot/key.h
  42. +6 −5 include/parrot/runcore_api.h
  43. +2 −2 lib/Parrot/Configure/Options/Conf/File.pm
  44. +17 −0 lib/Parrot/Install.pm
  45. +1 −1  lib/Pod/Simple/Search.pm
  46. +5 −1 runtime/parrot/library/distutils.pir
  47. +9 −9 src/call/args.c
  48. +1 −1  src/dynpmc/rational.pmc
  49. +1 −0  src/key.c
  50. +1 −1  src/oo.c
  51. +1 −1  src/pmc/hash.pmc
  52. +12 −0 src/pmc/key.pmc
  53. +2 −2 src/pmc/sub.pmc
  54. +1 −0  src/runcore/main.c
  55. +1 −1  src/runcore/profiling.c
  56. +3 −3 src/vtable.tbl
  57. +4 −2 t/library/pcre.t
  58. +1 −1  t/steps/auto/format-01.t
  59. +3 −3 t/steps/auto/inline-01.t
  60. +1 −1  t/steps/init/hints-01.t
  61. +1 −1  t/tools/pmc2cutils/05-gen_c.t
3  CREDITS
@@ -129,8 +129,7 @@ D: Keeps us running
E: ask@develooper.com
N: Audrey Tang
-U: audreyt
-U: autrijus
+U: au
E: audreyt@audreyt.org
D: Pugs, a Perl6->Parrot implementation.
52 DEPRECATED.pod
@@ -43,6 +43,18 @@ deprecations specifically by including this snippet:
=over 4
+=item Task [eligible in 2.7]
+
+For the gsoc_threads branch, the word "Task" has been stolen
+for Green Threads. If this branch is merged, the current task PMC
+may be renamed or removed and related behavior may change.
+
+L<http://trac.parrot.org/parrot/ticket/1709>
+
+=item ParrotThread [eligible in 2.7]
+
+L<http://trac.parrot.org/parrot/ticket/1708>
+
=item GzipHandle [experimental]
L<https://trac.parrot.org/parrot/ticket/1580>
@@ -305,6 +317,27 @@ As of the latest PCC changes, this does nothing different from '.call'.
L<https://trac.parrot.org/parrot/ticket/1624>
+=item :main Sub behaviour and selection. [eligible in 2.7]
+
+Currently, if no :main sub is found, the first .sub in a file is used as
+main. Also, arguments are passed to the main sub regardless of the .param
+declarations in that sub.
+
+After this change, if no sub is marked with :main, an exception will be
+raised. Multiple :main declarations will be still be allowed, and all but the
+first will be ignored.
+
+This change will also force all subs, including :main, to have their
+arguments checked - to allow an arbitrary number of arguments, have
+this be the only .param declaration in the sub.
+
+ .param pmc args :slurpy
+
+
+L<https://trac.parrot.org/parrot/ticket/1033>
+L<https://trac.parrot.org/parrot/ticket/1704>
+L<https://trac.parrot.org/parrot/ticket/1705>
+
=back
=head1 Functions
@@ -406,13 +439,22 @@ use a different detection mechanism or eliminate it altogether.
L<https://trac.parrot.org/parrot/ticket/463>
-=item PAST::Val node generation [eligible in 1.5]
+=item PAST::Val node generation [eligible in 1.5]
+
+The PAST::Compiler may generate the code for PAST::Val nodes
+(i.e., constants) at the beginning of the block (Parrot sub) instead
+of the location where they occur in the PAST tree.
+
+L<https://trac.parrot.org/parrot/ticket/868>
-The PAST::Compiler may generate the code for PAST::Val nodes
-(i.e., constants) at the beginning of the block (Parrot sub) instead
-of the location where they occur in the PAST tree.
+=item Meta-model implementation used by PCT [eligible in 2.7]
-L<https://trac.parrot.org/parrot/ticket/868>
+PCT is set to switch to a new meta-model implementation for its classes
+and objects. This will most likely only affect those who rely on the
+interface of what is returned from .HOW, or rely on PCT objects exhibiting
+various other peculiarities of the P6object implementation. (Even when that
+is the case, the HOW API will not be changing too drastically, so for most
+PCT users there should be little to no upheavel.)
=back
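Reviewer note: the ":main Sub behaviour and selection" entry added to DEPRECATED.pod above describes a change in how the entry point is chosen. The following is a minimal Python sketch of the two selection rules as that entry words them. It is not Parrot code; the names `select_main_old`, `select_main_new`, and `NoMainSubError` are hypothetical, and each sub is modelled as a plain dict.

```python
class NoMainSubError(Exception):
    """Hypothetical error standing in for the exception the new rule raises."""


def select_main_old(subs):
    """Current rule: the first :main sub wins; otherwise the first sub in the file."""
    for sub in subs:
        if sub["main"]:
            return sub["name"]
    return subs[0]["name"]


def select_main_new(subs):
    """Proposed rule: the first :main sub wins; otherwise raise an exception."""
    for sub in subs:
        if sub["main"]:
            return sub["name"]
    raise NoMainSubError("no sub is marked with :main")


# Illustrative data only: two subs carry :main, so the second one is ignored.
with_main = [
    {"name": "helper", "main": False},
    {"name": "start", "main": True},
    {"name": "other_start", "main": True},
]
without_main = [{"name": "helper", "main": False}]

print(select_main_old(with_main))
print(select_main_new(with_main))
print(select_main_old(without_main))
```

Under the old rule a file with no `:main` sub still runs its first sub; under the new rule the same file is rejected, which is the behavioural difference the deprecation notice warns about.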
9 MANIFEST
@@ -626,6 +626,7 @@ examples/languages/abc/t/abc_functions [examples]
examples/languages/abc/t/abc_special_variables [examples]
examples/languages/abc/t/abc_statement [examples]
examples/languages/squaak/MAINTAINER [examples]
+examples/languages/squaak/PARROT_REVISION [examples]
examples/languages/squaak/README [examples]
examples/languages/squaak/doc/tutorial_episode_1.pod [examples]
examples/languages/squaak/doc/tutorial_episode_2.pod [examples]
@@ -640,9 +641,11 @@ examples/languages/squaak/examples/factorial.sq [examples]
examples/languages/squaak/examples/life.sq [examples]
examples/languages/squaak/setup.pir [examples]
examples/languages/squaak/squaak.pir [examples]
-examples/languages/squaak/src/builtins/say.pir [examples]
-examples/languages/squaak/src/parser/actions.pm [examples]
-examples/languages/squaak/src/parser/grammar.pg [examples]
+examples/languages/squaak/src/Squaak/Actions.pm [examples]
+examples/languages/squaak/src/Squaak/Compiler.pm [examples]
+examples/languages/squaak/src/Squaak/Grammar.pm [examples]
+examples/languages/squaak/src/Squaak/Runtime.pm [examples]
+examples/languages/squaak/src/squaak.pir [examples]
examples/languages/squaak/t/00-sanity.t [examples]
examples/languages/squaak/t/01-math.t [examples]
examples/library/acorn.life [examples]
7 NEWS
@@ -1,5 +1,12 @@
# $Id$
+New in 2.6.0
+- Platforms
+ + The Fedora package 'parrot-devel' install the files for syntax-highlighting
+ and automatic indenting for the vim editor.
+- Documentation
+ + Updated the Squaak tutorial to use modern NQP-rx and PCT.
+
New in 2.5.0
- Core
+ Added ByteBuffer PMC to allow direct byte manipulation
2  compilers/imcc/pbc.c
@@ -1465,7 +1465,7 @@ add_const_pmc_sub(PARROT_INTERP, ARGMOD(SymReg *r), size_t offs, size_t end)
if (unit->vtable_name) {
vtable_name = Parrot_str_new(interp, unit->vtable_name + 1,
strlen(unit->vtable_name) - 2);
- UNIT_FREE_CHAR(unit->method_name);
+ UNIT_FREE_CHAR(unit->vtable_name);
}
else
vtable_name = sub->name;
1  compilers/pct/src/PAST/Compiler.pir
@@ -28,6 +28,7 @@ basic flags are:
+ PMC, int register, num register, or numeric constant
~ PMC, string register, or string constant
: argument (same as '*'), possibly with :named or :flat
+ 0-9 use the nth input operand as the output result of this operation
These flags are used to describe signatures and desired return
types for various operations. For example, if an opcode is
2  config/gen/core_pmcs.pm
@@ -76,7 +76,7 @@ END_H
END_H
print {$OUT} coda();
- close $OUT or die "Can't close file: $!";;
+ close $OUT or die "Can't close file: $!";
move_if_diff( "$file.tmp", $file );
6 config/gen/makefiles/editor.in
@@ -1,4 +1,4 @@
-# Copyright (C) 2005-2009, Parrot Foundation.
+# Copyright (C) 2005-2010, Parrot Foundation.
# $Id$
#IF(win32):VIM_DIR = $(USERPROFILE)/vimfiles
@@ -13,6 +13,9 @@ CP = @cp@
MKPATH = @mkpath@
RM_F = @rm_f@
+SKEL_FILE_DIR = `$(PERL) -e 'print "$(SKELETON)" || "$(VIM_DIR)"'`
+LINE = "au BufNewFile *.pir 0r $(SKEL_FILE_DIR)/skeleton.pir"
+
default: all
all: pir.vim imc.kate skeleton.pir
@@ -49,6 +52,7 @@ vim-install: pir.vim skeleton.pir
$(CP) pmc.vim "$(VIM_SYN_DIR)"
$(MKPATH) "$(VIM_FT_DIR)"
$(CP) filetype_parrot.vim "$(VIM_FT_DIR)/parrot.vim"
+ echo $(LINE) >> "$(VIM_FT_DIR)/parrot.vim"
$(MKPATH) "$(VIM_IN_DIR)"
$(CP) indent_pir.vim "$(VIM_IN_DIR)/pir.vim"
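Reviewer note: the `SKEL_FILE_DIR` and `LINE` variables added to editor.in above pick a directory for `skeleton.pir` (the `SKELETON` value if set, otherwise `VIM_DIR`) and build the autocommand that `vim-install` appends to `parrot.vim`. Below is a rough Python sketch of that fallback, assuming both values arrive as plain strings. The function name `skeleton_autocmd` is hypothetical and the paths are examples.

```python
def skeleton_autocmd(skeleton, vim_dir):
    """Build the autocommand line that vim-install appends to parrot.vim.

    Mirrors the Perl one-liner's `"$(SKELETON)" || "$(VIM_DIR)"` fallback.
    Caveat: Perl also treats the string "0" as false, while Python's `or`
    only falls back on an empty string.
    """
    skel_dir = skeleton or vim_dir
    return f"au BufNewFile *.pir 0r {skel_dir}/skeleton.pir"


# SKELETON unset: the skeleton is read from the vim directory.
print(skeleton_autocmd("", "/home/user/.vim"))
# SKELETON set: it takes precedence over VIM_DIR.
print(skeleton_autocmd("/usr/share/parrot/skel", "/home/user/.vim"))
```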
6 config/gen/platform/generic/env.c
@@ -74,13 +74,13 @@ UnSet Environment vars
void
Parrot_unsetenv(PARROT_INTERP, STRING *str_name)
{
- char * const name = Parrot_str_to_cstring(interp, str_name);
#ifdef PARROT_HAS_UNSETENV
+ char * const name = Parrot_str_to_cstring(interp, str_name);
unsetenv(name);
+ Parrot_str_free_cstring(name);
#else
- Parrot_setenv(name, "");
+ Parrot_setenv(interp, str_name, Parrot_str_new(interp, "", 0));
#endif
- Parrot_str_free_cstring(name);
}
/*
2  docs/book/pir/ch04_variables.pod
@@ -182,7 +182,7 @@ it's true and the second argument otherwise:
$P0 = or $P1, $P2
-=end PIR_FRAGMENT_INVALID[
+=end PIR_FRAGMENT_INVALID
Both C<and> and C<or> are short-circuiting ops. If they can determine what
value to return from the first argument, they'll never evaluate the second.
4 docs/project/release_manager_guide.pod
@@ -261,6 +261,10 @@ works well for use Perl and PerlMonks, and text for the rest. It is not a
bad idea to add a "highlights" section to draw attention to major new
features, just be sure to say the same thing in both text and HTML versions.
+Compute the SHA1 sum of the tarball and append it to the release announcement:
+
+ $ sha1sum parrot-a.b.c.tar.gz
+
=item 10.
Update the website. You will need an account with editor rights
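Reviewer note: the step added to release_manager_guide.pod above uses the `sha1sum` command. Where that tool is not available, the same digest can be computed with Python's standard `hashlib` module. This is a sketch only; the helper name `sha1_of_file` is hypothetical, and the tarball name is the placeholder used in the guide.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha1sum_line(path):
    """Format the digest the way `sha1sum` prints it: digest, two spaces, name."""
    return f"{sha1_of_file(path)}  {path}"
```

Calling `sha1sum_line("parrot-a.b.c.tar.gz")` with the real tarball name gives a line that can be pasted into the release announcement.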
20 docs/project/support_policy.pod
@@ -71,13 +71,19 @@ After a feature is announced as deprecated, it might not appear in the
next supported release. We sometimes delay removing deprecated features
for various reasons, including dependencies by other parts of the core.
-The developer releases have more flexibility in feature removal, while
-still meeting the deprecation requirements for support releases. A
-feature that has a deprecation notification in the 2.0 release may be
-removed from any monthly developer release between 2.0 and the next
-supported release, though we're likely to stagger the removals. An
-experimental feature that was never included in a supported release may
-be removed before a supported release without a deprecation cycle.
+The developer releases have more flexibility in feature removal, while still
+meeting the deprecation requirements for support releases. A feature that has a
+deprecation notification in the 2.0 release may be removed from any monthly
+developer release between 2.0 and the next supported release, though we're
+likely to stagger the removals.
+
+=head2 Experimental Features
+
+From time to time, we may add features to get feedback on their utility and
+design. Marking them as "Experimental" in F<DEPRECATED.pod> indicates that we
+may modify or remove them without official deprecation notices. Use them at
+your own risk--and please provide feedback through official channels if you use
+them successfully or otherwise.
=head2 Supported Older Versions
16 editor/README.pod
@@ -14,14 +14,20 @@ your favorite editor. For a summary on what is available do
=head2 Vim
-Calling C<make vim-install> in the F<editor/> directory will
-install several files in F<~/.vim>. All these files have the F<.vim>
+By default calling C<make vim-install> in the F<editor/> directory will
+install several files in F<~/.vim>. You can use the variable C<VIM_DIR>
+on the command line by calling C<make> to choose a different target directory
+for the vim files.
+
+ make vim-install [VIM_DIR=/vim_files_target_directory]
+
+All these files have the F<.vim>
extension. F<pir.vim> (generated from F<pir_vim.in>), F<pasm.vim>, and
F<pmc.vim> are syntax files; F<indent_pir.vim> is an indent plugin;
and F<filetype_parrot.vim> is a filetype script that tells vim to
associate the extensions .pir, .pasm, and .pmc with the
right syntax. The syntax files are installed to F<~/.vim/syntax/>;
-F<filetype_parrot.vim> is installed to F<~/.vim/ftdetect>;
+F<filetype_parrot.vim> is installed to F<~/.vim/parrot.vim>;
F<indent_pir.vim> is copied to F<~/.vim/indent/pir.vim>. If you want
indenting, you should also place C<filetype indent on> somewhere in
your F<~/.vimrc>.
@@ -35,7 +41,7 @@ Run:
in F<editor/> to build it.
-TODO: How do we install Kate syntax files?
+Copy the file F<imcc.xml> to F<~/.kde/share/apps/katepart/syntax>.
=head2 Emacs
@@ -77,7 +83,7 @@ Additionally, you might want to add:
(function (lambda ()
(setq indent-tabs-mode nil))))
-to F<~/.emacs> as this seems to prevent the odd behavior that is noted when
+to F<~/.emacs> as this seems to prevent the odd behavior that is noted when
using tabs in the pasm mode.
=back
1  editor/filetype_parrot.vim
@@ -1,4 +1,3 @@
au BufNewFile,BufRead *.pmc set ft=pmc cindent
au BufNewFile,BufRead *.pasm set ft=pasm ai sw=4
au BufNewFile,BufRead *.pir set ft=pir ai sw=4
-au BufNewFile *.pir 0r ~/.vim/skeleton.pir
5 examples/languages/squaak/MAINTAINER
@@ -1,4 +1,5 @@
# $Id$
-N: Klaas-Jan Stol (kj,kjs)
-E: parrotcode at gmail dot com
+N: Tyler Curtis
+U: tcurtis
+E: tyler.l.curtis@gmail.com
1  examples/languages/squaak/PARROT_REVISION
@@ -0,0 +1 @@
+47087
52 examples/languages/squaak/README
@@ -1,51 +1,5 @@
-Squaak: A Simple Language
-
-Squaak is a case-study language described in the Parrot Compiler Tools
-tutorial at http://www.parrotblog.org/2008/03/targeting-parrot-vm.html.
-
-Note that Squaak is NOT an implementation Squeak; it has nothing to do
-with any SmallTalk implementation.
-
-Squaak demonstrates some common language constructs, but at the same
-time is currently lacking some other, seemingly simple features. For instance,
-Squaak does not have break or continue statements (or equivalents
-in your favorite syntax). Once PCT has built-in support for these, they
-will be added.
-
-Squaak has the following features:
-
- * global and local variables
- * basic types: integer, floating-point and strings
- * aggregate types: arrays and hash tables
- * operators: +, -, /, *, %, <, <=, >, >=, ==, !=, .., and, or, not
- * subroutines and parameters
- * assignments and various control statements, such as "if" and "while" and "return"
- * library functions: print, read
-
-A number of common (more advanced) features are missing.
-Most notable are:
-
- * classes and objects
- * exceptional control statements such as break and continue
- * advanced control statements such as switch
- * closures (nested subroutines and accessing local variables in an outer scope)
-
-Squaak is designed to be a simple showcase language, to show the use of the
-Parrot Compiler Tools for implementing a language.
-
-In order to use Squaak:
-
- $ make
-
-Running Squaak in interactive mode:
-
- $ ../../parrot squaak.pbc
-
-Running Squaak with a file (for instance, the included Game of Life example):
-
- $ ../../parrot squaak.pbc examples/life.sq
-
-Bug reports and improvements can be sent to the maintainer or Parrot porters
-mailing list.
+Language 'Squaak' was created with tools/dev/mk_language_shell.pl, r47087.
+ $ parrot setup.pir
+ $ parrot setup.pir test
55 examples/languages/squaak/doc/tutorial_episode_1.pod
@@ -99,36 +99,29 @@ parts:
=over 4
-=item B<P>arrot B<G>rammar B<E>ngine (PGE).
+=item B<N>ot B<Q>uite B<P>erl (6) (NQP-rx).
-The PGE is an advanced engine for regular expressions. Besides regexes as found
-in Perl 5, it can also be used to define language grammars, using Perl 6 syntax.
-(Check the references for the specification.)
+NQP is a lightweight language inspired by Perl 6 and can be used to write the
+methods that must be executed during the parsing phase, just as you can write
+actions in a Yacc/Bison input file. It also provides the regular expression engine we'll use to
+write our grammar. In addition to the capabilities of Perl 5's regexes, the Perl 6 regexes that NQP
+ implements can be used to define language grammars. (Check the references for the specification.)
=item B<P>arrot B<A>bstract B<S>yntax B<T>ree (PAST).
The PAST nodes are a set of classes defining generic abstract syntax tree nodes
that represent common language constructs.
-=item HLLCompiler class.
+=item HLL::Compiler class.
This class is the compiler driver for any PCT-based compiler.
-=item B<N>ot B<Q>uite B<P>erl (6) (NQP).
-
-NQP is a lightweight language inspired by Perl 6 and can be used to write the
-methods that must be executed during the parsing phase, just as you can write
-actions in a Yacc/Bison input file.
-
=back
=head2 Getting Started
For this tutorial, it is assumed you have successfully compiled parrot
-(and maybe even run the test suite). If you browse through the languages
-directory in the Parrot source tree, you'll find a number of language
-implementations. Most of them are not complete yet; some are maintained
-actively and others aren't. If, after reading this tutorial, you feel like
+(and maybe even run the test suite). If, after reading this tutorial, you feel like
contributing to one of these languages, you can check out the mailing list or
join IRC (see the references section for details).
@@ -137,13 +130,13 @@ Parrot comes with a special shell script to generate the necessary files for a
language implementation. In order to generate these files for our language,
type (assuming you're in Parrot's root directory):
- $ perl tools/dev/mk_language_shell.pl Squaak languages/squaak
+ $ perl tools/dev/mk_language_shell.pl Squaak ~/src/squaak
(Note: if you're on Windows, you should use backslashes.) This will generate the
-files in a directory F<languages/squaak>, and use the name Squaak as the language's
+files in a directory F<~/src/squaak>, and use the name Squaak as the language's
name.
-After this, go to the directory F<languages/squaak> and type:
+After this, go to the directory F<~/src/squaak> and type:
$ parrot setup.pir test
@@ -165,20 +158,22 @@ your favorite editor, and put in this statement:
Save it the as file F<test.sq> and type:
- $ ../../parrot squaak.pbc test.sq
+ $ ./installable_squaak test.sq
+"installable_squaak" is a "fake-cutable" an executable that bundles the Parrot interpreter and the
+compiled bytecode for a program to allow treating a Parrot program as a normal executable program.
This will run Parrot, specifying squaak.pbc as the file to be run by Parrot,
which takes a single argument: the file test.sq. If all went well, you should
see the following output:
- $ ../../parrot squaak.pbc test.sq
+ $ ./installable_squaak test.sq
Squaak!
Instead of running a script file, you can also run the Squaak compiler as an
interactive interpreter. Run the Squaak compiler without specifying a script
file, and type the same statement as you wrote in the file:
- $ ../../parrot squaak.pbc
+ $ ./installable_squaak
say "Squaak!";
which will print:
@@ -191,7 +186,7 @@ This first episode of this tutorial is mainly an overview of what will be
coming. Hopefully you now have a global idea of what the Parrot Compiler Tools
are, and how they can be used to build a compiler targeting Parrot. If you want
to check out some serious usage of the PCT, check out Rakudo (Perl 6 on Parrot)
-in languages/perl6 or Pynie (Python on Parrot) in languages/pynie.
+at http://rakudo.org/ or Pynie (Python on Parrot) at http://code.google.com/p/pynie/ .
The next episodes will focus on the step-by-step implementation of our language,
including the following topics:
@@ -200,7 +195,7 @@ including the following topics:
=item structure of PCT-based compilers
-=item using PGE rules to define the language grammar
+=item using NQP-rx rules to define the language grammar
=item implementing operator precedence using an operator precedence table
@@ -223,8 +218,8 @@ will be posted several days after the episode.
=head3 Advanced interactive mode.
-Launch your favorite editor and look at the file squaak.pir in the directory
-languages/squaak. This file contains the main function (entry point) of the
+Launch your favorite editor and look at the file Compiler.pm in the directory
+F<~/src/squaak/src/Squaak/>. This file contains the main function (entry point) of the
compiler. The class HLLCcompiler defines methods to set a command-line banner
and prompt for your compiler when it is running in interactive mode. For
instance, when you run Python in interactive mode, you'll see:
@@ -242,19 +237,19 @@ which is called a prompt. For Squaak, we'd like to see the following when
running in interactive mode (of course you can change this according to your
personal taste):
- $ ../../parrot squaak.pbc
+ $ ./installable_squaak
Squaak for Parrot VM.
>
Add code to the file squaak.pir to achieve this.
-Hint 1: Look in the onload subroutine.
+Hint 1: Look in the INIT block.
-Hint 2: Note that only double-quoted strings in PIR can interpret
+Hint 2: Note that only double-quoted strings in NQP can interpret
escape-characters such as '\n'.
Hint 3: The functions to do this are documented in
-compilers/pct/src/PCT/HLLCompiler.pir.
+F<compilers/pct/src/PCT/HLLCompiler.pir>.
=head2 References
@@ -270,7 +265,7 @@ compilers/pct/src/PCT/HLLCompiler.pir.
=item * Operator Precedence Parsing with PCT: docs/pct/pct_optable_guide.pod
-=item * Perl 6/PGE rules syntax: Synopsis 5
+=item * Perl 6/NQP rules syntax: Synopsis 5 at http://perlcabal.org/syn/S05.html or http://svn.pugscode.org/pugs/docs/Perl6/Spec/S05-regex.pod
=back
160 examples/languages/squaak/doc/tutorial_episode_2.pod
@@ -24,7 +24,7 @@ file, or invoke the compiler without a command line argument, in which case our
compiler enters the interactive mode. Consider the first case, passing the file
test.sq, just as we did before:
- $ ../../parrot squaak.pbc test.sq
+ $ ./installable_squeak test.sq
When invoking our compiler like this, the file test.sq is compiled and the
generated code (bytecode) is executed immediately by Parrot. How does this work,
@@ -50,7 +50,7 @@ four default compilation phases of an HLLCompiler object:
This is an example of using the target option set to "parse", which will print
the parse tree of the input to stdout:
- $ ../../parrot squaak.pbc --target=parse test.sq
+ $ ./installable_squeak --target=parse test.sq
In interactive mode, giving this input:
@@ -58,24 +58,32 @@ In interactive mode, giving this input:
will print this parse tree (without the line numbers):
- 1 "parse" => PMC 'Squaak::Grammar' => "say 42;\r\n" @ 0 {
- 2 <statement> => ResizablePMCArray (size:1) [
- 3 PMC 'Squaak::Grammar' => "say 42;\r\n" @ 0 {
- 4 <value> => ResizablePMCArray (size:1) [
- 5 PMC 'Squaak::Grammar' => "42" @ 4 {
- 6 <integer> => PMC 'Squaak::Grammar' => "42" @ 4
- 7 }
- 8 ]
- 9 }
- 10 ]
- 11 }
+ 1 "parse" => PMC 'Regex;Match' => "say 42;\n" @ 0 {
+ 2 <statementlist> => PMC 'Regex;Match' => "say 42;\n" @ 0 {
+ 3 <statement> => ResizablePMCArray (size:1) [
+ 4 PMC 'Regex;Match' => "say 42" @ 0 {
+ 5 <statement_control> => PMC 'Regex;Match' => "say 42" @ 0 {
+ 6 <sym> => PMC 'Regex;Match' => "say" @ 0
+ 7 <EXPR> => ResizablePMCArray (size:1) [
+ 8 PMC 'Regex;Match' => "42" @ 4 {
+ 9 <integer> => PMC 'Regex;Match' => "42" @ 4 {
+ 10 <VALUE> => PMC 'Regex;Match' => "42" @ 4
+ 11 <decint> => \parse[0][0]
+ 12 }
+ 13 }
+ 14 ]
+ 15 }
+ 16 }
+ 17 ]
+ 18 }
+ 19 }
When changing the value of the target option, the output changes into a
different representation of the input. Why don't you try that right now?
So, a HLLCompiler object has four compilation phases: parsing, construction of a
Parrot Abstract Syntax Tree (PAST), construction of a Parrot Opcode Syntax Tree
-(POST), generation of Parrot Intermediate Representation (PIR). After
+(POST) and generation of Parrot Intermediate Representation (PIR). After
compilation, the generated PIR is executed immediately.
If your compiler needs additional stages, you can add them to your HLLCompiler
@@ -89,58 +97,55 @@ simultaneously. Therefore, these are discussed together.
Parse phase: match objects and PAST construction
During the parsing phase, the input is analyzed using Perl 6's extended regular
expressions, known as Rules (see Synopsis 5 for details). When a rule matches
-some input string, a so-called Match object is created. A Match object is a
-combined array and hashtable, implying it can be indexed by integers as well as
+some input string, a Match object is created. A Match object is a
+combined array and hashtable and can be indexed by integers as well as
strings. As rules typically consist of other (sub) rules, it is easy to retrieve
a certain part of the match. For instance, this rule:
rule if_statement {
'if' <expression> 'then' <statement> 'end'
- {*}
}
has two other subrules: expression and statement. The match object for the rule
-if_statement represents the whole string from if to end. When you're interested
-only in the expression or statement part, you can retrieve that by indexing the
-match object by the name of the subrule (in this case, expression and statement,
-respectively).
+C<if_statement> represents the whole string from if to end. You can retrieve a
+the Match for a subrule by indexing into the Match object using the name of
+that subrule. For instance, to get the match for C<< <expression> >>, you
+would use C<< $/<expression> >>. (In nqp, C<< $foo<bar> >> indexes into
+C<$foo> using the constant string C<bar> as a hash key.)
During the parse phase, the PAST is constructed. There is a small set of PAST
-node types, for instance, C<PAST::Var> to represent variables (identifiers, such
-as C<print>), C<PAST::Val> to represent literal values (for instance, C<"hello">
-and C<42>), and so on. Later we shall discuss the various PAST nodes in more
-detail.
+node types. For instance, C<PAST::Var> to represent variables (identifiers, such
+as C<print>) and C<PAST::Val> to represent literal values (for instance, C<"hello">
+and C<42>). Later we'll go through the various PAST nodes in more detail.
Now, you might wonder, at which point exactly is this PAST construction
-happening? This is where the special {*} symbol comes in, just below the string
-'if' in the if_statement rule shown above. These special markers indicate that a
-parse action should be invoked. Such a parse action is just a method that has
-the same name as the rule in which it is written (in this case: if_statement).
-So, during the parsing phase, several parse actions are executed, each of which
-builds a piece of the total PAST representing the input string. More on this
-will be explained later.
-
-The Parrot Abstract Syntax Tree is just a different representation of the same
-input string (your program being compiled). It is a convenient data structure to
-transform into something different (such as executable Parrot code) but also to
-do all sorts of analysis, such as compile-time type checking.
+happening? At the end of a successfully matching rule, the rule's parse action
+is performed. Such a parse action is just a method that has the same name as
+the rule which triggers it (in this case: C<if_statement>). So, during the
+parsing phase, several parse actions are executed, each of which builds a piece
+of the total PAST representing the input string.
+
+A Parrot Abstract Syntax Tree is just a compiler-friendly tree-based
+representation of your program. It is convenient both for analysis and
+optimization, and for further transformation into a lower-level representation
+such as POST.
=head2 PAST to POST
-After the parse phase during which the PAST is constructed, the HLLCompiler
-transforms this PAST into something called a Parrot Opcode Syntax Tree (POST).
-The POST representation is also a tree structure, but these nodes are on a lower
-abstraction level. For instance, on the PAST level there is a node type to
-represent a while statement (constructed as
-C<PAST::Op.new( :pasttype('while') )> ).
+After the PAST is constructed, the HLLCompiler transforms this PAST into a
+Parrot Opcode Syntax Tree (POST). The POST representation is also a tree
+structure, but these nodes are on a lower abstraction level and correspond very
+closely to PIR ops. For instance, the PAST node type which represents a while
+statement (constructed as C<PAST::Op.new( :pasttype('while') )> ) decomposes
+into several POST nodes.
-The template for a while statement typically consists of a number of labels and
+The template for a C<while> statement typically consists of a number of labels and
jump instructions. On the POST level, the same while statement is represented by
-a set of nodes, each representing a one instruction or a label. Therefore, it is
-much easier to transform a POST into something executable than when this is done
-from the PAST level.
+a set of nodes, each representing a one instruction or a label. This makes it
+much easier to transforn POST into executable code.
+
Usually, as a user of the PCT, you don't need to know details of POST nodes,
-which is why this will not be discussed in further detail. Use the target option
+which is why this will not be discussed in further detail. Use C<--target=post>
to see what a POST looks like.
=head2 POST to PIR
@@ -168,34 +173,35 @@ four stages:
=back
-where we noted that the first two are done during the parse stage. Now, as
-you're reading this tutorial, you're probably interested in using the PCT for
-implementing Your Favorite Language for Parrot. We already saw that a language
-grammar is expressed in Perl 6 Rules. What about the other transformations?
-Well, earlier in this episode we mentioned the term parse actions, and that
-these actions create PAST nodes. After you have written a parse action for each
-grammar rule, you're done!
+The first two transformations happen during the parse stage. Now, as you're
+reading this tutorial, you're probably interested in using the PCT to implement
+Your Favorite Language on top of Parrot. We already saw that a language grammar
+is expressed in Perl 6 Rules. What about the other transformations? Well,
+earlier in this episode we mentioned parse actions and that these actions
+create PAST nodes. After you have written a parse action for each grammar rule,
+you're done!
Say what?
That's right. Once you have correctly constructed a PAST, your compiler can
generate executable PIR, which means you just implemented your first language
-for Parrot. Of course, you still need to implement any language specific
-libraries, but that's besides the point.
+on top of Parrot. Of course, you'll still need to implement any language specific
+libraries, but that's beside the point.
-PCT-based compilers already know how to transform a PAST into a POST, and how to
-transform a POST into PIR. These transformation stages are already provided by
+PCT-based compilers already know how to transform PAST into POST and how to
+transform POST into PIR. These transformation stages are already provided by
the PCT.
=head2 What's next?
In this episode we took a closer look at the internals of a PCT-based compiler.
-We discussed the four compilation stages, that transform an input string (a
-program, or script, depending on your definition) into a PAST, a POST and
-finally executable PIR.
+We discussed the four compilation stages which transform an input string (a
+program or script, depending on your definition) into PAST, POST and finally
+executable PIR.
+
The next episodes are where the Fun Stuff is: we will be implementing Squaak for
Parrot. Piece by piece, we will implement the parser and the parse actions.
-Finally, we shall demonstrate John Conway's "Game of Life" running on Parrot,
+Finally, we'll demonstrate John Conway's "Game of Life" running on Parrot,
implemented in Squaak.
=head2 Exercises
@@ -203,27 +209,22 @@ implemented in Squaak.
Last episode's exercise was to add a command line banner and prompt for the
interactive mode of our compiler. Given the hints that were provided, it was
probably not too hard to find the solution, which is shown below. This
-subroutine onload can be found in the file Squaak.pir. The relevant lines are
+INIT block can be found in the file src/Squaak/Compiler.pm. The relevant lines are
marked with a comment:
- .sub 'onload' :anon :load :init
- load_bytecode 'PCT.pbc'
-
- $P0 = get_hll_global ['PCT'], 'HLLCompiler'
- $P1 = $P0.'new'()
- $P1.'language'('Squaak')
- $P1.'parsegrammar'('Squaak::Grammar')
- $P1.'parseactions'('Squaak::Grammar::Actions')
+ INIT {
+ Squaak::Compiler.language('Squaak');
+ Squaak::Compiler.parsegrammar(Squaak::Grammar);
+ Squaak::Compiler.parseactions(Squaak::Actions);
- $P1.'commandline_banner'("Squaak for Parrot VM\n") ## set banner
- $P1.'commandline_prompt'('> ') ## set prompt
-
- .end
+ Squaak::Compiler.commandline_banner("Squaak for Parrot VM.\n"); # set banner
+ Squaak::Compiler.commandline_prompt('> '); # set prompt
+ }
Starting in the next episode, the exercises will be more interesting. For now,
it would be useful to browse around through the source files of the compiler,
-and see if you understand the relation between the grammar rules in grammar.pg
-and the methods in actions.pm.
+and see if you understand the relation between the grammar rules in src/Squaak/Grammar.pm
+and the methods in src/Squaak/Actions.pm.
It's also useful to experiment with the --target option described in this
episode. If you don't know PIR, now is the time to do some preparation for that.
There's sufficient information to be found on PIR, see the References section
@@ -236,10 +237,7 @@ whatnot, don't hesitate to leave a comment.
=item 1. PIR language specification: docs/pdds/draft/PDD19_pir.pod
-=item 2. PIR articles: docs/art/*.pod
-
=back
-
=cut
188 examples/languages/squaak/doc/tutorial_episode_3.pod
@@ -10,11 +10,10 @@ Starting from a high-level overview, we quickly created our own little scripting
language called I<Squaak>, using a Perl script provided with Parrot. We
discussed the general structure of PCT-based compilers, and each of the default
four transformation phases.
-This third episode is where the Fun begins. In this episode, we shall introduce
-the full specification of Squaak. In this and following episodes, we will
-implement this specification step by step, in small increments that are easy to
-digest. Once you get a feel for it, you'll notice implementing Squaak is almost
-trivial, and most important, a lot of fun! So, let's get started!
+This third episode is where the Fun begins. In this episode, we'll introduce
+the full specification of Squaak. In this and following episodes, we'll
+implement this specification step by step in small easy-to-digest increments.
+So let's get started!
=head2 Squaak Grammar
@@ -26,7 +25,7 @@ specification uses the following meta-syntax:
[step] indicates an optional step
'do' indicates the keyword 'do'
-Below is Squaak's grammar. The start symbol is program.
+Below is Squaak's grammar. The start symbol is C<program>.
program ::= {stat-or-def}
@@ -123,66 +122,66 @@ Gee, that's a lot, isn't it? Actually, this grammar is rather small compared to
"real world" languages such as C, not to mention Perl 6. No worries though, we
won't implement the whole thing at once, but in small steps. What's more, the
exercises section contains enough exercises for you to learn to use the PCT
-yourself! The solutions to these exercises will be posted a few days later (but
-you really only need a couple of hours to figure them out).
+yourself! The solutions to these exercises are in later episodes if you don't
+want to take the time to solve them yourself.
=head2 Semantics
-Most of the Squaak language is straightforward; the if-statement executes
+Most of the Squaak language is straightforward; the C<if-statement> executes
exactly as you would expect. When we discuss a grammar rule (for its
-implementation), a semantic specification will be included. This is to prevent
-myself from writing a complete language manual, which could take some pages.
-
-=head2 Interactive Squaak
-
-Although the Squaak compiler can be used in interactive mode, there is one point
-of attention to be noted. When defining a local variable using the C<var>
-keyword, this variable will be lost in any consecutive commands. The variable
-will only be available to other statements within the same command (a command is
-a set of statements before you press enter). This has to do with the code
-generation by the PCT, and will be fixed at a later point. For now, just
-remember it doesn't work.
+implementation), a semantic specification will be included. This is to avoid
+writing a complete language manual since that's probably not what you're here
+for.
=head2 Let's get started!
In the rest of this episode we will implement the basic parts of the grammar,
such as the basic data types and assignments. At the end of this episode,
-you'll be able to assign simple values to (global) variables. It ain't much, but
+you'll be able to assign simple values to (global) variables. It's not much but
it's a very important first step. Once these basics are in place, you'll notice
-that adding a certain syntactic construct becomes a matter of minutes.
+that adding a certain syntactic construct can be done in a matter of minutes.
-First, open your editor and open the files F<src/parser/grammar.pg> and
-F<src/parser/actions.pm>. The former implements the parser using Perl 6 rules,
+First, open your editor and open the files F<src/Squaak/Grammar.pm> and
+F<src/Squaak/Actions.pm>. The former implements the parser using Perl 6 rules
and the latter contains the parse actions, which are executed during the parsing
stage.
-In the file grammar.pg, you'll see the top-level rule, named C<TOP>. It's
+In the file Grammar.pm you'll see the top-level rule, named C<TOP>. It's
located at, ehm... the top. When the parser is invoked, it will start at this
-rule (a rule is nothing else than a method of the grammar class).
-When we generated this language (in the first episode), some default rules were
-defined. Now we're going to make some small changes, just enough to get us
-started. Firstly, change the statement rule to this:
+rule. A rule is nothing else than a method of the Grammar class. When we
+generated this language some default rules were defined. Now we're going to
+make some small changes, just enough to get us started. Replace the
+C<statement> rule with this rule:
rule statement {
<assignment>
- {*}
}
-and add these rules:
+Replace the statementlist rule with this:
+
+ rule statementlist {
+ <stat_or_def>*
+ }
+
+When you work on the action methods later, you'll also want to replace $<statement> in the action
+method with $<stat_or_def>.
+
+Add these rules:
+
+ rule stat_or_def {
+ <statement>
+ }
rule assignment {
- <primary> '=' <expression>
- {*}
+ <primary> '=' <EXPR>
}
rule primary {
<identifier>
- {*}
}
token identifier {
<!keyword> <ident>
- {*}
}
token keyword {
@@ -190,15 +189,33 @@ and add these rules:
|'not'|'or' |'sub' |'throw'|'try' |'var'|'while']>>
}
-Now, change the rule "value" into this (renaming to "expression"):
+ token term:sym<primary> {
+ <primary>
+ }
+
+Rename the token C<< term:sym<integer> >> to C<< term:sym<integer_constant> >> and
+C<< term:sym<quote> >> to C<< term:sym<string_constant> >> (to better match our
+language specification).
+
+Add action methods for term:sym<integer_constant> and term:sym<string_constant>
+to F<src/Squaak/Actions.pm>:
- rule expression {
- | <string_constant> {*} #= string_constant
- | <integer_constant> {*} #= integer_constant
+ method term:sym<integer_constant>($/) {
+ make PAST::Val.new(:value($<integer>.ast), :returns<Integer>);
+ }
+ method term:sym<string_constant>($/) {
+ my $past := $<quote>.ast;
+ $past.returns('String');
+ make $past;
}
+ method term:sym<primary>($/) {
+ make $<primary>.ast;
+ }
+
+PAST::Val nodes are used to represent constant values.
-Rename the rule C<integer> as C<integer_constant>, and C<quote> as
-C<string_constant> (to better match our language specification).
+Finally, remove the rules C<proto token statement_control>,
+C<< rule statement_control:sym<say> >>, and C<< rule statement_control:sym<print> >>.
Phew, that was a lot of information! Let's have a closer look at some things
that may look unfamiliar. The first new thing is in the rule C<identifier>.
@@ -206,16 +223,21 @@ Instead of the C<rule> keyword, you see the keyword C<token>. In short, a token
doesn't skip whitespace between the different parts specified in the token,
while a rule does. For now, it's enough to remember to use a token if you want
to match a string that doesn't contain any whitespace (such as literal constants
-and identifiers), and use a rule if your string does (and should) contain
+and identifiers) and use a rule if your string does (and should) contain
whitespace (such as an if-statement). We shall use the word C<rule> in a
general sense, which could refer to a token. For more information on rules and
-tokens (and there's a third type, called C<regex>), take a look at synopsis 5.
+tokens take a look at Synopsis 5 or look at Moritz's blog post on the subject
+in the references.
+
+In rule C<assignment>, the <EXPR> subrule is one that we haven't defined. The
+EXPR rule is inherited from HLL::Grammar, and it initiates the grammar's
+operator-precedence parser to parse an expression. For now, don't worry about
+it. All you need to know is that it will give us one of our terms.
-In token C<identifier>, the first subrule is called an assertion. It asserts
-that an C<identifier> does not match the rule keyword. In other words, a keyword
-cannot be used as an identifier. The second subrule is called C<ident>, which is
-a built-in rule in the class C<PCT::Grammar>, of which this grammar is a
-subclass.
+In token C<identifier> the first subrule is called an assertion. It asserts
+that an C<identifier> does not match the rule keyword. In other words a keyword
+cannot be used as an identifier. The second subrule is called C<ident> which is
+a built-in rule in the class C<PCT::Grammar>, the parent class of this grammar.
In token C<keyword>, all keywords of Squaak are listed. At the end there's a
C<<< >> >>> marker, which indicates a word boundary. Without this marker, an
@@ -225,18 +247,6 @@ identifier such as "forloop" would wrongly be disqualified, because the part
matched), the string "forloop" cannot be matched as an identifier. The required
presence of the word boundary prevents this.
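
The effect of the word boundary can be tried out with an ordinary regular
expression. The following is only a rough Python analogy, not Perl 6 rules or
PCT code: the negative lookahead stands in for C<< <!keyword> >>, C<\b> stands
in for the C<<< >> >>> marker, and the keyword list is a small illustrative
subset rather than the full Squaak list. The names C<KEYWORDS>,
C<IDENT_WITH_BOUNDARY>, C<IDENT_NO_BOUNDARY> and C<is_identifier> are made up
for this sketch.

```python
import re

# Illustrative subset of keywords only; the real list is in token keyword.
KEYWORDS = ["for", "not", "or", "sub", "throw", "try", "var", "while"]
KEYWORD_ALT = "|".join(KEYWORDS)

# (?!...) plays the role of <!keyword>; \b plays the role of the >> marker.
IDENT_WITH_BOUNDARY = re.compile(rf"(?!(?:{KEYWORD_ALT})\b)[A-Za-z_]\w*")

# Same pattern without the boundary: any keyword *prefix* disqualifies the word.
IDENT_NO_BOUNDARY = re.compile(rf"(?!(?:{KEYWORD_ALT}))[A-Za-z_]\w*")


def is_identifier(text, pattern=IDENT_WITH_BOUNDARY):
    """Return True if the whole text is accepted as an identifier."""
    return pattern.fullmatch(text) is not None
```

With the boundary in place, C<is_identifier("forloop")> is accepted while
C<is_identifier("for")> is rejected; without it, "forloop" is rejected as well.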
-The last rule is C<expression>. An expression is either a string-constant or an
-integer-constant. Either way, an action is executed. However, when the action is
-executed, it does not know what the parser matched; was it a string-constant, or
-an integer-constant? Of course, the match object can be checked, but consider
-the case where you have 10 alternatives, then doing 9 checks only to find out
-the last alternative was matched is somewhat inefficient (and adding new
-alternatives requires you to update this check). That's why you see the special
-comments starting with a "#=" character. Using this notation, you can specify a
-key, which will be passed as a second argument to the action method. As we will
-see, this allows us to write very simple and efficient action methods for rules
-such as expression. (Note there's a space between the C<#=> and the key's name).
-
=head2 Testing the Parser
It is useful to test the parser before writing any action methods. This can save
@@ -253,24 +263,21 @@ you know for sure your parser doesn't accept input that it shouldn't.
Now we have implemented the initial version of the Squaak grammar, it's time to
implement the parse actions we mentioned before. The actions are written in a
-file called F<src/parser/actions.pm>. If you look at the methods in this file,
-here and there you'll see that the match object ($/) , or rather, hash fields of
-it (like $<statement>) is evaluated in scalar context, by writing "$( ... )".
-As mentioned in Synopsis 5, evaluating a Match object in scalar context returns
-its result object. Normally the result object is the matched portion of the
-source text, but the special make function can be used to set the result object
-to some other value.
+file called F<src/Squaak/Actions.pm>. If you look at the methods in this file,
+here and there you'll see the C<ast> method being called on the match object ($/), or rather, on
+hash fields of it (like $<statement>).
+The special make function can be used to set the ast to a value.
This means that each node in the parse tree (a Match object) can also hold its
PAST representation. Thus we use the make function to set the PAST
-representation of the current node in the parse tree, and later use the $( ... )
-operator to retrieve the PAST representation from it.
+representation of the current node in the parse tree, and later use the C<ast>
+method to retrieve the PAST representation from it.
In recap, the match object ($/) and any subrules of it (for instance
$<statement>) represent the parse tree; of course, $<statement>
represents only the parse tree that the $<statement> rule matched. So, any
action method has access to the parse tree that the equally named grammar rule
-matched, as the match object is always passed as an argument. Evaluating a parse
-tree in scalar context yields the PAST representation (obviously, this PAST
+matched, as the match object is always passed as an argument. Calling the C<ast> method
+on a parse tree yields the PAST representation (obviously, this PAST
object should be set using the make function).
If you're following this tutorial, I highly advise you to get your feet wet, and
@@ -300,8 +307,7 @@ Squaak.
=item 1.
Rename the action methods according to the name changes we made on
-the grammar rules. So, "integer" becomes "integer_constant", "value" becomes
-"expression", and so on.
+the grammar rules. So, "integer" becomes "integer_constant", and so on.
=item 2.
@@ -326,6 +332,11 @@ out how you do such a binding).
=item 5.
+Write the action method for stat_or_def. Simply retrieve the result object from statement and make
+that the result object.
+
+=item 6.
+
Run your compiler on a script or in interactive mode. Use the target option to
see what PIR is being generated on the input "x = 42".
@@ -338,8 +349,7 @@ see what PIR is being generated on the input "x = 42".
=item * Help! I get the error message "no result object".
This means that the result object was not set properly (duh!).
-Make sure each action method is invoked (check each rule for a "{*}" marker),
-and that there is an action method for that rule, and that "make" is used to set
+Make sure there is an action method for that rule and that "make" is used to set
the appropriate PAST node. Note that not all rules have action methods, for
instance the C<keyword> rule (there's no point in that).
@@ -354,6 +364,8 @@ we'll need them, these rules will be added.
=over 4
+=item * rules, regexes and tokens: http://perlgeek.de/blog-en/perl-5-to-6/07-rules.writeback#Named_Regexes_and_Grammars
+
=item * pdd26: ast
=item * synopsis 5: Rules
@@ -374,8 +386,7 @@ the end of Episode 2, and the latter didn't have any coding assignments).
=item 1
Rename the action methods according to the name changes we made
-on the grammar rules. So, "integer" becomes "integer_constant", "value" becomes
-"expression", and so on.
+on the grammar rules. So, "integer" becomes "integer_constant", and so on.
I assume you don't need any help with this.
@@ -387,15 +398,11 @@ object of this assignment and set it as statement's result object using the
special make function. Do the same for rule primary.
method statement($/) {
- make $( $<assignment> );
+ make $<assignment>.ast;
}
-Note that at this point, the rule statement doesn't define different #= keys
-for each type of statement, so we don't declare a parameter C<$key>. This will
-be changed later.
-
method primary($/) {
- make $( $<identifier> );
+ make $<identifier>.ast;
}
=item 3
@@ -417,8 +424,8 @@ expression to the primary. (Check out pdd26 for C<PAST::Op> node types, and
find out how you do such a binding).
method assignment($/) {
- my $lhs := $( $<primary> );
- my $rhs := $( $<expression> );
+ my $lhs := $<primary>.ast;
+ my $rhs := $<EXPR>.ast;
$lhs.lvalue(1);
make PAST::Op.new( $lhs, $rhs, :pasttype('bind'), :node($/) );
}
@@ -427,6 +434,15 @@ Note that we set the lvalue flag on $lhs. See PDD26 for details on this flag.
=item 5
+Write the action method for stat_or_def. Simply retrieve the result object from statement and make
+that the result object.
+
+ method stat_or_def($/) {
+ make $<statement>.ast;
+ }
+
+=item 6
+
Run your compiler on a script or in interactive mode. Use the target option to
see what PIR is being generated on the input "x = 42".
177 examples/languages/squaak/doc/tutorial_episode_4.pod
@@ -60,39 +60,37 @@ if-statements and throw-statements.
The first statement we're going to implement now is the if-statement. An
if-statement has typically three parts (but this of course depends on the
programming language): a conditional expression, a "then" part and an "else"
-part. Implementing this in Perl 6 rules and PAST is almost trivial:
-
- rule if_statement {
- 'if' <expression> 'then' <block>
+part. Implementing this in Perl 6 rules and PAST is almost trivial, but first, let's add a little
+infrastructure to simplify adding new statement types. Replace the statement rule with the
+following:
+ proto rule statement { <...> }
+
+Delete the statement method from Actions.pm, and rename the assignment rule in both Grammar.pm and
+Actions.pm to statement:sym<assignment>. The new statement rule is a "proto" rule. A proto rule is
+equivalent to a normal rule whose body contains each specialization of the rule separated by the |
+operator. The name of a particular specialization of a proto rule is placed between the angle
+brackets. Within the body of the rule, it can be matched literally with <sym>.
+
+ rule statement:sym<if> {
+ <sym> <EXPR> 'then' $<then>=<block>
['else' $<else>=<block> ]?
'end'
- {*}
}
rule block {
<statement>*
- {*}
- }
-
- rule statement {
- | <assignment> {*} #= assignment
- | <if_statement> {*} #= if_statement
}
-Note that the optional else block is stored in the match object's "else" field.
+Note that the optional else block is stored in the match object's "else" field, and the then block
+is stored in the match object's "then" field.
If we hadn't written this $<else>= part, then <block> would have been an array,
with block[0] the "then" part, and block[1] the optional else part. Assigning
the optional else block to a different field makes the action method slightly
easier to read.
-Also note that the statement rule has been updated; a statement is now either
-an assignment or an if-statement. As a result, the action method statement now
-takes a key argument. The relevant action methods are shown below:
-
- method statement($/, $key) {
- # get the field stored in $key from the $/ object,
- # and retrieve the result object from that field.
- make $( $/{$key} );
- }
+Note that the proto declaration for statement means that the result object for $<statement> in any
+rule which calls statement as a subrule will be the result object for whichever statement type matched.
+Because of this, we can delete the statement action method.
+The relevant action methods are shown below:
method block($/) {
# create a new block, set its type to 'immediate',
@@ -105,19 +103,18 @@ takes a key argument. The relevant action methods are shown below:
# for each statement, add the result
# object to the block
for $<statement> {
- $past.push( $( $_ ) );
+ $past.push($_.ast);
}
make $past;
}
- method if_statement($/) {
- my $cond := $( $<expression> );
- my $then := $( $<block> );
- my $past := PAST::Op.new( $cond, $then,
+ method statement:sym<if>($/) {
+ my $cond := $<EXPR>.ast;
+ my $past := PAST::Op.new( $cond, $<then>.ast,
:pasttype('if'),
:node($/) );
if $<else> {
- $past.push( $( $<else>[0] ) );
+ $past.push($<else>[0].ast);
}
make $past;
}
@@ -133,13 +130,12 @@ with the make function.
At this point it's wise to spend a few words on the make function, the parse
actions and how the whole PAST is created by the individual parse actions.
-Have another look at the action method if_statement. In the first two lines,
+Have another look at the action method statement:sym<if>. In the first two lines,
we request the result objects for the conditional expression and the "then"
block. When were these result objects created? How can we be sure they're there?
-The answer lies in the order in which the parse actions are executed. The
-special "{*}" symbol that triggers a parse action invocation, is usually placed
-at the end of the rule. For this input string: "if 42 then x = 1 end" this
-implies the following order:
+The answer lies in the order in which the parse actions are executed. The parse action invocation
+usually occurs at the end of the rule. For this input string: "if 42 then x = 1 end" this implies
+the following order:
=over 4
@@ -147,9 +143,9 @@ implies the following order:
=item 2. parse statement
-=item 3. parse if_statement
+=item 3. parse statement:sym<if>
-=item 4. parse expression
+=item 4. parse EXPR
=item 5. parse integer
@@ -159,7 +155,7 @@ implies the following order:
=item 8. parse statement
-=item 9. parse assignment
+=item 9. parse statement:sym<assignment>
=item 10. parse identifier
@@ -179,7 +175,7 @@ implies the following order:
=back
-As you can see, PAST nodes are created in the leafs of the parse tree first,
+As you can see, PAST nodes are created in the leaves of the parse tree first,
so that later, action methods higher in the parse tree can retrieve them.
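
The leaves-first ordering can be modelled in a few lines. This is a
hypothetical Python sketch, not PCT code: the tree below is a simplified,
hand-written guess at the parse tree for "if 42 then x = 1 end", and
C<run_actions>, C<TREE> and C<ORDER> are names invented for the illustration.
It only assumes what the text says, namely that a rule's action fires once the
rule has finished matching.

```python
def run_actions(node, fired):
    """Visit children first, then record this node's action (post-order)."""
    name, children = node
    for child in children:
        run_actions(child, fired)
    fired.append(name)  # the action runs at the end of the rule
    return fired


# Simplified, assumed parse tree for: if 42 then x = 1 end
TREE = ("TOP", [
    ("statement:sym<if>", [
        ("EXPR", [("integer", [])]),
        ("block", [
            ("statement:sym<assignment>", [
                ("primary", [("identifier", [])]),
                ("EXPR", [("integer", [])]),
            ]),
        ]),
    ]),
])

ORDER = run_actions(TREE, [])
```

In C<ORDER>, every node appears after all of its children, which is why an
action method can always call C<.ast> on its subrule matches.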
=head2 Throwing Exceptions
@@ -188,12 +184,11 @@ The grammar rule for the "throw" statement is really quite easy, but it's useful
to discuss the parse action, as it shows the use of generating custom PIR
instructions. First the grammar rule:
- rule throw_statement {
- 'throw' <expression>
- {*}
+ rule statement:sym<throw> {
+ <sym> <EXPR>
}
-I assume you know how to update the "statement" rule by now. The throw statement
+The throw statement
will compile down to Parrot's "throw" instruction, which takes one argument.
In order to generate a custom Parrot instruction, the instruction can be
specified in the C<:pirop> attribute when creating a C<PAST::Op> node. Any child
@@ -201,16 +196,15 @@ nodes are passed as arguments to this instruction, so we need to pass the result
object of the expression being thrown as a child of the C<PAST::Op> node
representing the "throw" instruction.
- method throw_statement($/) {
- make PAST::Op.new( $( $<expression> ),
- :pirop('throw'),
+ method statement:sym<throw>($/) {
+ make PAST::Op.new( $<EXPR>.ast,
+ :pirop('die'),
:node($/) );
}
-
=head2 What's Next?
-In this episode we implemented two more statement types of Squaak. You should
+In this episode we implemented two more Squaak statement types. You should
get a general idea of how and when PAST nodes are created, and how they can be
retrieved as sub (parse) trees. In the next episode we'll take a closer look at
variable scope and subroutines.
@@ -266,31 +260,29 @@ C<PAST::Op> nodes you should create.
The while-statement is straightforward:
- method while_statement($/) {
- my $cond := $( $<expression> );
- my $body := $( $<block> );
+ method statement:sym<while>($/) {
+ my $cond := $<EXPR>.ast;
+ my $body := $<block>.ast;
make PAST::Op.new( $cond, $body, :pasttype('while'), :node($/) );
}
The try-statement is a bit more complex. Here are the grammar rules and
action methods.
- rule try_statement {
- 'try' $<try>=<block>
+ rule statement:sym<try> {
+ <sym> $<try>=<block>
'catch' <exception>
$<catch>=<block>
'end'
- {*}
}
rule exception {
<identifier>
- {*}
}
- method try_statement($/) {
+ method statement:sym<try>($/) {
## get the try block
- my $try := $( $<try> );
+ my $try := $<try>.ast;
## create a new PAST::Stmts node for
## the catch block; note that no
@@ -299,12 +291,12 @@ action methods.
## exception object. For now this will
## do.
my $catch := PAST::Stmts.new( :node($/) );
- $catch.push( $( $<catch> ) );
+ $catch.push($<catch>.ast);
## get the exception identifier;
## set a declaration flag, the scope,
## and clear the viviself attribute.
- my $exc := $( $<exception> );
+ my $exc := $<exception>.ast;
$exc.isdecl(1);
$exc.scope('lexical');
$exc.viviself(0);
@@ -323,17 +315,10 @@ action methods.
}
method exception($/) {
- our $?BLOCK;
- my $past := $( $<identifier> );
- $?BLOCK.symbol( $past.name(), :scope('lexical') );
+ my $past := $<identifier>.ast;
make $past;
}
-Instead of putting "identifier" after the "catch" keyword, we made it a
-separate rule, with its own action method. This allows us to insert the
-identifier into the symbol table of the current block (the try-block),
-before the catch block is parsed.
-
First the PAST node for the try block is retrieved. Then, the catch block is
retrieved, and stored into a C<PAST::Stmts> node. This is needed, so that we
can make sure that the instructions that retrieve the exception object come
@@ -378,41 +363,53 @@ generated, and see if you can recognize which instructions make up the
conditional expression, which represent the "then" block, and which represent
the "else" block (if any).
+Note that this may not be the exact result produced when you try it. Sub ids, block numbers, and
+register numbers may differ, but it should be analogous.
+
> if 1 then else end
+ .HLL "squaak"
+
.namespace []
- .sub "_block16"
- new $P18, "Integer"
- assign $P18, 1
-
- ## this is the condition:
- if $P18, if_17
-
- ## this is invoking the else-block:
- get_global $P21, "_block19"
- newclosure $P21, $P21
- $P20 = $P21()
- set $P18, $P20
- goto if_17_end
-
- ## this is invoking the then-block:
- if_17:
- get_global $P24, "_block22"
- newclosure $P24, $P24
- $P23 = $P24()
- set $P18, $P23
- if_17_end:
- .return ($P18)
+ .sub "_block11" :anon :subid("10_1279319328.02043")
+ .annotate 'line', 0
+ .const 'Sub' $P20 = "12_1279319328.02043"
+ capture_lex $P20
+ .const 'Sub' $P17 = "11_1279319328.02043"
+ capture_lex $P17
+ .annotate 'line', 1
+ set $I15, 1
+ if $I15, if_14
+ .const 'Sub' $P20 = "12_1279319328.02043"
+ capture_lex $P20
+ $P21 = $P20()
+ set $P13, $P21
+ goto if_14_end
+ if_14:
+ .const 'Sub' $P17 = "11_1279319328.02043"
+ capture_lex $P17
+ $P18 = $P17()
+ set $P13, $P18
+ if_14_end:
+ .return ($P13)
.end
+
+ .HLL "squaak"
+
.namespace []
- .sub "_block22" :outer("_block16")
- .return ()
+ .sub "_block19" :anon :subid("12_1279319328.02043") :outer("10_1279319328.02043")
+ .annotate 'line', 1
+ .return ()
.end
+
+ .HLL "squaak"
+
.namespace []
- .sub "_block19" :outer("_block16")
- .return ()
+ .sub "_block16" :anon :subid("11_1279319328.02043") :outer("10_1279319328.02043")
+ .annotate 'line', 1
+ .return ()
.end
=back
232 examples/languages/squaak/doc/tutorial_episode_5.pod
@@ -30,9 +30,8 @@ For each individual scope, there's a separate symbol table.
Squaak has a so-called do-block statement, that is defined below.
- rule do_block {
- 'do' <block> 'end'
- {*}
+ rule statement:sym<do> {
+ <sym> <block> 'end'
}
Each do-block defines a new scope; local variables declared between the C<do>
@@ -94,9 +93,8 @@ The following is the grammar rule for variable declarations. This is a type of
statement, so I assume you know how to extend the statement rule to allow for
variable declarations.
- rule variable_declaration {
- 'var' <identifier> ['=' <expression>]?
- {*}
+ rule statement:sym<var> {
+ <sym> <identifier> ['=' <EXPR>]?
}
A local variable is declared using the C<var> keyword, and has an optional
@@ -104,9 +102,9 @@ initialization expression. If the latter is missing, the variable's value
defaults to the undefined value called "Undef". Let's see what the parse action
looks like:
- method variable_declaration($/) {
+ method statement:sym<var>($/) {
# get the PAST for the identifier
- my $past := $( $<identifier> );
+ my $past := $<identifier>.ast;
# this is a local (it's being defined)
$past.scope('lexical');
@@ -115,10 +113,10 @@ looks like:
$past.isdecl(1);
# check for the initialization expression
- if $<expression> {
+ if $<EXPR> {
# use the viviself clause to add a
# an initialization expression
- $past.viviself( $( $<expression>[0] );
+ $past.viviself($<EXPR>[0].ast);
}
else { # no initialization, default to "Undef"
$past.viviself('Undef');
@@ -152,46 +150,27 @@ are parsed (and their parse actions are executed -- these might need to enter
symbols in the block's symbol table), we add a few extra parse actions. Let's
take a look at them.
- rule TOP {
- {*} #= open
- <statement>*
- [ $ || <.panic: syntax error> ]
- {*} #= close
- }
+Add this token to the grammar:
-We now have two parse actions for TOP, which are differentiated by an
-additional key parameter. The first parse action is executed before any input
-is parsed, which is particularly suitable for any initialization actions you
-might need. The second action (which was already there) is executed after the
-whole input string is parsed. Now we can create a C<PAST::Block> node before
-any statements are parsed, so that when we need the current block, it's there
-(somewhere, later we'll see where exactly). Let's take a look at the parse
-action for TOP.
-
- method TOP($/, $key) {
- our $?BLOCK;
- our @?BLOCK;
-
- if $key eq 'open' {
- $?BLOCK := PAST::Block.new( :blocktype('declaration'),
- :node($/) );
+ token begin_TOP {
+ <?>
+ }
- @?BLOCK.unshift($?BLOCK);
- }
- else { # key is 'close'
- my $past := @?BLOCK.shift();
+It uses something we haven't seen before, <?>. The null pattern <?> always returns true without
+consuming any text. Tokens consisting of only <?> are frequently used to invoke additional action
+methods.
- for $<statement> {
- $past.push( $( $_ ) );
- }
+Add this method to Actions.pm:
- make $past;
- }
+ method begin_TOP ($/) {
+ our $?BLOCK := PAST::Block.new(:blocktype<declaration>, :node($/),
+ :hll<squaak>);
+ our @?BLOCK;
+ @?BLOCK.unshift($?BLOCK);
}
-Let's see what's happening here. When the parse action is invoked for the first
-time (when C<$key> equals "open"), a new C<PAST::Block> node is created and
-assigned to a strange-looking (if you don't know Perl, like me. Oh wait,
+We create a new C<PAST::Block> node and
+assign it to a strange-looking (if you don't know Perl, like me. Oh wait,
this is Perl. Never mind..) variable called C<$?BLOCK>. This variable is
declared as "our", which means that it is a package variable. This means that
the variable is shared by all methods in the same package (or class), and,
@@ -203,98 +182,61 @@ After that, this block is unshifted onto another funny-looking variable, called
C<@?BLOCK>. This variable has a "@" sigil, meaning this is an array. The
unshift method puts its argument on the front of the list. In a sense, you
could think of the front of this list as the top of a stack. Later we'll see
-why this stack is necessary.
-
-This C<@?BLOCK> variable is also declared with "our", meaning it's also
-package-scoped. However, as we call a method on this variable, it should have
-been already created; otherwise you'd invoke the method on an undefined
-("Undef") variable. So, this variable should have been created before the
-parsing starts. We can do this in the compiler's main program, squaak.pir.
-Before doing so, let's take a quick look at the "else" part of the parse action
+why this stack is necessary. This C<@?BLOCK> variable is also declared with "our", meaning it's also
+package-scoped. Since it's an array variable, it is automatically initialized with an empty
+ResizablePMCArray.
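
The unshift/shift discipline described above can be pictured with a small Python sketch. This is only an analogy and not part of the Squaak sources: a plain Python list stands in for the ResizablePMCArray, and the names C<Block>, C<scope_stack>, C<open_scope> and C<close_scope> are hypothetical.

```python
# Analogy only: a Python list stands in for the @?BLOCK array.
class Block:
    """Hypothetical stand-in for a PAST::Block node."""

    def __init__(self, blocktype):
        self.blocktype = blocktype
        self.children = []


scope_stack = []  # plays the role of @?BLOCK; starts out empty


def open_scope(blocktype):
    """Create a block and unshift it onto the front of the stack."""
    block = Block(blocktype)
    scope_stack.insert(0, block)  # unshift: the front is the top of the stack
    return block


def close_scope():
    """Shift the front block off; return it together with the new current block."""
    block = scope_stack.pop(0)  # shift
    current = scope_stack[0] if scope_stack else None
    return block, current


outer = open_scope("declaration")
inner = open_scope("immediate")
closed, current = close_scope()
```

Because both operations work on the front of the list, index 0 always holds the innermost open scope, which is why the front can be treated as the top of a stack.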
+
+Now we need to modify our TOP rule to call begin_TOP.
+
+ rule TOP {
+ <.begin_TOP>
+ <statementlist>
+ [ $ || <.panic: "Syntax error"> ]
+ }
+
+"<.begin_TOP>" is just like <begin_TOP>, calling the subrule begin_TOP, with one difference: The
+<.subrule> form does not capture. Normally, when you match a subrule <foo>, $<foo> on the match object
+is bound to the subrule's match result. With <.foo>, $<foo> is not bound.
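
The difference between a capturing and a non-capturing subrule call has a rough counterpart in Python's C<re> module, sketched below. This is an analogy under the assumption that a named group behaves like <foo> and a non-capturing group like <.foo>; it says nothing about how NQP-rx implements either form.

```python
import re

# (?P<foo>...) must match and is recorded under a name, roughly like <foo>.
capturing = re.compile(r"(?P<foo>[a-z]+)=(?P<value>\d+)")

# (?:...) must match as well, but nothing is recorded, roughly like <.foo>.
non_capturing = re.compile(r"(?:[a-z]+)=(?P<value>\d+)")

m1 = capturing.match("x=42")
m2 = non_capturing.match("x=42")

captured_names = sorted(m1.groupdict())    # names bound on the first match
uncaptured_names = sorted(m2.groupdict())  # names bound on the second match
```

Both patterns accept the same input; only the set of names available on the match object differs.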
+
+The parse action for begin_TOP is executed before any input
+is parsed, which is particularly suitable for any initialization actions you
+might need. The action for TOP is executed after the
+whole input string is parsed. Now we can create a C<PAST::Block> node before
+any statements are parsed, so that when we need the current block, it's there
+(somewhere, later we'll see where exactly). Let's take a look at the parse
+action for TOP.
+
+ method TOP($/) {
+ our @?BLOCK;
+ my $past := @?BLOCK.shift();
+ $past.push($<statementlist>.ast);
+ make $past;
+ }
+
+Let's take a quick look at the updated parse action
for TOP, which is executed after the whole input string is parsed. The
C<PAST::Block> node is retrieved from C<@?BLOCK>, which makes sense, as it was
created in the first part of the method and unshifted on C<@?BLOCK>. Now this
node can be used as the final result object of TOP. So, now we've seen how to
use the scope stack, let's have a look at its implementation.
-=head2 A List Class
-
-We'll implement the scope stack as a C<ResizablePMCArray> object. This is a
-built-in PMC type. However, this built-in PMC does not have any methods; in
-PIR it can only be used as an operand of the built-in shift and unshift
-instructions. In order to allow us to write this as method calls, we create a
-new subclass of ResizablePMCArray. The code below creates the new class and
-defines the methods we need.
-
- 1 .namespace []
-
- 2 .sub 'initlist' :anon :init :load
- 3 subclass $P0, 'ResizablePMCArray', 'List'
- 4 new $P1, 'List'
- 5 set_hll_global ['Squaak';'Grammar';'Actions'], '@?BLOCK', $P1
- 6 .end
-
- 7 .namespace ['List']
-
- 8 .sub 'unshift' :method
- 9 .param pmc obj
- 10 unshift self, obj
- 11 .end
-
- 12 .sub 'shift' :method
- 13 shift $P0, self
- 14 .return ($P0)
- 15 .end
-
-Well, here you have it: part of the small amount of PIR code you need to write
-for the Squaak compiler (there's some more for some built-in subroutines, more
-on that later). Let's discuss this code snippet in more detail (if you know
-PIR, you could skip this section).
-Line 1 resets the namespace to the root namespace in Parrot, so that the sub
-C<initlist> is stored in that namespace. The sub 'initlist' defined in lines
-2-6 has some flags: C<:anon> means that the sub is not stored by name in the
-namespace, implying it cannot be looked up by name. The :init flag means that
-the sub is executed before the main program (the "main" sub) is executed. The
-C<:load> flag makes sure that the sub is executed if this file was compiled and
-loaded by another file through the load_bytecode instruction. If you don't
-understand this, no worries. You can forget about it now. In any case, we know
-for sure there's a List class when we need it, because the class creation is
-done before running the actual compiler code.
-Line 3 creates a new subclass of ResizablePMCArray, called "List". This results
-in a new class object, which is left in register $P0, but it's not used after
-that.
-Line 4 creates a new List object, and stores it in register $P1. Line 5,
-stores this List object by name of C<@?BLOCK> (that name should ring a bell
-now...) in the namespace of the Actions class. The semicolons in between the
-several key strings indicate nested namespaces. So, lines 4 and 5 are important,
-because the create the @?BLOCK variable and store it in a place that can be
-accessed from the action methods in the Actions class.
-Lines 7-11 define the unshift method, which is a method in the "List" namespace.
-This means that it can be invoked as a method on a List object. As the sub is
-marked with the :method flag, the sub has an implicit first parameter called
-"self", which refers to the invocant object. The unshift method invokes
-Parrot's unshift instruction on self, passing the obj argument as the second
-operand. So, obj is unshifted onto self, which is the List object itself.
-Finally, lines 12-15 define the "shift" method, which does the opposite of
-"unshift", removing the first element and returning it to its caller.
-
=head2 Storing Symbols
Now, we set up the necessary infrastructure to store the current scope block,
and we created a datastructure that acts as a scope stack, which we will need
-later. We'll now go back to the parse action for variable_declaration, because
+later. We'll now go back to the parse action for statement:sym<var>, because
we didn't enter the declared variable into the current block's symbol table yet.
We'll see how to do that now.
First, we need to make the current block accessible from the method
-variable_declaration. We've already seen how to do that, using the "our"
+statement:sym<var>. We've already seen how to do that, using the "our"
keyword. It doesn't really matter where in the action method we enter the
symbol's name into the symbol table, but let's do it at the end, after the
initialization stuff. Naturally, we're only going to enter the symbol if it's
not there already; duplicate variable declarations (in the same scope) should
result in an error message (using the panic method of the match object).
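
In language-neutral terms, the check amounts to a lookup followed by either an error or an insertion. The Python sketch below illustrates that logic only; the C<Block> class, its C<symbol> method and the C<DuplicateSymbol> exception are hypothetical stand-ins for C<PAST::Block> and the panic method.

```python
class DuplicateSymbol(Exception):
    """Raised where the tutorial would call panic."""


class Block:
    """Hypothetical stand-in for the symbol table of a PAST::Block."""

    def __init__(self):
        self._symbols = {}

    def symbol(self, name, **attrs):
        """With attributes: store them. Always return the entry, or None if absent."""
        if attrs:
            self._symbols[name] = attrs
        return self._symbols.get(name)


def declare(block, name):
    """Enter name into the block's symbol table unless it is already there."""
    if block.symbol(name):
        raise DuplicateSymbol(f"Error: symbol {name} was already defined.")
    block.symbol(name, scope="lexical")


block = Block()
declare(block, "x")
try:
    declare(block, "x")
    second_declaration_failed = False
except DuplicateSymbol:
    second_declaration_failed = True
```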
-The code to be added to the method variable_declaration looks then like this:
+The code to be added to the method statement:sym<var> looks then like this:
- method variable_declaration($/) {
+ method statement:sym<var>($/) {
our $?BLOCK;
# get the PAST node for identifier
# set the scope and declaration flag
@@ -304,7 +246,7 @@ The code to be added to the method variable_declaration looks then like this:
if $?BLOCK.symbol( $name ) {
# symbol is already present
- $/.panic("Error: symbol " ~ $name ~ " was already defined.\n");
+ $/.CURSOR.panic("Error: symbol " ~ $name ~ " was already defined.\n");
}
else {
$?BLOCK.symbol( $name, :scope('lexical') );
@@ -329,14 +271,15 @@ programming language. Hope to catch you later!
=item *
In this episode, we changed the action method for the C<TOP> rule; it is now
-invoked twice, once at the beginning of the parse, once at the end of the parse.
+preceded by the new begin_TOP action, which is invoked at the beginning of the parse.
The block rule, which defines a block to be a series of statements, represents
a new scope. This rule is used in for instance if-statement
(the then-part and else-part), while-statement (the loop body) and others.
-Update the parse action for block so it is invoked twice; once before parsing
-the statements, during which a new C<PAST::Block> is created and stored onto the
-scope stack, and once after parsing the statements, during which this PAST node
-is set as the result object. Make sure C<$?BLOCK> is always pointing to the
+Add a new begin_block rule consisting of <?>; in its action method, create a new PAST::Block and
+store it on the scope stack.
+Update the rule for block so that it calls begin_block before parsing
+the statements. Then update the parse action for block, which runs after the statements are parsed,
+so that it sets this PAST node as the result object. Make sure C<$?BLOCK> is always pointing to the
current block. In order to do this exercise correctly, you should understand
well what the shift and unshift methods do, and why we didn't implement methods
to push and pop, which are more appropriate words in the context of a (scope)
@@ -385,24 +328,35 @@ or at the back.
I hope it's clear what I mean here... otherwise, have a look at the code,
and try to figure out what's happening:
+ # In src/Squaak/Grammar.pm
+ token begin_block {
+ <?>
+ }
+
+ rule block {
+ <.begin_block>
+ <statement>*
+ }
+
+ # In src/Squaak/Actions.pm
+ method begin_block ($/) {
+ our $?BLOCK;
+ our @?BLOCK;
+ $?BLOCK := PAST::Block.new(:blocktype('immediate'),
+ :node($/));
+ @?BLOCK.unshift($?BLOCK);
+ }
+
method block($/, $key) {
- our $?BLOCK;
- our @?BLOCK;
- if $key eq 'open' {
- $?BLOCK := PAST::Block.new(
- :blocktype('immediate'),
- :node($/) );
- @?BLOCK.unshift($?BLOCK);
- }
- else {
- my $past := @?BLOCK.shift();
- $?BLOCK := @?BLOCK[0];
-
- for $<statement> {
- $past.push( $( $_ ) );
- }
- make $past;
+ our $?BLOCK;
+ our @?BLOCK;
+ my $past := @?BLOCK.shift();
+ $?BLOCK := @?BLOCK[0];
+
+ for $<statement> {
+ $past.push($_.ast);
}
+ make $past;
}
=cut
115 examples/languages/squaak/doc/tutorial_episode_6.pod
@@ -77,14 +77,26 @@ for subroutine definitions:
'sub' <identifier> <parameters>
<statement>*
'end'
- {*}
}
rule parameters {
- '(' [<identifier> [',' <identifier>]* ]? ')'
- {*}
+ '(' [<identifier> ** ',']? ')'
}
+And we need to add it to rule stat_or_def:
+
+ rule stat_or_def {
+ | <statement>
+ | <sub_definition>
+ }
+
+Appropriately modifying the action method is simple. It's analogous to the action method for
+expression.
+
+"**" is the repetition specifier; "<identifier> ** ','" matches <identifier> separated by commas.
+Since it's in a rule and there is space between the ** and its operands, whitespace is allowed
+between the commas and both the preceding and following identifiers.
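
As a rough illustration of what a separated repetition accepts, the Python sketch below expresses "zero or more identifiers separated by commas, with optional whitespace" as a regular expression. The identifier pattern and the function name C<parse_parameters> are assumptions made for this example; Squaak's real identifier rule may differ.

```python
import re

# Assumed identifier shape, for illustration only.
IDENT = r"[A-Za-z_]\w*"

# One identifier followed by zero or more ", identifier" groups, all optional.
# The \s* parts play the role of the whitespace a rule allows between tokens.
PARAMS = re.compile(rf"\(\s*(?:{IDENT}(?:\s*,\s*{IDENT})*)?\s*\)")


def parse_parameters(text):
    """Return the identifiers in text, or None if it is not a parameter list."""
    if not PARAMS.fullmatch(text):
        return None
    return re.findall(IDENT, text)
```

Note that a trailing comma, as in C<(a,)>, is rejected: the separator is only accepted between two items.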
+
This is rather straightforward, and the action methods for these rules are
quite simple, as you will see. First, however, let's have a look at the rule
for sub definitions. Why is the sub body defined as <statement>* and not as a
@@ -112,7 +124,7 @@ symbols right in time. Let's look at the action methods.
# now add all parameters to this block
for $<identifier> {
- my $param := $( $_ );
+ my $param := $_.ast;
$param.scope('parameter');
$past.push($param);
@@ -128,23 +140,23 @@ symbols right in time. Let's look at the action methods.
}
method sub_definition($/) {
- our $?BLOCK;
- our @?BLOCK;
- my $past := $( $<parameters> );
- my $name := $( $<identifier> );
-
- # set the sub's name
- $past.name( $name.name() );
-
- # add all statements to the sub's body
- for $<statement> {
- $past.push( $( $_ ) );
- }
-
- # and remove the block from the scope stack and restore the current block
- @?BLOCK.shift();
- $?BLOCK := @?BLOCK[0];
- make $past;
+ our $?BLOCK;
+ our @?BLOCK;
+ my $past := $<parameters>.ast;
+ my $name := $<identifier>.ast;
+
+ # set the sub's name
+ $past.name($name.name);
+
+ # add all statements to the sub's body
+ for $<statement> {
+ $past.push($_.ast);
+ }
+
+ # and remove the block from the scope stack and restore the current block
+ @?BLOCK.shift();
+ $?BLOCK := @?BLOCK[0];
+ make $past;
}
First, let's check out the parse action for parameters. First, a new
@@ -178,26 +190,29 @@ Episode 5, we already gave some tips on how to create the PAST nodes for a
subroutine invocation. In this section, we'll give a complete description.
First we'll introduce the grammar rules.
- rule sub_call {
+ rule statement:sym<sub_call> {
<primary> <arguments>
- {*}
+ }
+
+ rule arguments {
+ '(' [<EXPR> ** ',']? ')'
}
Not only does this allow you to invoke subroutines by their name, you can also store
the subroutines in an array or hash field, and invoke them from there. Let's
take a look at the action method, which is really quite straightforward.
- method sub_call($/) {
- my $invocant := $( $<primary> );
- my $past := $( $<arguments> );
+ method statement:sym<sub_call>($/) {
+ my $invocant := $<primary>.ast;
+ my $past := $<arguments>.ast;
$past.unshift($invocant);
make $past;
}
method arguments($/) {
my $past := PAST::Op.new( :pasttype('call'), :node($/) );
- for $<expression> {
- $past.push( $( $_ ) );
+ for $<EXPR> {
+ $past.push($_.ast);
}
make $past;
}
@@ -260,20 +275,17 @@ First, let us look at the BNF of the for-statement:
It's pretty easy to convert this to Perl 6 rules:
- rule for_statement {
- 'for' <for_init> ',' <expression> <step>?
+ rule statement:sym<for> {
+ <sym> <for_init> ',' <EXPR> <step>?
'do' <statement>* 'end'
- {*}
}
rule step {
- ',' <expression>
- {*}
+ ',' <EXPR>
}
rule for_init {
- 'var' <identifier> '=' <expression>
- {*}
+ 'var' <identifier> '=' <EXPR>
}
Pretty easy huh? Let's take a look at the semantics. A for-loop is just
@@ -323,12 +335,12 @@ which is local to the for-statement. Let's check out the rule for for_init:
:node($/) );
@?BLOCK.unshift($?BLOCK);
- my $iter := $( $<identifier> );
+ my $iter := $<identifier>.ast;
## set a flag that this identifier is being declared
$iter.isdecl(1);
$iter.scope('lexical');
## the identifier is initialized with this expression
- $iter.viviself( $( $<expression> ) );
+ $iter.viviself( $<EXPR>.ast );
## enter the loop variable into the symbol table.
$?BLOCK.symbol($iter.name(), :scope('lexical'));
@@ -339,16 +351,22 @@ which is local to the for-statement. Let's check out the rule for for_init:
So, just as we created a new C<PAST::Block> for the subroutine in the action
method for parameters, we create a new C<PAST::Block> for the for-statement in
the action method that defines the loop variable. (Guess why we made for-init
-a subrule, and didn't put in "C<var> <ident&gt = <expression>" in the rule of
+a subrule, and didn't put in "C<var> <ident> = <EXPR>" in the rule of
for-statement). This block is the place to live for the loop variable. The
loop variable is declared, initialized using the viviself attribute, and
entered into the new block's symbol table. Note that after creating the new
C<PAST::Block> object, we put it onto the stack scope.
+The action method for step is simple:
+
+ method step($/) {
+ make $<EXPR>.ast;
+ }
+
Now, the action method for the for statement is quite long, so I'll just
embed my comments, which makes reading it easier.
- method for_statement($/) {
+ method statement:sym<for>($/) {
our $?BLOCK;
our @?BLOCK;
@@ -356,7 +374,7 @@ First, get the result object of the for statement initialization rule; this
is the C<PAST::Var> object, representing the declaration and initialization
of the loop variable.
- my $init := $( $<for_init> );
+ my $init := $<for_init>.ast;
Then, create a new node for the loop variable. Yes, another one (besides the
one that is currently contained in the C<PAST::Block>). This one is used when
@@ -381,7 +399,7 @@ statement PAST nodes onto it.
my $body := @?BLOCK.shift();
$?BLOCK := @?BLOCK[0];
for $<statement> {
- $body.push($($_));
+ $body.push($_.ast);
}
If there was a step, we use that value; otherwise, we assume a default
@@ -392,8 +410,9 @@ exercise to the reader.
my $step;
if $<step> {
- my $stepsize := $( $<step>[0] );
- $step := PAST::Op.new( $iter, $stepsize, :pirop('add'), :node($/) );
+ my $stepsize := $<step>[0].ast;
+ $step := PAST::Op.new( $iter, $stepsize,
+ :pirop('add__OP+'), :node($/) );
}
else { ## default is increment by 1
$step := PAST::Op.new( $iter, :pirop('inc'), :node($/) );
@@ -404,13 +423,13 @@ incrementing statement to $body.
$body.push($step);
-The loop condition uses the "<=" operator, and compares the loop variable
-with the maximum value that was specified.
+The loop condition uses the isle opcode, which checks that its first operand is less than or equal
+to its second, and compares the loop variable with the maximum value that was specified.
## while loop iterator <= end-expression
- my $cond := PAST::Op.new( $iter,
- $( $<expression> ),
- :name('infix:<=') );
+ my $cond := PAST::Op.new( :pirop<isle__IPP>,
+ $iter,
+ $<EXPR>.ast );
Now we have the PAST for the loop condition and the loop body, so now create
a PAST to represent the (while) loop.
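
The pieces assembled so far (initialization, condition, body and step) correspond to an ordinary while loop. The Python sketch below shows that correspondence; it assumes a positive step, and the function name C<run_for> is hypothetical.

```python
def run_for(start, limit, body, step=1):
    """Call body(i) for i = start, start + step, ... for as long as i <= limit."""
    i = start             # for_init: declare and initialize the loop variable
    while i <= limit:     # condition: the "less than or equal" test on the loop variable
        body(i)           # the statements of the loop body
        i = i + step      # the step, appended to the end of the body


seen = []
run_for(1, 10, seen.append, step=3)

defaulted = []
run_for(1, 3, defaulted.append)  # no step given, so the increment is 1
```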
288 examples/languages/squaak/doc/tutorial_episode_7.pod
@@ -13,7 +13,7 @@ expressions.
=head2 Operators, precedence and parse trees
We will first briefly introduce the problem with recursive-descent parsers
-(which parsers generated with the PCT are) when parsing expressions. Consider
+(which parsers generated with NQP are) when parsing expressions. Consider
the following mini-grammar, which is a very basic calculator.
rule TOP {
@@ -116,29 +116,39 @@ into account. However, it's been about 6 years that I did this in a CS class,
and I don't remember the particular details. If you really want to know, check
out the links at the end of the previous section. It's actually worth checking
out. For now, I'll just assume you know what the problem is, so that I'll
-introduce the solution for PCT-based compilers immediately.
+introduce the solution for NQP-based compilers immediately.
At some point when parsing your input, you might encounter an expression. At
this point, we'd like the parser to switch from top-down to bottom-up parsing.
-The Parrot Grammar Engine supports this, and is used as follows:
+NQP-rx supports this; it is used as follows:
- rule expression is optable { ... }
+ <EXPR>
-Note that we used the word C<expression> here, but you can name it anything.
-This declares that, whenever you need an expression, the bottom-up parser is
-activated. Of course, this "optable" must be populated with some operators that
-we need to be able to parse. This can be done by declaring operators as follows:
+Of course, the optable must be populated with the operators that
+we need to be able to parse, and it must be told what precedence and associativity they have. The
+easiest way to do this is by setting up precedence levels in an C<INIT> block:
- proto 'infix:*' is tighter('infix:+') { ... }
+ INIT {
+ Squaak::Grammar.O(':prec<t>, :assoc<left>', '%additive');
+ Squaak::Grammar.O(':prec<u>, :assoc<left>', '%multiplicative');
+ }
+
+In this C<INIT> block, we use the C<O> method of the grammar to set up two precedence levels: one
+for operators like addition (named C<%additive>), and one for operators like multiplication (named
+C<%multiplicative>). Each of them has a ":prec" value and an ":assoc" value. ":prec" determines the
+precedence. Lexicographically greater values indicate higher precedence, so C<%additive> operators,
+with a precedence value of "t", have lower precedence than C<%multiplicative> operators with a
+precedence value of "u". ":assoc" defines the associativity of the operators. If C<@> is a left
+associative operator, then 1 @ 2 @ 3 is equivalent to (1 @ 2) @ 3. However, if C<@> is right
+associative, then 1 @ 2 @ 3 is equivalent to 1 @ (2 @ 3). There are other options for the
+associativity, but we'll discuss them as we come to them.
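
To make the effect of ":prec" and ":assoc" concrete, here is a small precedence-climbing parser in Python. It is a sketch of the general technique, not of how NQP-rx implements its operator precedence parser. The C<%additive> and C<%multiplicative> levels mirror the C<INIT> block above; the right-associative C<%exponent> level, its "w" value and the C<^> operator are invented for the example.

```python
import re

# Precedence values are compared as strings, so "u" binds tighter than "t".
LEVELS = {
    "%additive":       {"prec": "t", "assoc": "left"},
    "%multiplicative": {"prec": "u", "assoc": "left"},
    "%exponent":       {"prec": "w", "assoc": "right"},  # hypothetical level
}
OPERATORS = {
    "+": "%additive", "-": "%additive",
    "*": "%multiplicative", "/": "%multiplicative",
    "^": "%exponent",  # hypothetical operator
}


def tokenize(text):
    """Split the input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*/^()]", text)


def parse_term(tokens):
    """Parse an integer or a parenthesized expression."""
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return inner
    return int(token)


def parse(tokens, min_prec="", strict=False):
    """Return a nested (op, left, right) tuple for the expression in tokens."""
    left = parse_term(tokens)
    while tokens and tokens[0] in OPERATORS:
        level = LEVELS[OPERATORS[tokens[0]]]
        prec = level["prec"]
        # After a left-associative operator, an operator of the same level
        # must not nest on the right; after a right-associative one it may.
        too_loose = prec <= min_prec if strict else prec < min_prec
        if too_loose:
            break
        op = tokens.pop(0)
        right = parse(tokens, prec, strict=(level["assoc"] == "left"))
        left = (op, left, right)
    return left
```

Parsing C<1 - 2 - 3> with this sketch groups to the left, while C<2 ^ 3 ^ 2> groups to the right, matching the two cases described for C<@> above.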
+
+ token infix:sym<*> { <sym> <O('%multiplicative, :pirop<mul>')> }
This defines the operator C<*> (the C<infix:> is a prefix that tells the
operator parser that this operator is an infix operator; there are other types,
-such as prefix, postfix and others). The C<is tighter> clause tells that the
-C<*> operator has a higher precedence than the C<+> operator. As you could have
-guessed, there are other clauses to declare equivalent precedence (C<is equiv>)
-and lower precedence (C<is looser>).It is very important to spell all clauses,
-such as C<is equiv> correctly (for instance, not C<is equil>), otherwise you
-might get some cryptic error message when trying to run your compiler. See the
-references section for the optable guide, that has more details on this.
+such as prefix, postfix and others). As you can see, it uses the O rule to specify that it is part
+of the C<%multiplicative> group of operators. The ":pirop" value specifies that the operator should
+compile to the C&l