+ Compiled website

commit f0346a2c7da1cf3cf409fc5e1ed9c9fd4ba35a74 1 parent b8b4fa7
@kschiess authored
109 website/build/contribute.html
@@ -0,0 +1,109 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Contribute</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Contribute</h1><p>Find parslet to be really useful? Or just found a bug that is really ruining
+the day for you? Please contribute! Find the code on
+<a href="http://github.com/kschiess/parslet">github</a>.</p>
+<h2>Contact</h2>
+<p>Join us on <span class="caps">IRC</span> in #parslet.</p>
+<p>Discussion and patches (or the odd cry for &#8216;Help! How can I parse X?&#8217;) should
+go to our mailing list at
+<a href="mailto:ruby.parslet@librelist.com">ruby.parslet@librelist.com</a>. Just write a
+short message to that address and <a href="http://librelist.com">librelist</a> will
+subscribe you. An <span class="caps">NNTP</span>/web interface is available through
+<a href="http://dir.gmane.org/gmane.comp.lang.ruby.parslet">gmane.org</a>:
+<code>gmane.comp.lang.ruby.parslet</code>.</p>
+<h2>Bugs</h2>
+<p>Log in to github and open a bug ticket
+<a href="https://github.com/kschiess/parslet/issues">here</a>. Please be sure to include
+the version of parslet and Ruby; maybe you can even provide some code that
+exhibits the bug?</p>
+<p>And of course, if you provide a properly tested patch, you&#8217;ll be our hero and
+get a place in the space below for life.</p>
+<h2>Projects</h2>
+<p>Have you got a project that uses parslet? Please write
+<a href="mailto:kaspar.schiess@absurd.li">us</a> about it.</p>
+<p><a href="https://github.com/matthewd/capuchin"><strong>Capuchin</strong></a></p>
+<p>A JavaScript compiler that targets the Rubinius VM. (Matthew Draper)</p>
+<p><a href="https://github.com/kschiess/parslet/tree/master/example/"><strong>Examples</strong></a></p>
+<p>In here, you can find a parser for a Lisp-like language and much more.</p>
+<p><a href="https://github.com/postmodern/net-http-server"><strong>Net::<span class="caps">HTTP</span>::Server</strong></a></p>
+<p>A really small and elegant <span class="caps">HTTP</span> server written in Ruby. Think Webrick. Using
+parslet. (Postmodern)</p>
+<p><a href="https://github.com/undees/thnad"><strong>thnad</strong></a></p>
+<p>Thnad is a tiny programming language with so few features that it is not useful for anything at all &#8212; except showing how to write a compiler in half an hour.</p>
+<p><a href="https://github.com/kschiess/wt"><strong>wt</strong> aka working title</a></p>
+<p>A small parser that compiles to a PostScript file. This is mostly for demoing
+the various aspects of a parser.</p>
+<p><a href="https://github.com/rk/werd"><strong>Werd.rb</strong></a></p>
+<p>A variant of Chris Pound&#8217;s word generator written in Ruby, with some
+improvements. (Robert Kosek)</p>
+<p><a href="https://github.com/meh/versionub"><strong>versionub</strong></a></p>
+<p>A semantic version parser. (meh)</p>
+<h2>Thanks for all the fish &#8212; Contributions</h2>
+<ul>
+ <li><strong>rogerbraun</strong> (<a href="https://github.com/rogerbraun">rogerbraun</a>) for being my
+ unicode tester.</li>
+</ul>
+<ul>
+ <li><strong>meh</strong> (<a href="http://meh.paranoid.pk/">meh</a>) for taking a real close look.</li>
+</ul>
+<ul>
+ <li><strong>John Mettraux</strong> (<a href="http://jmettraux.wordpress.com/">jmettraux</a>) for the really
+ nice <span class="caps">JSON</span> example and for pushing parslet beyond its limits.</li>
+</ul>
+<ul>
+ <li><strong>Josep M. Bach</strong> (<a href="http://www.txustice.me/">txus</a>) for minding the small
+ things that make a big difference.</li>
+</ul>
+<ul>
+ <li><strong>Matthew Draper</strong> (<a href="http://matthewd.net/">matthewd</a>) for bothering with my
+ broken <span class="caps">CSS</span>.</li>
+</ul>
+<ul>
+ <li><strong>Hal Brodigan</strong> (<a href="http://postmodern.github.com/">postmodern</a>) for solving our
+ email parsing needs!</li>
+</ul>
+<ul>
+ <li><strong>R. Konstantin Haase</strong> (<a href="http://rkh.im/">rkh</a>) for rspec matchers that help
+ stamp out, eliminate and abolish redundancy.</li>
+</ul>
+<ul>
+ <li><strong>Florian Hanke</strong> (<a href="http://floere.github.com">floere</a>) has given a lot of very
+ inspiring input for parslet. His questions have been key to rounding off the
+ corners and making the library as aesthetic as it is. And just look at the
+ logo.</li>
+</ul>
+<ul>
+ <li><strong>Kaspar Schiess</strong> (<a href="http://www.absurd.li">absurd.li</a>) for being brave enough
+ to actually add another parser library to a field that&#8217;s already bursting
+ at the seams.</li>
+</ul></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
70 website/build/documentation.html
@@ -0,0 +1,70 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Documentation</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Documentation</h1><p><a href="get-started.html"><strong>Getting Started</strong></a></p>
+ <p>Are you brand new to parslet? Well then let&#8217;s introduce you guys. This is what
+ you should read and try out first.</p>
+ <p><a href="https://github.com/kschiess/parslet/tree/master/example/"><strong>Examples</strong></a></p>
+ <p>Parslet comes with a lot of examples that explain how to use various aspects.
+ Take a look at those.</p>
+ <p><a href="overview.html"><strong>In depth</strong></a></p>
+ <p>This is the real technical documentation, showing you how to use all aspects
+ of parslet. Especially:</p>
+ <ul>
+ <li><a href="overview.html">Overview</a> explains parslet&#8217;s goals and gives you a bigger
+ picture.</li>
+ <li>Using <a href="parser.html">Parslet::Parser</a> to <strong>write parsers</strong>.</li>
+ <li>Using <a href="transform.html">Parslet::Transform</a> to <strong>transmogrify your intermediary
+ trees</strong>.</li>
+ <li><a href="tricks.html">Tricks</a> for common situations.</li>
+ </ul>
+ <p><strong>Presentations</strong></p>
+ <ul>
+ <li><a href="https://docs.google.com/present/view?id=0AfXgUAUtzyc7ZGZrcG1mNXNfMzIwZ3JjY2c3NW0">Parslet, An Introduction</a> introduces parslet in a few poignant slides. (Bo Jeanes and David Pick)</li>
+ </ul>
+ <p><strong>Blogs</strong></p>
+ <ul>
+ <li><a href="http://florianhanke.com/blog/2011/02/01/parslet-intro.html">Parslet Intro</a>
+ explains quite a few things on how parsers work and on parser
+ metaprogramming. Besides, Florian Hanke also explains how to create an <span class="caps">ERB</span>
+ parser in just a few lines!</li>
+ </ul>
+ <ul>
+ <li><a href="http://jmettraux.wordpress.com/2011/05/11/parslet-and-json/">Parslet and
+ <span class="caps">JSON</span></a> shows how
+ to construct a <span class="caps">JSON</span> parser in a few lines.
+ <a href="http://jmettraux.wordpress.com/about/">John</a> does a great job of explaining
+ how parslet ties back in with railroad diagrams.</li>
+ </ul>
+ <p><a href="http://rubydoc.info/gems/parslet/frames"><strong><span class="caps">YARD</span> Class Documentation</strong></a></p>
+ <p>The <a href="http://rubydoc.info/gems/parslet/frames"><span class="caps">YARD</span> documentation</a> will help you
+ with the nitty gritty. This documentation is really important too. It will be
+ constantly improved! (Thanks linode.com and DockYard for sponsoring this tool.)</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
338 website/build/get-started.html
@@ -0,0 +1,338 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Get Started</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Get Started</h1><p>Together, let&#8217;s develop a small language that allows for simple computation.
+Here&#8217;s a valid input file for that language:</p>
+<pre class="sh_ruby"><code>
+ puts(1 + 2)
+ puts(4 + 2)
+</code></pre>
+<p>To install the parslet library, run:</p>
+<pre><code>
+ gem install parslet
+</code></pre>
+<p>Now let&#8217;s write the first part of our parser. For now, we&#8217;ll just recognize
+simple numbers like &#8216;1&#8217; or &#8216;42&#8217;.</p>
+<pre class="sh_ruby"><code title="mini_parser">
+ require 'parslet'
+
+ class Mini &lt; Parslet::Parser
+ rule(:integer) { match('[0-9]').repeat(1) }
+ root(:integer)
+ end
+
+ Mini.new.parse("132432") # =&gt; "132432"@0
+</code></pre>
+<p>The call to <code>parse</code> returns <code>"132432"@0</code>. Congratulations! You have just
+written your first parser. Running it on the input &#8216;<code>puts(1)</code>&#8217; will
+not work yet. Let&#8217;s see what happens in case of a failure:</p>
+<pre class="sh_ruby"><code>
+ Mini.new.parse("puts(1)") # raises Parslet::ParseFailed
+</code></pre>
+<p>Here&#8217;s the error message provided by that exception: &#8220;Expected at least 1 of
+[0-9] at line 1 char 1.&#8221; parslet tries to find a number there, but can&#8217;t find
+one.</p>
+<p>There are just two lines to the definition of this parser; let&#8217;s go through
+them:</p>
+<pre class="sh_ruby"><code>
+ rule(:integer) { match('[0-9]').repeat(1) }
+</code></pre>
+<p><code>rule</code> lets you create a new parser rule. Inside the block of that
+<code>:integer</code> rule, you find <code>match('[0-9]').repeat(1)</code>.
+This says: &#8220;match a character that is in the range <code>0-9</code>, then
+match any number of those, but at least match one.&#8221;</p>
+<pre class="sh_ruby"><code>
+ root(:integer)
+</code></pre>
+<p>That second line just says: Start parsing at the rule called
+<code>:integer</code>.</p>
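+<p>If you know regular expressions, a rough analogy may help. The sketch below is plain Ruby, not parslet, and the constant <code>DIGITS</code> is made up for illustration. It accepts the same inputs as the <code>:integer</code> rule, assuming that the whole input has to be consumed:</p>

```ruby
# Plain-Ruby analogy, not parslet: one or more characters from 0-9,
# anchored at both ends so that the whole input has to match.
DIGITS = /\A[0-9]{1,}\z/

puts DIGITS.match?("132432")   # digits only, at least one of them
puts DIGITS.match?("puts(1)")  # starts with something that is not a digit
puts DIGITS.match?("")         # zero digits is not enough
```

+<p>The <code>{1,}</code> part plays the role of <code>.repeat(1)</code>: at least one, with no upper bound.</p>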
+<h2>Addition</h2>
+<p>Let&#8217;s go for simple addition. We&#8217;ll have to allow for spaces in our input,
+since those help make code readable.</p>
+<pre class="sh_ruby"><code>
+ rule(:space) { match('\s').repeat(1) }
+ rule(:space?) { space.maybe }
+</code></pre>
+<p>Two things are new here, both in the second line:</p>
+<ul>
+ <li>you can use (&#8216;call&#8217;) other rules in your rules</li>
+ <li><code>.maybe</code>, the same as <code>.repeat(0,1)</code><sup class="footnote" id="fnr1"><a href="#fn1">1</a></sup>, indicating
+ that the thing before it is maybe present once in the input.</li>
+</ul>
+<p>Essentially, you can think about parslet rules as instructing Ruby to &#8220;parse
+this&#8221; and &#8220;parse that&#8221;. Calling other rules can be looked at in the same way;
+you tell Ruby to go off, parse that subrule and then come back with the
+results. This helps when thinking about rule recursion. For example, a
+self-recursive rule like this one will of course create an endless loop:</p>
+<pre class="sh_ruby"><code>
+ rule(:infinity) {
+ infinity &gt;&gt; str(';')
+ }
+</code></pre>
+<p>Even though infinity seems to be delimited by &#8216;;&#8217;, in reality, infinity is
+very long, especially towards the end. There is no way of knowing for the
+parser when to stop processing <code>infinity</code> and start reading
+semicolons. Ergo, we need to make sure we talk about concrete items that
+consume input first, and then do recursion. This way we ensure that our
+grammar terminates, since in a way, it is like a normal program.</p>
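+<p>To see why the order matters, here is a hand-rolled analogy in plain Ruby. It is not parslet code, and the methods <code>infinity</code>, <code>list</code> and <code>overflows?</code> are made up for this sketch. Each method stands in for a rule and returns the position after the text it consumed, or <code>nil</code> on failure. The first recurses before consuming anything; the second consumes an <code>a</code> first and only then recurses:</p>

```ruby
# Plain-Ruby analogy, not parslet.

# Like rule(:infinity): calls itself before looking at the input,
# so it never gets as far as checking for the semicolon.
def infinity(input, pos)
  pos = infinity(input, pos)
  pos && input[pos] == ';' ? pos + 1 : nil
end

# Consumes an 'a' first, then optionally a ';' followed by another list.
def list(input, pos)
  return nil unless input[pos] == 'a'
  pos += 1
  return pos unless input[pos] == ';'
  list(input, pos + 1) || pos
end

# Runs the left-recursive method and reports whether the stack ran out.
def overflows?
  infinity(';', 0)
  false
rescue SystemStackError
  true
end

puts overflows?        # the left-recursive version exhausts the stack
puts list('a;a;a', 0)  # position just after the last 'a'
```

+<p>In parslet terms, the second method would correspond roughly to a rule like <code>str('a') &gt;&gt; (str(';') &gt;&gt; list).maybe</code>.</p>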
+<p>Here&#8217;s the full parser:</p>
+<pre class="sh_ruby"><code title="full_parser">
+ class Mini &lt; Parslet::Parser
+ rule(:integer) { match('[0-9]').repeat(1) &gt;&gt; space? }
+
+ rule(:space) { match('\s').repeat(1) }
+ rule(:space?) { space.maybe }
+
+ rule(:operator) { match('[+]') &gt;&gt; space? }
+
+ rule(:sum) { integer &gt;&gt; operator &gt;&gt; expression }
+ rule(:expression) { sum | integer }
+
+ root :expression
+ end
+
+ def parse(str)
+ mini = Mini.new
+
+ mini.parse(str)
+ rescue Parslet::ParseFailed =&gt; failure
+ puts failure.cause.ascii_tree
+ end
+
+ parse "1 + 2 + 3" # =&gt; "1 + 2 + 3"@0
+ parse "a + 2" # fails, see below
+</code></pre>
+<p>As you can see, the parser got decorated with the <code>space?</code> idiom.
+Every atom of our language consumes the space right after it. This is a useful
+convention that makes top level rules (the important ones) look cleaner.</p>
+<p>Note also the addition of <code>:operator</code>, <code>:sum</code> and
+<code>:expression</code>. The runner code has been extended a bit, so as to
+throw nice explanations of what went wrong when a parse failure is
+encountered. Running the code on &#8216;<code>a + 2</code>&#8217; for example outputs:</p>
+<pre class="output">
+Expected one of [SUM, INTEGER] at line 1 char 1.
+|- Failed to match sequence (INTEGER OPERATOR EXPRESSION) at line 1 char 1.
+| `- Failed to match sequence ([0-9]{1, } SPACE?) at line 1 char 1.
+| `- Expected at least 1 of [0-9] at line 1 char 1.
+| `- Failed to match [0-9] at line 1 char 1.
+`- Failed to match sequence ([0-9]{1, } SPACE?) at line 1 char 1.
+ `- Expected at least 1 of [0-9] at line 1 char 1.
+ `- Failed to match [0-9] at line 1 char 1.
+</pre>
+<p>This is what parslet calls an <code>#error_tree</code>. Not only the output of
+your parser, but also its grammar is constructed like a tree. When things go
+wrong, every branch of the tree has its own reasons for not accepting a given
+input. The <code>#cause</code> method returns those reasons.</p>
+<p>Our grammar has essentially two branches, <code>SUM</code> and
+<code>INTEGER</code>. Can you see why all rules expect a number as the first
+character?</p>
+<h2>Tree output (and what to do about it)</h2>
+<p>But let&#8217;s leave the negative examples for a second: what happens if the parse
+succeeds? It turns out, not much:</p>
+<pre class="sh_ruby"><code>
+ parse "1 + 2 + 3" # =&gt; "1 + 2 + 3"@0
+</code></pre>
+<p>The only notable difference between input and output is that the output has an
+extra &#8216;@0&#8217; appended to it. This is related to line number tracking and will be
+explained later on (or you can skip ahead and look up
+<code>Parslet::Slice</code>).</p>
+<p>The code we now have parses the input successfully, but doesn&#8217;t do much else.
+Parslet hasn&#8217;t got its own opinion on what to do with your input. By default,
+it will just play it back to you. But parslet also provides a method of
+structuring its output:</p>
+<pre class="sh_ruby"><code title="output_samples">
+ # Without structure: just strings.
+ str('ooo').parse('ooo') # =&gt; "ooo"@0
+ str('o').repeat.parse('ooo') # =&gt; "ooo"@0
+
+ # Added structure: .as(...)
+ str('ooo').as(:ex1).parse('ooo') # =&gt; {:ex1=&gt;"ooo"@0}
+
+ long = str('o').as(:ex2a).repeat.as(:ex2b).parse('ooo')
+ long # =&gt; {:ex2b=&gt;[{:ex2a=&gt;"o"@0}, {:ex2a=&gt;"o"@1}, {:ex2a=&gt;"o"@2}]}
+</code></pre>
+<p>You get to name things the way you want! This is also free. Seriously: parslet
+requires you to add all the structure to its output. Annotate important parts
+of your grammar with <code>.as(:symbol)</code> and get back a tree-like
+structure composed of hashes (sequence), arrays (repetition) and strings (like
+we had initially).</p>
+<p>Once you start naming things, you&#8217;ll notice that what you don&#8217;t name
+disappears. Parslet assumes that <em>what you don&#8217;t name is unimportant</em>.</p>
+<pre class="sh_ruby"><code title="inline_parser">
+parser = str('a').as(:a) &gt;&gt; str(' ').maybe &gt;&gt;
+ str('+').as(:o) &gt;&gt; str(' ').maybe &gt;&gt;
+ str('b').as(:b)
+parser.parse('a + b') # =&gt; {:a=&gt;"a"@0, :o=&gt;"+"@2, :b=&gt;"b"@4}
+</code></pre>
+<p>Think of this like using a highlighter on your input: What is there not to
+like about neon yellow?</p>
+<h2>Making the parser complete</h2>
+<p>Let&#8217;s look at the complete parser definition that also allows for function
+calls:</p>
+<pre class="sh_ruby"><code title="full_parser">
+class MiniP &lt; Parslet::Parser
+ # Single character rules
+ rule(:lparen) { str('(') &gt;&gt; space? }
+ rule(:rparen) { str(')') &gt;&gt; space? }
+ rule(:comma) { str(',') &gt;&gt; space? }
+
+ rule(:space) { match('\s').repeat(1) }
+ rule(:space?) { space.maybe }
+
+ # Things
+ rule(:integer) { match('[0-9]').repeat(1).as(:int) &gt;&gt; space? }
+ rule(:identifier) { match['a-z'].repeat(1) }
+ rule(:operator) { match('[+]') &gt;&gt; space? }
+
+ # Grammar parts
+ rule(:sum) { integer.as(:left) &gt;&gt; operator.as(:op) &gt;&gt; expression.as(:right) }
+ rule(:arglist) { expression &gt;&gt; (comma &gt;&gt; expression).repeat }
+ rule(:funcall) { identifier.as(:funcall) &gt;&gt; lparen &gt;&gt; arglist.as(:arglist) &gt;&gt; rparen }
+
+ rule(:expression) { funcall | sum | integer }
+ root :expression
+end
+
+require 'pp'
+pp MiniP.new.parse("puts(1 + 2 + 3, 45)")
+</code></pre>
+<p>That&#8217;s really all there is to it &#8212; our language is a really simple language.
+When fed with a string like &#8216;<code>puts(1 + 2 + 3, 45)</code>&#8217;, our parser outputs
+the following:</p>
+<pre class="output">
+{:funcall=&gt;"puts"@0,
+ :arglist=&gt;
+ [{:left=&gt;{:int=&gt;"1"@5},
+ :op=&gt;"+ "@7,
+ :right=&gt;{:left=&gt;{:int=&gt;"2"@9}, :op=&gt;"+ "@11, :right=&gt;{:int=&gt;"3"@13}}},
+ {:int=&gt;"45"@16}]}
+</pre>
+<p>Parslet calls this the <em>intermediary tree</em>. There are three types of nodes in
+this tree:</p>
+<ul>
+ <li><strong>Hashes</strong>: a node that has named subtrees</li>
+ <li><strong>Arrays</strong>: a node storing a collection of sub-nodes</li>
+ <li><strong>Strings</strong> are the leaves, containing the <em>accepted source</em></li>
+</ul>
+<p>The format of this tree is easy to work with and to read. Here&#8217;s what the
+above tree would look like as a graphic:</p>
+<p><img src="images/ast.png" alt="" /></p>
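+<p>Because the tree is made of ordinary hashes and arrays, plain Ruby is enough to walk it. Here is a minimal sketch, assuming a hand-typed copy of the tree above in which plain strings stand in for parslet&#8217;s slices. <code>SAMPLE_TREE</code> and <code>leaves</code> are names invented here:</p>

```ruby
# Hand-typed copy of the intermediary tree; plain Strings stand in for
# the slices that parslet returns.
SAMPLE_TREE = {
  funcall: 'puts',
  arglist: [
    { left: { int: '1' }, op: '+ ',
      right: { left: { int: '2' }, op: '+ ', right: { int: '3' } } },
    { int: '45' }
  ]
}

# Collects every leaf in source order, handling the three node types.
def leaves(node)
  case node
  when Hash  then node.values.flat_map { |child| leaves(child) }
  when Array then node.flat_map { |child| leaves(child) }
  else [node.to_s]
  end
end

puts leaves(SAMPLE_TREE).join('|')
```

+<p>The parentheses and the comma do not show up among the leaves, since those parts of the grammar were never named.</p>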
+<h2>Where to go from here: An Interpreter</h2>
+<p>As nice as the format above is for printing and looking at &#8211; it may be
+difficult at times to get the information out of it again. Let&#8217;s look at how
+to transform the tree:</p>
+<pre class="sh_ruby"><code>
+class SimpleTransform &lt; Parslet::Transform
+ rule(funcall: 'puts', arglist: sequence(:args)) {
+ "puts(#{args.inspect})"
+ }
+ # ... other rules
+end
+
+tree = {funcall: 'puts', arglist: [1,2,3]}
+SimpleTransform.new.apply(tree) # =&gt; "puts([1, 2, 3])"
+</code></pre>
+<p>Transformation is an entire topic by itself; this will be covered in detail
+<a href="transform.html">later on</a>. To whet your appetite, let me just give you a few
+teasers:</p>
+<ul>
+ <li>Transformations match portions of your tree at any depth, replacing them
+ with whatever you decide.</li>
+ <li>In addition to <code>sequence(sym)</code>, there is also
+ <code>simple(sym)</code> and <code>subtree(sym)</code>. Those match simple
+ strings and entire subtrees respectively. Caution with the latter.</li>
+</ul>
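+<p>To make &#8220;at any depth&#8221; concrete, here is the idea in plain Ruby. This is only a sketch and not how <code>Parslet::Transform</code> is implemented; <code>rewrite</code> and <code>CALL_TREE</code> are hypothetical names. Children are rewritten first, then every hash of the form <code>{int: ...}</code> is replaced by an integer, wherever it sits in the tree:</p>

```ruby
# Plain-Ruby sketch of the idea, not Parslet::Transform itself.
def rewrite(node)
  case node
  when Array
    node.map { |child| rewrite(child) }
  when Hash
    # Rewrite the children first (Hash#transform_values needs Ruby 2.4+).
    rewritten = node.transform_values { |child| rewrite(child) }
    if rewritten.keys == [:int]
      Integer(rewritten[:int])   # a hash with only :int becomes a number
    else
      rewritten
    end
  else
    node                         # leaves are passed through untouched
  end
end

CALL_TREE = {
  funcall: 'puts',
  arglist: [{ int: '1' }, { left: { int: '2' }, op: '+', right: { int: '3' } }]
}

p rewrite(CALL_TREE)
```

+<p>Both the top-level argument and the two operands nested inside the sum end up as integers, while the surrounding hashes keep their shape.</p>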
+<p>Here&#8217;s how you would write a somewhat classical interpreter for our little
+language by using a transformation. Note that from this point on, there is
+not one way to go about this, but thousands; you are really free (and on
+your own):</p>
+<pre class="sh_ruby"><code title="putting it all together">
+class MiniP &lt; Parslet::Parser
+ # Single character rules
+ rule(:lparen) { str('(') &gt;&gt; space? }
+ rule(:rparen) { str(')') &gt;&gt; space? }
+ rule(:comma) { str(',') &gt;&gt; space? }
+
+ rule(:space) { match('\s').repeat(1) }
+ rule(:space?) { space.maybe }
+
+ # Things
+ rule(:integer) { match('[0-9]').repeat(1).as(:int) &gt;&gt; space? }
+ rule(:identifier) { match['a-z'].repeat(1) }
+ rule(:operator) { match('[+]') &gt;&gt; space? }
+
+ # Grammar parts
+ rule(:sum) {
+ integer.as(:left) &gt;&gt; operator.as(:op) &gt;&gt; expression.as(:right) }
+ rule(:arglist) { expression &gt;&gt; (comma &gt;&gt; expression).repeat }
+ rule(:funcall) {
+ identifier.as(:funcall) &gt;&gt; lparen &gt;&gt; arglist.as(:arglist) &gt;&gt; rparen }
+
+ rule(:expression) { funcall | sum | integer }
+ root :expression
+end
+
+class IntLit &lt; Struct.new(:int)
+ def eval; int.to_i; end
+end
+class Addition &lt; Struct.new(:left, :right)
+ def eval; left.eval + right.eval; end
+end
+class FunCall &lt; Struct.new(:name, :args)
+ def eval
+ p args.map { |s| s.eval }
+ end
+end
+
+class MiniT &lt; Parslet::Transform
+ rule(:int =&gt; simple(:int)) { IntLit.new(int) }
+ rule(
+ :left =&gt; simple(:left),
+ :right =&gt; simple(:right),
+ :op =&gt; '+') { Addition.new(left, right) }
+ rule(
+ :funcall =&gt; 'puts',
+ :arglist =&gt; subtree(:arglist)) { FunCall.new('puts', arglist) }
+end
+
+parser = MiniP.new
+transf = MiniT.new
+
+ast = transf.apply(
+ parser.parse(
+ 'puts(1,2,3, 4+5)'))
+
+ast.eval # =&gt; [1, 2, 3, 9]
+</code></pre>
+<p>That&#8217;s a bunch of code for printing <code>[1, 2, 3, 9]</code>. Welcome to the
+fantastic world of compiler and interpreter writing!</p>
+<p class="footnote" id="fn1"><a href="#fnr1"><sup>1</sup></a> As far as parsing goes. There is a subtle difference between
+<code>#repeat(0,1)</code> and <code>#maybe</code>. Can you figure it out?</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
BIN  website/build/images/ast.png
BIN  website/build/images/favicon1.ico
Binary file not shown
BIN  website/build/images/favicon2.ico
Binary file not shown
BIN  website/build/images/favicon3.ico
Binary file not shown
BIN  website/build/images/parsley_logo.png
59 website/build/index.html
@@ -0,0 +1,59 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -About</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>About</h1><pre class="sh_ruby"><code>
+ require 'parslet'
+ include Parslet
+
+ # Constructs a parser using a Parsing Expression Grammar
+ parser = str('"') &gt;&gt;
+ (
+ str('\\') &gt;&gt; any |
+ str('"').absnt? &gt;&gt; any
+ ).repeat.as(:string) &gt;&gt;
+ str('"')
+
+ result = parser.parse %Q("this is a valid string")
+ result # =&gt; {:string=&gt;"this is a valid string"@1}
+</code></pre>
+<p>A small Ruby library for constructing parsers in the
+<a href="http://en.wikipedia.org/wiki/Parsing_expression_grammar"><span class="caps">PEG</span></a> (Parsing
+Expression Grammar) fashion.</p>
+<p>Parslet makes developing complex parsers easy. It does so by</p>
+<ul>
+ <li>providing the best <strong>error reporting</strong> possible</li>
+ <li><strong>not generating</strong> reams of code for you to debug</li>
+</ul>
+<p>Parslet takes the long way around to make <strong>your job</strong> easier. It allows for
+incremental language construction. Often, you start out small, implementing
+the atoms of your language first; <em>parslet</em> takes pride in making this
+possible.</p>
+<p>Eager to try this out? <a href="get-started.html">Get started</a>!</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
44 website/build/install.html
@@ -0,0 +1,44 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Install</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Install</h1><p>Parslet is at version <em>1.4.0</em>.</p>
+<p><strong>Rubygems</strong></p>
+<pre>
+ gem install parslet
+</pre>
+<p><strong>Bundler</strong></p>
+<pre>
+ gem 'parslet', '~&gt; 1.3'
+</pre>
+<p>or if you want to track the edge:</p>
+<pre>
+ gem 'parslet', :git =&gt; 'git://github.com/kschiess/parslet.git'
+</pre></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
5 website/build/javascripts/sh_main.min.js
@@ -0,0 +1,5 @@
+/* Copyright (C) 2007, 2008 gnombat@users.sourceforge.net */
+/* License: http://shjs.sourceforge.net/doc/gplv3.html */
+
+
+if(!this.sh_languages){this.sh_languages={}}var sh_requests={};function sh_isEmailAddress(a){if(/^mailto:/.test(a)){return false}return a.indexOf("@")!==-1}function sh_setHref(b,c,d){var a=d.substring(b[c-2].pos,b[c-1].pos);if(a.length>=2&&a.charAt(0)==="<"&&a.charAt(a.length-1)===">"){a=a.substr(1,a.length-2)}if(sh_isEmailAddress(a)){a="mailto:"+a}b[c-2].node.href=a}function sh_konquerorExec(b){var a=[""];a.index=b.length;a.input=b;return a}function sh_highlightString(B,o){if(/Konqueror/.test(navigator.userAgent)){if(!o.konquered){for(var F=0;F<o.length;F++){for(var H=0;H<o[F].length;H++){var G=o[F][H][0];if(G.source==="$"){G.exec=sh_konquerorExec}}}o.konquered=true}}var N=document.createElement("a");var q=document.createElement("span");var A=[];var j=0;var n=[];var C=0;var k=null;var x=function(i,a){var p=i.length;if(p===0){return}if(!a){var Q=n.length;if(Q!==0){var r=n[Q-1];if(!r[3]){a=r[1]}}}if(k!==a){if(k){A[j++]={pos:C};if(k==="sh_url"){sh_setHref(A,j,B)}}if(a){var P;if(a==="sh_url"){P=N.cloneNode(false)}else{P=q.cloneNode(false)}P.className=a;A[j++]={node:P,pos:C}}}C+=p;k=a};var t=/\r\n|\r|\n/g;t.lastIndex=0;var d=B.length;while(C<d){var v=C;var l;var w;var h=t.exec(B);if(h===null){l=d;w=d}else{l=h.index;w=t.lastIndex}var g=B.substring(v,l);var M=[];for(;;){var I=C-v;var D;var y=n.length;if(y===0){D=0}else{D=n[y-1][2]}var O=o[D];var z=O.length;var m=M[D];if(!m){m=M[D]=[]}var E=null;var u=-1;for(var K=0;K<z;K++){var f;if(K<m.length&&(m[K]===null||I<=m[K].index)){f=m[K]}else{var c=O[K][0];c.lastIndex=I;f=c.exec(g);m[K]=f}if(f!==null&&(E===null||f.index<E.index)){E=f;u=K;if(f.index===I){break}}}if(E===null){x(g.substring(I),null);break}else{if(E.index>I){x(g.substring(I,E.index),null)}var e=O[u];var J=e[1];var b;if(J instanceof Array){for(var L=0;L<J.length;L++){b=E[L+1];x(b,J[L])}}else{b=E[0];x(b,J)}switch(e[2]){case -1:break;case -2:n.pop();break;case 
-3:n.length=0;break;default:n.push(e);break}}}if(k){A[j++]={pos:C};if(k==="sh_url"){sh_setHref(A,j,B)}k=null}C=w}return A}function sh_getClasses(d){var a=[];var b=d.className;if(b&&b.length>0){var e=b.split(" ");for(var c=0;c<e.length;c++){if(e[c].length>0){a.push(e[c])}}}return a}function sh_addClass(c,a){var d=sh_getClasses(c);for(var b=0;b<d.length;b++){if(a.toLowerCase()===d[b].toLowerCase()){return}}d.push(a);c.className=d.join(" ")}function sh_extractTagsFromNodeList(c,a){var f=c.length;for(var d=0;d<f;d++){var e=c.item(d);switch(e.nodeType){case 1:if(e.nodeName.toLowerCase()==="br"){var b;if(/MSIE/.test(navigator.userAgent)){b="\r"}else{b="\n"}a.text.push(b);a.pos++}else{a.tags.push({node:e.cloneNode(false),pos:a.pos});sh_extractTagsFromNodeList(e.childNodes,a);a.tags.push({pos:a.pos})}break;case 3:case 4:a.text.push(e.data);a.pos+=e.length;break}}}function sh_extractTags(c,b){var a={};a.text=[];a.tags=b;a.pos=0;sh_extractTagsFromNodeList(c.childNodes,a);return a.text.join("")}function sh_mergeTags(d,f){var a=d.length;if(a===0){return f}var c=f.length;if(c===0){return d}var i=[];var e=0;var b=0;while(e<a&&b<c){var h=d[e];var g=f[b];if(h.pos<=g.pos){i.push(h);e++}else{i.push(g);if(f[b+1].pos<=h.pos){b++;i.push(f[b]);b++}else{i.push({pos:h.pos});f[b]={node:g.node.cloneNode(false),pos:h.pos}}}}while(e<a){i.push(d[e]);e++}while(b<c){i.push(f[b]);b++}return i}function sh_insertTags(k,h){var g=document;var l=document.createDocumentFragment();var e=0;var d=k.length;var b=0;var j=h.length;var c=l;while(b<j||e<d){var i;var a;if(e<d){i=k[e];a=i.pos}else{a=j}if(a<=b){if(i.node){var f=i.node;c.appendChild(f);c=f}else{c=c.parentNode}e++}else{c.appendChild(g.createTextNode(h.substring(b,a)));b=a}}return l}function sh_highlightElement(d,g){sh_addClass(d,"sh_sourceCode");var c=[];var e=sh_extractTags(d,c);var f=sh_highlightString(e,g);var b=sh_mergeTags(c,f);var a=sh_insertTags(b,e);while(d.hasChildNodes()){d.removeChild(d.firstChild)}d.appendChild(a)}function 
sh_getXMLHttpRequest(){if(window.ActiveXObject){return new ActiveXObject("Msxml2.XMLHTTP")}else{if(window.XMLHttpRequest){return new XMLHttpRequest()}}throw"No XMLHttpRequest implementation available"}function sh_load(language,element,prefix,suffix){if(language in sh_requests){sh_requests[language].push(element);return}sh_requests[language]=[element];var request=sh_getXMLHttpRequest();var url=prefix+"sh_"+language+suffix;request.open("GET",url,true);request.onreadystatechange=function(){if(request.readyState===4){try{if(!request.status||request.status===200){eval(request.responseText);var elements=sh_requests[language];for(var i=0;i<elements.length;i++){sh_highlightElement(elements[i],sh_languages[language])}}else{throw"HTTP error: status "+request.status}}finally{request=null}}};request.send(null)}function sh_highlightDocument(g,k){var b=document.getElementsByTagName("pre");for(var e=0;e<b.length;e++){var f=b.item(e);var a=sh_getClasses(f);for(var c=0;c<a.length;c++){var h=a[c].toLowerCase();if(h==="sh_sourcecode"){continue}if(h.substr(0,3)==="sh_"){var d=h.substring(3);if(d in sh_languages){sh_highlightElement(f,sh_languages[d])}else{if(typeof(g)==="string"&&typeof(k)==="string"){sh_load(d,f,g,k)}else{throw'Found <pre> element with class="'+h+'", but no such language exists'}}break}}}};
1  website/build/javascripts/sh_ruby.min.js
@@ -0,0 +1 @@
+if(!this.sh_languages){this.sh_languages={}}sh_languages.ruby=[[[/\b(?:require)\b/g,"sh_preproc",-1],[/\b[+-]?(?:(?:0x[A-Fa-f0-9]+)|(?:(?:[\d]*\.)?[\d]+(?:[eE][+-]?[\d]+)?))u?(?:(?:int(?:8|16|32|64))|L)?\b/g,"sh_number",-1],[/"/g,"sh_string",1],[/'/g,"sh_string",2],[/</g,"sh_string",3],[/\/[^\n]*\//g,"sh_regexp",-1],[/(%r)(\{(?:\\\}|#\{[A-Za-z0-9]+\}|[^}])*\})/g,["sh_symbol","sh_regexp"],-1],[/\b(?:alias|begin|BEGIN|break|case|defined|do|else|elsif|end|END|ensure|for|if|in|include|loop|next|raise|redo|rescue|retry|return|super|then|undef|unless|until|when|while|yield|false|nil|self|true|__FILE__|__LINE__|and|not|or|def|class|module|catch|fail|load|throw)\b/g,"sh_keyword",-1],[/(?:^\=begin)/g,"sh_comment",4],[/(?:\$[#]?|@@|@)(?:[A-Za-z0-9_]+|'|\"|\/)/g,"sh_type",-1],[/[A-Za-z0-9]+(?:\?|!)/g,"sh_normal",-1],[/~|!|%|\^|\*|\(|\)|-|\+|=|\[|\]|\\|:|;|,|\.|\/|\?|&|<|>|\|/g,"sh_symbol",-1],[/(#)(\{)/g,["sh_symbol","sh_cbracket"],-1],[/#/g,"sh_comment",5],[/\{|\}/g,"sh_cbracket",-1]],[[/$/g,null,-2],[/\\(?:\\|")/g,null,-1],[/"/g,"sh_string",-2]],[[/$/g,null,-2],[/\\(?:\\|')/g,null,-1],[/'/g,"sh_string",-2]],[[/$/g,null,-2],[/>/g,"sh_string",-2]],[[/^(?:\=end)/g,"sh_comment",-2]],[[/$/g,null,-2]]];
87 website/build/overview.html
@@ -0,0 +1,87 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Overview</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Overview</h1><p>Parslet is a library with a clear philosophy: It makes parser writing easy and
+ testable. On top of that, it provides understandable error messages (&#8220;General
+ Protection Fault&#8221;) to you, the language writer. In extension, you will
+ hopefully manage to provide good error messages to your users. Together, we
+ can create a better world!</p>
+ <p>Traditional texts on the subject will have you write a compiler or interpreter
+ for a language in several stages:</p>
+ <ul>
+ <li>Parsing or Lexing/Parsing</li>
+ <li>Abstract Syntax Tree construction</li>
+ <li>Optimization and checking of the tree</li>
+ <li>Generation of code / Execution</li>
+ </ul>
+ <p>This library will be good for the first two stages only. After that, you&#8217;ll
+ be on your own.</p>
+ <p>The parsing step has been implemented by hundreds, if not thousands, of
+ clever people, so there is no lack of alternatives. Even in Ruby, you&#8217;ll have the choice
+ among a handful of libraries. There&#8217;s a distinction to make on this level as
+ well:</p>
+ <p><strong>LRk parsers and related fields</strong> Parsers in this class use a lexical analyzer
+ (a <em>lexer</em> ) to transform the input text into tokens (<em>tokenizing</em> ). They
+ then check if your stream of tokens has a corresponding tree that conforms to
+ the grammar. Most of these parsers allow grammars that are ambiguous and will
+ provide a mechanism for resolving the arising ambiguities. These are the
+ earliest parser generators and the most widely used (<em>yacc</em>, <em>bison</em>, &#8230;) &#8212;
+ chances are, you&#8217;ll have more than one of these libraries installed on your
+ system.</p>
+ <p><strong><span class="caps">PEG</span> or packrat parsers</strong> Parsers in this class are based on a slightly more
+ modern algorithm, which mirrors what one does when writing a parser by hand
+ in top-down fashion. This is what we programmers do all the time, methods calling
+ methods &#8211; and it&#8217;s also (roughly) what these parsers do to recognize input.
+ Left recursion is impossible to express in these grammars. No lexers are
+ required &#8211; lexical tokenizing and parsing are one single step. Ruby has
+ several implementations of this algorithm in library form:
+ <a href="http://github.com/nathansobo/treetop">Treetop</a>,
+ <a href="http://github.com/mjijackson/citrus">Citrus</a>,
+ <a href="http://wiki.github.com/luikore/rsec/">rsec</a> and of course
+ <a href="http://kschiess.github.com/parslet">Parslet</a>.</p>
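+ <p>To make &#8220;methods calling methods&#8221; concrete, here is a minimal hand-written
+ top-down recognizer in plain Ruby. It is only a sketch and says nothing about
+ how any of the libraries above are implemented: the class
+ <code>SumRecognizer</code>, its grammar and all method names are made up for
+ illustration. Note that there is no separate lexer; the rules read characters
+ directly.</p>

```ruby
# Plain-Ruby sketch of a hand-written top-down recognizer (no gem needed).
# Hypothetical grammar, for illustration only:
#   sum    := number ('+' number)*
#   number := digit+
class SumRecognizer
  def initialize(input)
    @input = input
    @position = 0
  end

  # Entry point: the root rule must match and consume the whole input.
  def recognize
    sum && @position == @input.length
  end

  private

  # One method per grammar rule; rules call each other like any other method.
  def sum
    return false unless number

    loop do
      saved = @position
      next if char('+') && number

      @position = saved # backtrack over a dangling '+'
      break
    end
    true
  end

  def number
    digits = 0
    digits += 1 while digit
    digits.positive?
  end

  # Works on raw characters, so no separate lexer is involved.
  def digit
    current = @input[@position]
    return false unless current && ('0'..'9').cover?(current)

    @position += 1
    true
  end

  def char(expected)
    return false unless @input[@position] == expected

    @position += 1
    true
  end
end

puts SumRecognizer.new('1+22+3').recognize # prints true
puts SumRecognizer.new('1+').recognize     # prints false
```

+ <p>Each grammar rule became one method, and backtracking is nothing more than
+ resetting a position. A rule whose first step is to call itself would recurse
+ forever, which is why left recursion is a problem for this style of parser.</p>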
+ <p>All of these generators are different in small ways; yet most implement common
+ patterns and provide almost identical APIs. We believe this is an error:
+ choice is good, but there should be visible attributes distinguishing the
+ choices.</p>
+ <p>Parslet is not like the others; in fact, it is radically different in some
+ key respects.</p>
+ <h2>Writing a language</h2>
+ <p>Whether you write a language for a configuration file or a new computer
+ language (the holy grail), the steps are always the same:</p>
+ <ol>
+ <li>Create a grammar: <em>What should be legal syntax?</em></li>
+ <li>Annotate the grammar: <em>What is important data?</em></li>
+ <li>Create a transformation: <em>How do I want to work with that data?</em></li>
+ </ol>
+ <p>The creation of grammars and the various concepts that are associated are treated
+ in <a href="parser.html">Parslet::Parser</a>.</p>
+ <p>Transformation of the resulting intermediary tree is treated in
+ <a href="transform.html">Parslet::Transform</a>.</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
233 website/build/parser.html
@@ -0,0 +1,233 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Parser construction</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Parser construction</h1><p>A parser is nothing more than a class that derives from
+<code>Parslet::Parser</code>. The simplest parser that one could write would
+look like this:</p>
+<pre class="sh_ruby"><code>
+ class SimpleParser &lt; Parslet::Parser
+ rule(:a_rule) { str('simple_parser') }
+ root(:a_rule)
+ end
+</code></pre>
+<p>The language recognized by this parser is simply the string &#8220;simple_parser&#8221;.
+Parser rules do look a lot like methods and are defined by</p>
+<pre class="sh_ruby"><code>
+ rule(name) { definition_block }
+</code></pre>
+<p>Behind the scenes, this defines a method that returns whatever the
+definition block returns.</p>
+<p>Every parser has a root. This designates where parsing should start. It is like
+an entry point to your parser. With a root defined like this:</p>
+<pre class="sh_ruby"><code>
+ root(:my_root)
+</code></pre>
+<p>you create a <code>#parse</code> method in your parser that will start parsing
+by calling the <code>#my_root</code> method. You&#8217;ll also have a <code>#root</code>
+(instance) method that is an alias of the root method. The following things are
+really one and the same:</p>
+<pre class="sh_ruby"><code>
+ SimpleParser.new.parse(string)
+ SimpleParser.new.root.parse(string)
+ SimpleParser.new.a_rule.parse(string)
+</code></pre>
+<p>Knowing these things gives you a lot of flexibility; I&#8217;ll explain why at the
+end of the chapter. For now, just let me point out that because all of this is
+Ruby, your favorite editor will syntax highlight parser code just fine.</p>
+<h2>Atoms: The inside of a parser</h2>
+<h3>Matching strings of characters</h3>
+<p>A parser is constructed from parser atoms (or parslets, hence the name). The
+atoms are what appear inside your rules (and maybe elsewhere). We&#8217;ve already
+encountered an atom, the string atom:</p>
+<pre class="sh_ruby"><code>
+ str('simple_parser')
+</code></pre>
+<p>This returns a <code>Parslet::Atoms::Str</code> instance. These parser atoms
+all derive from <code>Parslet::Atoms::Base</code> and have essentially just
+one method you can call: <code>#parse</code>. So this works:</p>
+<pre class="sh_ruby"><code title="parser atoms">
+ str('foobar').parse('foobar') # =&gt; "foobar"@0
+</code></pre>
+<p>The atoms are small parsers that can recognize languages and throw errors, just
+like real <code>Parslet::Parser</code> subclasses.</p>
+<h3>Matching character ranges</h3>
+<p>The second parser atom you will have to know about allows you to match
+character ranges:</p>
+<pre class="sh_ruby"><code>
+ match('[0-9a-f]')
+</code></pre>
+<p>The above atom would match the digits zero through nine and the letters &#8216;a&#8217;
+to &#8216;f&#8217; &#8211; yeah, you guessed right &#8211; hexadecimal digits, for example. The inside
+of such a match parslet is essentially a regular expression that matches
+a single character of input. Because we&#8217;ll be using ranges so much with
+<code>#match</code> and because typing (&#8216;[]&#8217;) is tiresome, here&#8217;s another way
+to write the above <code>#match</code> atom:</p>
+<pre class="sh_ruby"><code>
+ match['0-9a-f']
+</code></pre>
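+<p>As a rough illustration of &#8220;a regular expression that matches a single
+character&#8221;, the plain Ruby below builds such an expression by hand. The helper
+<code>char_class</code> is hypothetical and not part of parslet; it only shows
+what a character-class string like <code>'0-9a-f'</code> stands for.</p>

```ruby
# Hypothetical helper, not parslet API: wraps a character-class body in
# brackets and anchors it so that exactly one character must match.
def char_class(body)
  Regexp.new("\\A[#{body}]\\z")
end

hex_digit = char_class('0-9a-f')

puts hex_digit.match?('c')  # prints true
puts hex_digit.match?('g')  # prints false
puts hex_digit.match?('ab') # prints false, since two characters are not one
```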
+<p>Character matches are instances of <code>Parslet::Atoms::Re</code>. Here are
+some more examples of character ranges:</p>
+<pre class="sh_ruby"><code>
+ match['[:alnum:]'] # letters and numbers
+ match['\\n'] # newlines
+ match('\\w') # word characters
+ match('.') # any character
+</code></pre>
+<h3>The wild wild <code>#any</code></h3>
+<p>The last example above corresponds to the regular expression <code>/./</code> that matches
+any one character. There is a special atom for that:</p>
+<pre class="sh_ruby"><code>
+ any
+</code></pre>
+<h2>Composition of Atoms</h2>
+<p>These basic atoms can be composed to form complex grammars. The following
+few sections will tell you about the various ways atoms can be composed.</p>
+<h3>Simple Sequences</h3>
+<p>Match &#8216;foo&#8217; and then &#8216;bar&#8217;:</p>
+<pre class="sh_ruby"><code>
+ str('foo') &gt;&gt; str('bar') # same as str('foobar')
+</code></pre>
+<p>Sequences correspond to instances of the class
+<code>Parslet::Atoms::Sequence</code>.</p>
+<h3>Repetition and its Special Cases</h3>
+<p>To model atoms that can be repeated, you should use <code>#repeat</code>:</p>
+<pre class="sh_ruby"><code>
+ str('foo').repeat
+</code></pre>
+<p>This will allow foo to repeat any number of times, including zero. If you
+look at the signature for <code>#repeat</code> in <code>Parslet::Atoms::Base</code>,
+you&#8217;ll see that it has really two arguments: <em>min</em> and <em>max</em>. So the following
+code all makes sense:</p>
+<pre class="sh_ruby"><code>
+ str('foo').repeat(1) # match 'foo' at least once
+ str('foo').repeat(1,3) # at least once and at most 3 times
+ str('foo').repeat(0, nil) # the default: same as str('foo').repeat
+</code></pre>
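+<p>The meaning of <em>min</em> and <em>max</em> can be sketched in plain Ruby. The
+helper <code>repeat_count</code> below is hypothetical and is not how parslet
+implements <code>#repeat</code>: it counts how many times a word occurs back to
+back at the start of the input, stops at <em>max</em>, and reports failure as
+<code>nil</code> when fewer than <em>min</em> repetitions were found.</p>

```ruby
# Hypothetical helper, not parslet API. Returns the number of repetitions
# consumed, or nil if fewer than `min` were found. `max = nil` means no limit.
def repeat_count(word, input, min = 0, max = nil)
  raise ArgumentError, 'word must not be empty' if word.empty?

  count = 0
  position = 0
  while input[position, word.length] == word
    break if count == max # never true while max is nil

    count += 1
    position += word.length
  end
  count >= min ? count : nil
end

puts repeat_count('foo', 'foofoofoo')          # prints 3
puts repeat_count('foo', 'foofoofoofoo', 1, 3) # prints 3, the fourth 'foo' is left over
puts repeat_count('foo', 'bar').inspect        # prints 0, zero repetitions are allowed
puts repeat_count('foo', 'bar', 1).inspect     # prints nil, at least one was required
```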
+<p>Repetition has a special case that is used frequently: Matching something
+once or not at all can be achieved by <code>repeat(0,1)</code>, but also
+through the prettier:</p>
+<pre class="sh_ruby"><code>
+ str('foo').maybe # same as str('foo').repeat(0,1)
+</code></pre>
+<p>These all map to <code>Parslet::Atoms::Repetition</code>. Please note this
+little twist to <code>#maybe</code>:</p>
+<pre class="sh_ruby"><code title="maybes twist">
+ str('foo').maybe.as(:f).parse('') # =&gt; {:f=&gt;nil}
+ str('foo').repeat(0,1).as(:f).parse('') # =&gt; {:f=&gt;[]}
+</code></pre>
+<p>When it matches nothing, <code>#maybe</code> yields nil. This caters to the
+intuition that <code>foo.maybe</code> either gives me <code>foo</code> or
+nothing at all, not an empty array. But have it your way!</p>
+<h3>Alternation</h3>
+<p>The most important composition method for grammars is alternation. Without
+it, your grammars would only vary in the amount of things matched, but not
+in content. Here&#8217;s how this looks:</p>
+<pre class="sh_ruby"><code>
+ str('foo') | str('bar') # matches 'foo' OR 'bar'
+</code></pre>
+<p>This reads naturally as &#8220;&#8216;foo&#8217; or &#8216;bar&#8217;&#8221;.</p>
+<h3>Operator precedence</h3>
+<p>The operators we have chosen for parslet atom combination have the operator
+precedence that you would expect. No parentheses are needed to express
+alternation of sequences:</p>
+<pre class="sh_ruby"><code>
+ str('s') &gt;&gt; str('equence') |
+ str('se') &gt;&gt; str('quence')
+</code></pre>
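+<p>That grouping comes straight from Ruby, where <code>&gt;&gt;</code> binds more
+tightly than <code>|</code>. The plain-Ruby sketch below makes the grouping
+visible. <code>Node</code> is a hypothetical stand-in that only records how the
+operators were applied; it is not a parslet class and parses nothing.</p>

```ruby
# Hypothetical stand-in for a parser atom. It only records how Ruby groups
# the operators and does not parse anything.
class Node
  attr_reader :label

  def initialize(label)
    @label = label
  end

  # Sequence: written with the same operator the text uses.
  def >>(other)
    Node.new("(#{label} >> #{other.label})")
  end

  # Alternation.
  def |(other)
    Node.new("(#{label} | #{other.label})")
  end
end

a, b, c, d = %w[a b c d].map { |name| Node.new(name) }
grouped = a >> b | c >> d

puts grouped.label # prints ((a >> b) | (c >> d))
```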
+<h3>And more</h3>
+<p>Parslet atoms are not as pretty as Treetop atoms. There you go, we said it.
+However, there seems to be a different kind of aesthetic about them; they
+are pure Ruby and integrate well with the rest of your environment. Have a
+look at this:</p>
+<pre class="sh_ruby"><code>
+ # Also consumes the space after important things like ';' or ':'. Call this
+ # giving the character you want to match as argument:
+ #
+ # arg &gt;&gt; (spaced(',') &gt;&gt; arg).repeat
+ #
+ def spaced(character)
+ str(character) &gt;&gt; match["\s"]
+ end
+</code></pre>
+<p>or even this:</p>
+<pre class="sh_ruby"><code>
+ # Turns any atom into an expression that matches a left parenthesis, the
+ # atom and then a right parenthesis.
+ #
+ # bracketed(sum)
+ #
+ def bracketed(atom)
+ spaced('(') &gt;&gt; atom &gt;&gt; spaced(')')
+ end
+</code></pre>
+<p>You might say that because parslet is just plain old Ruby objects itself (<span class="caps">PORO</span>
+&#8482;), it allows for very tight code. Module inclusion, class inheritance, &#8230;
+all your tools should work well with parslet.</p>
+<h2>Tree construction</h2>
+<p>By default, parslet will just echo back to you the strings you feed into it.
+Parslet will not generate a parser for you and neither will it generate your
+abstract syntax tree for you. The method <code>#as(name)</code> allows you
+to specify exactly what you want your tree to look like:</p>
+<pre class="sh_ruby"><code title="using as">
+ str('foo').parse('foo') # =&gt; "foo"@0
+ str('foo').as(:bar).parse('foo') # =&gt; {:bar=&gt;"foo"@0}
+</code></pre>
+<p>So you think: <code>#as(name)</code> allows me to create a hash, big deal.
+That&#8217;s not all. You&#8217;ll notice that annotating everything that you want to keep
+in your grammar with <code>#as(name)</code> autocreates a sensible tree
+composed of hashes and arrays and strings. It&#8217;s really somewhat magic: Parslet
+has a set of clever rules that merge the annotated output from your atoms into
+a tree. Here are some more examples, with the atom on the left and the resulting
+tree (assuming a successful parse) on the right:</p>
+<pre class="sh_ruby"><code>
+ # Normal strings just map to strings
+ str('a').repeat "aaa"@0
+
+ # Arrays capture repetition of non-strings
+ str('a').repeat.as(:b) {:b=&gt;"aaa"@0}
+ str('a').as(:b).repeat [{:b=&gt;"a"@0}, {:b=&gt;"a"@1}, {:b=&gt;"a"@2}]
+
+ # Subtrees get merged - unlabeled strings discarded
+ str('a').as(:a) &gt;&gt; str('b').as(:b) {:a=&gt;"a"@0, :b=&gt;"b"@1}
+ str('a') &gt;&gt; str('b').as(:b) &gt;&gt; str('c') {:b=&gt;"b"@1}
+
+ # #maybe will return nil, not the empty array
+ # (first line: input 'a'; second line: empty input)
+ str('a').maybe.as(:a) {:a=&gt;"a"@0}
+ str('a').maybe.as(:a) {:a=&gt;nil}
+</code></pre>
+<h2>And more</h2>
+<p>Now you know exactly how to create parsers using Parslet. Your parsers
+will output intricate structures made of endless arrays, complex hashes and
+a few string leftovers. But your programming skills fail you when you try
+to put all this data to use. Selecting keys upon keys in hash after hash, you
+feel like a cockroach that has just read Kafka&#8217;s works. This is no fun. This
+is not what you signed up for.</p>
+<p>Time to introduce you to <a href="transform.html">Parslet::Transform</a> and its workings.</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
139 website/build/stylesheets/sh_whitengrey.css
@@ -0,0 +1,139 @@
+pre.sh_sourceCode {
+ background-color: #ffffff;
+ color: #696969;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_keyword {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_type {
+ color: #696969;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_string {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_regexp {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_specialchar {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_comment {
+ color: #1326a2;
+ font-weight: normal;
+ font-style: italic;
+}
+
+pre.sh_sourceCode .sh_number {
+ color: #bb00ff;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_preproc {
+ color: #470000;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_function {
+ color: #000000;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_url {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_date {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_time {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_file {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_ip {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_name {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_variable {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_oldfile {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_newfile {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_difflines {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_selector {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_property {
+ color: #696969;
+ font-weight: bold;
+ font-style: normal;
+}
+
+pre.sh_sourceCode .sh_value {
+ color: #008800;
+ font-weight: normal;
+ font-style: normal;
+}
+
83 website/build/stylesheets/site.css
@@ -0,0 +1,83 @@
+body {
+ font-size: 17px;
+ font-family: Helvetica;
+ background-color: white; }
+
+a {
+ color: black; }
+
+.main_menu {
+ font-size: 1.2em;
+ width: 50em;
+ margin-bottom: 1cm; }
+ .main_menu ul {
+ display: inline-block;
+ list-style-type: none;
+ font-size: 1.33em;
+ margin-left: -8em; }
+ .main_menu ul:after {
+ content: ".";
+ display: block;
+ height: 0;
+ clear: both;
+ visibility: hidden; }
+ * html .main_menu ul {
+ height: 1px; }
+ .main_menu li {
+ float: left;
+ font-variant: small-caps;
+ padding-left: 1em; }
+ .main_menu a, .main_menu:visited {
+ text-decoration: none;
+ color: #7c7c7c; }
+
+.content {
+ font-size: 1em;
+ width: 50em;
+ line-height: 1.4em;
+ margin-left: 1cm; }
+ .content p {
+ margin: 1em; }
+ .content img {
+ margin-left: 3em; }
+
+.copyright {
+ font-variant: small-caps;
+ font-size: 0.6em;
+ color: #00ad00;
+ margin-left: 22em;
+ margin-top: 3cm; }
+ .copyright a {
+ text-decoration: none;
+ color: #00ad00; }
+
+body.code code {
+ font-size: 0.9em;
+ font-family: "Monaco", monospace; }
+ body.code code .sh_keyword {
+ color: #761a47; }
+ body.code code .sh_comment {
+ color: #686868; }
+ body.code code .sh_string {
+ color: #42aa7a;
+ background-color: #edf7f7; }
+ body.code code .sh_constant {
+ color: #551e03; }
+ body.code code .sh_method {
+ color: #4679a9; }
+ body.code code .sh_symbol {
+ color: #551e03; }
+ body.code code .sh_number {
+ color: #42aa7a; }
+body.code pre {
+ font-size: 1em;
+ font-family: "Monaco", monospace;
+ background-color: #f8f8f8;
+ color: #4679a9;
+ line-height: 1.1em;
+ margin-left: 1em;
+ padding: -1em 0 -1em 0; }
+ body.code pre.sh_sourceCode {
+ border-radius: 0.5em;
+ background-color: #f8f8f8;
+ padding-bottom: 1.4em; }
250 website/build/transform.html
@@ -0,0 +1,250 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Transformation</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Transformation</h1><p>Parslet parsers output deeply nested hashes. Those are nice for printing, but
+hard to work with. The structure of the nested hashes is determined by the
+grammar and can thus vary widely. Testing for the presence of individual
+keys would produce code that is hard to read and maintain.</p>
+<p>This is why parslet also comes with a hash transformation engine. To construct
+such a transform, you have to derive from <code>Parslet::Transform</code>:</p>
+<pre class="sh_ruby"><code title="simple transform">
+ class MyTransform &lt; Parslet::Transform
+ rule('a') { 'b' }
+ end
+ MyTransform.new.apply('a') # =&gt; "b"
+</code></pre>
+<p>This is a transformation that replaces all &#8216;a&#8217;s with &#8216;b&#8217;s. A transformation
+rule has two parts: A <strong>pattern</strong> (here: <code>'a'</code>) and an <strong>action block</strong>
+(<code>{ 'b' }</code>).</p>
+<p>The engine will go through the input and traverse the tree in depth-first
+post-order fashion. This means that for a given tree node, it will first visit
+the children and only then look at the node itself. While traversing, all
+rules are tested in the order in which they are defined. If a rule matches, the
+corresponding tree is <em>replaced</em> by whatever the action block returns.</p>
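+<p>The traversal order can be sketched in a few lines of plain Ruby. This is an
+illustration under simplifying assumptions, not parslet&#8217;s implementation:
+<code>rewrite</code> is a hypothetical function, and each rule is a hand-rolled
+pair of lambdas (a matcher and an action) rather than a real pattern.</p>

```ruby
# Plain-Ruby sketch of a depth-first, post-order rewrite. All names are
# hypothetical. Each rule is a pair [matcher, action]; rules are tried in
# the order given and the first one that matches wins.
def rewrite(tree, rules)
  # Visit the children first ...
  rebuilt =
    case tree
    when Hash  then tree.transform_values { |child| rewrite(child, rules) }
    when Array then tree.map { |child| rewrite(child, rules) }
    else tree
    end

  # ... and only then look at the node itself.
  matcher, action = rules.find { |candidate, _| candidate.call(rebuilt) }
  matcher ? action.call(rebuilt) : rebuilt
end

# Leaves: turn { int: '1' } into the Integer 1.
int_rule = [
  ->(node) { node.is_a?(Hash) && node.keys == [:int] },
  ->(node) { Integer(node[:int]) }
]

# Inner node: add up the two sides, which are already rewritten.
sum_rule = [
  ->(node) { node.is_a?(Hash) && node[:op] == '+' },
  ->(node) { node[:left] + node[:right] }
]

tree = { left: { int: '1' }, op: '+', right: { int: '2' } }

puts rewrite(tree, [int_rule, sum_rule]) # prints 3
```

+<p>Because children are rewritten first, the second rule already sees integers
+in <code>:left</code> and <code>:right</code> by the time it is tested.</p>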
+<p>Here&#8217;s another way of saying the same thing, perhaps more in line with what
+you need as a user of Parslet: <code>Parslet::Transform</code> is what allows
+you to transform the <span class="caps">PORO</span>-trees magically into a real abstract syntax tree.
+The rule definitions are the futuristic nano-machines that act on tree leaves
+first, eating them away and replacing them with contraptions of your own
+design. Here&#8217;s what that might look like in Ruby:</p>
+<pre class="sh_ruby"><code title="poro magic">
+ tree = {:left =&gt; {:int =&gt; '1'},
+ :op =&gt; '+',
+ :right =&gt; {:int =&gt; '2'}}
+
+ class Trans &lt; Parslet::Transform
+ rule(:int =&gt; simple(:x)) { Integer(x) }
+ end
+ Trans.new.apply(tree) # =&gt; {:left=&gt;1, :op=&gt;"+", :right=&gt;2}
+</code></pre>
+<p>You can start thinking about the leaves first, transforming those <code>:int
+=&gt; '1'</code> into real Ruby integers. This incremental (test driven!)
+approach will prevent your intermediary tree from turning into grey goo
+from too many nano-machines. Rules should in general be simple and transform
+a small part of the tree into a more useful variant. Turns out that if we were
+looking for an interpreter, one more rule will give us evaluation:</p>
+<pre class="sh_ruby"><code title="building up">
+ tree = {:left =&gt; {:int =&gt; '1'},
+ :op =&gt; '+',
+ :right =&gt; {:int =&gt; '2'}}
+
+ class Trans &lt; Parslet::Transform
+ rule(:int =&gt; simple(:x)) { Integer(x) }
+ rule(:op =&gt; '+', :left =&gt; simple(:l), :right =&gt; simple(:r)) { l + r }
+ end
+ Trans.new.apply(tree) # =&gt; 3
+</code></pre>
+<p>Cool, isn&#8217;t it? To recap: parslet intentionally spits out deeply nested hashes,
+because it also gives you the tools to work with them. Turning the intermediary
+trees into something useful is really easy.</p>
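+<p>For the curious, the traversal described above (children first, then the node, rules tried in the order of definition) can be sketched in plain Ruby. This is a minimal illustration and not the code parslet runs: <code>rewrite</code> and the lambda-based rule pairs are hypothetical names made up for this sketch.</p>

```ruby
# Hypothetical sketch of a leaves-first tree rewrite; not parslet's own code.
# A rule is a pair [matcher, action]: the matcher returns a bindings hash
# (or a falsy value), the action turns the bindings into a replacement.
def rewrite(tree, rules)
  # Step 1: transform the children before looking at the node itself.
  node =
    case tree
    when Hash  then tree.each_with_object({}) { |(k, v), h| h[k] = rewrite(v, rules) }
    when Array then tree.map { |elt| rewrite(elt, rules) }
    else tree
    end

  # Step 2: test the rules in definition order; the first match replaces the node.
  rules.each do |matcher, action|
    bindings = matcher.call(node)
    return action.call(bindings) if bindings
  end
  node # no rule matched: keep the node as it is
end

rules = [
  [->(n) { n.is_a?(Hash) && n.keys == [:int] && { x: n[:int] } },
   ->(b) { Integer(b[:x]) }],
  [->(n) { n.is_a?(Hash) && n.keys.sort == [:left, :op, :right] && n[:op] == '+' && { l: n[:left], r: n[:right] } },
   ->(b) { b[:l] + b[:r] }]
]

tree = { left: { int: '1' }, op: '+', right: { int: '2' } }
result = rewrite(tree, rules)
```

+<p>By the time the second rule sees the outer hash, both <code>:int</code> leaves have already been replaced by integers, which is why the addition works on numbers and not on strings.</p>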
+<h2>Working with Captures</h2>
+<p>What is this <code>simple(symbol)</code> business all about, you might ask.
+Glad you did.</p>
+<h3>Simple captures</h3>
+<p>Transform allows you to specify patterns that have wildcards in them. The
+wildcards match part of the tree, but at the same time capture it for working
+on it in your action block. The wildcard</p>
+<pre class="sh_ruby"><code>
+ simple(:x)
+</code></pre>
+<p>will match any object <span class="caps">BUT</span> hashes or arrays. While this is obviously useful
+for capturing strings, you can also capture other &#8216;simple&#8217; (as opposed to
+composed) objects of your own creation. <code>simple(:x)</code> would thus match
+all of these objects:</p>
+<pre class="sh_ruby"><code>
+ "a string"
+ 123
+ Foo.new(:some, :class, :instance)
+</code></pre>
+<p>If you think about what you&#8217;ll be doing to your intermediary trees, replacing
+leaves with more useful objects, <code>simple</code> really makes good sense,
+since it will stop you from matching entire subtrees.</p>
+<h3>Matching Repetitions and Sequences</h3>
+<p>Some patterns (like repetitions and sequences) produce arrays of objects as
+result. You can use <code>simple(...)</code> to replace all parts of these
+arrays with your own objects, but you cannot replace the array as a whole.
+This is the purpose of <code>sequence(symbol)</code>:</p>
+<pre class="sh_ruby"><code>
+ sequence(:x)
+</code></pre>
+<p>will match all of these:</p>
+<pre class="sh_ruby"><code>
+ ['a', 'b', 'c']
+ ['a', 'a', 'a']
+ [Foo.new, Bar.new]
+</code></pre>
+<p>but not</p>
+<pre class="sh_ruby"><code>
+ [{:a =&gt; :b}]
+ [['a', 'b']]
+</code></pre>
+<p>Like its smaller brother, <code>sequence</code> is very picky about what it
+consumes and what not. All for the same reasons.</p>
+<h3>Matching entire subtrees</h3>
+<p>So you don&#8217;t want to listen and really want that big gun with the foot aiming
+addon. You&#8217;ll be needing <code>subtree(symbol)</code>. It always matches.
+Nuff said.</p>
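+<p>To keep the three capture kinds apart, here is a rough sketch of the checks they stand for, written as plain Ruby predicates. The method names <code>simple?</code>, <code>sequence?</code> and <code>subtree?</code> are hypothetical and exist only in this sketch; the real matching happens inside parslet.</p>

```ruby
# Hypothetical predicates mirroring the three capture kinds; not parslet's code.

# simple: anything BUT hashes or arrays.
def simple?(obj)
  !(obj.is_a?(Hash) || obj.is_a?(Array))
end

# sequence: an array whose elements are all simple objects.
def sequence?(obj)
  obj.is_a?(Array) && obj.all? { |elt| simple?(elt) }
end

# subtree: always matches.
def subtree?(_obj)
  true
end

simple_ok  = simple?("a string") && simple?(123)
seq_ok     = sequence?(['a', 'b', 'c'])
seq_nested = sequence?([['a', 'b']])
seq_hash   = sequence?([{ a: :b }])
```

+<p>The last two calls correspond to the &#8216;but not&#8217; examples above: nested arrays and arrays of hashes are refused by the sequence check.</p>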
+<h3>Matching context</h3>
+<p>A match always binds in a context. The context consists of all bindings
+that were previously made. If you reuse the same symbol for two consecutive
+matches within the same pattern, the engine will assume that you want these
+two matched objects to be equal (under <code>==</code>). This allows you to
+specify constraints on your matches that would otherwise need code to express:</p>
+<pre class="sh_ruby"><code>
+ # The following code is an excerpt from example/simple_xml.rb in the distro
+ t.rule(
+ open: {name: simple(:tag)},
+ close: {name: simple(:tag)},
+ inner: simple(:t)
+ ) { 'verified' }
+</code></pre>
+<p>This replaces <em>matching</em> open and close tags with the word &#8216;verified&#8217;,
+consuming them from the tree and allowing the same rule to match higher up. A
+valid <span class="caps">XML</span> tree will leave only the word &#8216;verified&#8217; behind, while the transform
+will stop at the problem nodes in invalid trees.</p>
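+<p>The equality constraint can be pictured with a small sketch in plain Ruby, assuming bindings are kept in a hash keyed by symbol. <code>bind</code> and <code>tags_match?</code> are hypothetical helpers invented for this illustration, not part of parslet.</p>

```ruby
# Hypothetical helper: record a capture, or check it against an earlier one.
def bind(bindings, name, value)
  if bindings.key?(name)
    bindings[name] == value   # same symbol seen before: values must be equal
  else
    bindings[name] = value    # first occurrence: remember the value
    true
  end
end

# Both tag names are bound to :tag, so they have to agree for a match.
def tags_match?(node)
  bindings = {}
  bind(bindings, :tag, node[:open][:name]) &&
    bind(bindings, :tag, node[:close][:name])
end

balanced   = tags_match?({ open: { name: 'a' }, close: { name: 'a' } })
unbalanced = tags_match?({ open: { name: 'a' }, close: { name: 'b' } })
```

+<p>Only the balanced node passes; the mismatched one is refused without any code in the action block.</p>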
+<h2>Transformation rules</h2>
+<p>In this chapter, we&#8217;ll look more closely at transformation rules and the
+different ways they can be laid out in your code.</p>
+<h3>Usage Patterns</h3>
+<p>The way the transformation engine is constructed, there is not one, but three
+ways to use it. Since at least one of those is inconvenient for you, the user,
+I am going to show only the remaining two, Variant 1 that produces an instance
+of the transform for direct use:</p>
+<pre class="sh_ruby"><code>
+ # Variant 1
+ transform = Parslet::Transform.new do
+ rule(...) { ... }
+ rule(...) { ... }
+ rule(...) { ... }
+ end
+ transform.apply(tree)
+</code></pre>
+<p>and Variant 2 that allows constructing the transformation as a class:</p>
+<pre class="sh_ruby"><code>
+ # Variant 2
+ class MyTransform &lt; Parslet::Transform
+ rule(...) { ... }
+ rule(...) { ... }
+ rule(...) { ... }
+ end
+ MyTransform.new.apply(tree)
+</code></pre>
+<p>I guess both have their sweet spot.</p>
+<h3>Action blocks: Two flavors</h3>
+<p>As you might have noticed by now, parslet provides choice as well as nice
+parsers. To recap: Rules have a left side called <em>pattern</em> and a right side
+called <em>action block</em>:</p>
+<pre class="sh_ruby"><code>
+ rule(PATTERN) {ACTION_BLOCK}
+</code></pre>
+<p>There are two ways of writing action blocks, and knowing the difference might
+matter to you one day. If written like this:</p>
+<pre class="sh_ruby"><code>
+ rule(:foo =&gt; simple(:x)) { puts x }
+</code></pre>
+<p>the block will be able to access <code>x</code> as a local variable. This is
+very convenient and shortens the action code, often to the point of being
+very expressive.</p>
+<p>But there is a <em>big downside</em> to this way of writing things: The action block
+must be executed in the context of some magic instance that has <code>x</code>
+as a local method (aka accessor). You can only have one self at any one time;
+variable access to the binding of the block isn&#8217;t possible inside this kind
+of action block:</p>
+<pre class="sh_ruby"><code>
+ y = 12
+ rule(:foo =&gt; simple(:x)) { Integer(x) + y }
+</code></pre>
+<p>This will (depending on the context) throw a <code>NameError</code> or a
+<code>NoMethodError</code>.</p>
+<p>But this can be fixed by using the other, less elegant style for action
+blocks:</p>
+<pre class="sh_ruby"><code>
+ y = 12
+ rule(:foo =&gt; simple(:x)) { |dictionary| Integer(dictionary[:x]) + y }
+</code></pre>
+<p>In this second flavor, the block gets executed in the context of definition,
+whatever that was. This means that it can capture and access local variables
+just fine. Access to the bindings (called <code>dictionary</code> here) is
+more clumsy, but hey, you can&#8217;t have your cake and eat it too, I guess. Even
+though that is a pity.</p>
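+<p>If the mechanics feel abstract, here is a minimal sketch in plain Ruby, assuming the engine picks the flavor by looking at the arity of the block. <code>Context</code>, <code>run_action</code> and <code>Definer</code> are made-up names for illustration and not parslet&#8217;s <span class="caps">API</span>. The sketch only shows how <code>self</code> changes between the two flavors, using a helper method (<code>offset</code>) of the defining object as the name that goes missing.</p>

```ruby
# Hypothetical sketch of arity-based dispatch; not parslet's actual code.
class Context
  def initialize(bindings)
    # Expose every binding as a reader method, so a bare `x` resolves via self.
    bindings.each { |name, value| define_singleton_method(name) { value } }
  end
end

def run_action(bindings, &block)
  if block.arity == 1
    block.call(bindings)                         # flavor 2: self is left alone
  else
    Context.new(bindings).instance_exec(&block)  # flavor 1: self is the context
  end
end

class Definer
  def offset
    12
  end

  def flavor_one
    # `offset` is looked up on the context here, which does not have it.
    run_action({ x: '1' }) { Integer(x) + offset }
  end

  def flavor_two
    # self is still the Definer instance, so `offset` resolves as usual.
    run_action({ x: '1' }) { |dictionary| Integer(dictionary[:x]) + offset }
  end
end

definer = Definer.new
two = definer.flavor_two
one = begin
  definer.flavor_one
rescue NameError => e
  e.class
end
```

+<p>In this sketch the second flavor returns a number, while the first one ends in an exception, because the context object knows about <code>x</code> but nothing about the object the rule was defined in.</p>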
+<h2>A word on patterns</h2>
+<p>Given the <span class="caps">PORO</span> hash</p>
+<pre class="sh_ruby"><code>
+ {
+ :dog =&gt; 'terrier',
+ :cat =&gt; 'suit' }
+</code></pre>
+<p>one might assume that the following rule matches <code>:dog</code> and
+replaces it by <code>'foo'</code>:</p>
+<pre class="sh_ruby"><code>
+ rule(:dog =&gt; 'terrier') { 'foo' }
+</code></pre>
+<p>This is frankly impossible. How would <code>'foo'</code> live beside
+<code>:cat =&gt; 'suit'</code> inside the hash? It cannot. This is why hashes are
+either matched completely, cats n&#8217; all, or not at all.</p>
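+<p>As a rough sketch of the all-or-nothing behaviour, assume a pattern is compared against a hash key by key and must account for every key. <code>hash_match?</code> is a hypothetical helper written for this illustration; it handles literal values only and ignores captures.</p>

```ruby
# Hypothetical matcher: a hash pattern matches only a hash with the same keys.
def hash_match?(pattern, node)
  return false unless node.is_a?(Hash)
  return false unless pattern.size == node.size   # no partial matches
  pattern.all? { |key, expected| node.key?(key) && expected == node[key] }
end

pets    = { dog: 'terrier', cat: 'suit' }
partial = hash_match?({ dog: 'terrier' }, pets)
full    = hash_match?({ dog: 'terrier', cat: 'suit' }, pets)
```

+<p>The pattern that mentions only <code>:dog</code> is refused, while the one that covers the cat as well goes through.</p>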
+<p>Transformations are there for one thing: Getting out of the hash/array/slice
+mess parslet creates (on purpose) into the realm of your own beautifully
+crafted <span class="caps">AST</span> classes. Such
+<a href="http://en.wikipedia.org/wiki/Abstract_syntax_tree"><span class="caps">AST</span></a> nodes will generally
+correspond 1:1 to hashes inside your intermediary tree.</p>
+<p>If transformations get you into a mess, remember this simple truth: They have
+been designed for the above purpose. Abusing them is fun (and almost all the
+examples in the project do so) but the mess you get when you do is all yours.</p>
+<p>If you are really desperate, take a look at the example in <a href="get-started.html">Get Started</a> or at the parser in the sample project
+<a href="https://github.com/kschiess/wt">wt</a>. Imitating them would be a good first step. And if all else fails, we&#8217;re there for you, see the &#8216;Contact&#8217; section in <a href="contribute.html">Contribute</a>.</p>
+<h2>Summary</h2>
+<p>This concludes this (three-part) introduction to parslet and leaves you with a
+good knowledge of most of the tricky parts. If you are missing some detail, maybe
+you can find it in the texts referenced <a href="documentation.html">here</a>? There is
+also an entire page on the tricks useful in practice here: <a href="tricks.html">Tricks</a>.</p>
+<p>If not, please tell us about it. We&#8217;ll include it in this documentation in no
+time.</p></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>
197 website/build/tricks.html
@@ -0,0 +1,197 @@
+<!DOCTYPE html>
+<html>
+ <head>
+ <meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
+ <title>parslet -Tricks for common situations</title>
+ <meta content="Kaspar Schiess (http://absurd.li)" name="author" />
+ <link href="images/favicon3.ico" rel="shortcut icon" /><link href="/cod/stylesheets/site.css" media="screen" rel="stylesheet" type="text/css" /><link href="/cod/stylesheets/sh_whitengrey.css" media="screen" rel="stylesheet" type="text/css" /><script src="/cod/javascripts/sh_main.min.js" type="text/javascript"></script><script src="/cod/javascripts/sh_ruby.min.js" type="text/javascript"></script></head>
+ <body class="code" onload="sh_highlightDocument();">
+ <div id="everything">
+ <div class="main_menu"><img alt="Parslet Logo" src="/cod/images/parsley_logo.png" /><ul>
+ <li><a href="index.html">about</a></li>
+ <li><a href="get-started.html">get started</a></li>
+ <li><a href="install.html">install</a></li>
+ <li><a href="documentation.html">documentation</a></li>
+ <li><a href="contribute.html">contribute</a></li>
+ </ul>
+ </div>
+ <div class="content">
+ <h1>Tricks for common situations</h1><h2>Matching <span class="caps">EOF</span> (End Of File)</h2>
+<p>Ahh Sir, you&#8217;ll be needin what us parsers call <em>epsilon</em>:</p>
+<pre class="sh_ruby"><code>
+ rule(:eof) { any.absent? }
+</code></pre>
+<p>Of course, most of us don&#8217;t use this at all, since any parser has <span class="caps">EOF</span> as
+implicit last input.</p>
+<h2>Matching Strings Case Insensitive</h2>
+<p>Parslet is fully hackable: You can use code to create parsers easily. Here&#8217;s
+how I would match a string in a case-insensitive manner:</p>
+<pre class="sh_ruby"><code title="case insensitive match">
+ def stri(str)
+ key_chars = str.split(//)
+ key_chars.
+ collect! { |char| match["#{char.upcase}#{char.downcase}"] }.
+ reduce(:&gt;&gt;)
+ end
+
+ # Constructs a parser using a Parsing Expression Grammar (PEG)
+ stri('keyword').parse "kEyWoRd" # =&gt; "kEyWoRd"@0
+</code></pre>
+<h2>Testing</h2>
+<p>Parslet helps you to create parsers that are in turn created out of many small
+parsers. It is really turtles all the way down. Imagine you have a complex
+parser:</p>
+<pre class="sh_ruby"><code>
+ class ComplexParser &lt; Parslet::Parser
+ root :lots_of_stuff
+
+ rule(:lots_of_stuff) { ... }
+
+ # and many lines later:
+ rule(:simple_rule) { str('a') }
+ end
+</code></pre>
+<p>Also imagine that the parser (as a whole) fails to consume the &#8216;a&#8217; that
+<code>simple_rule</code> is talking about.</p>
+<p>This kind of problem can very often be fixed by bisecting it into two possible
+problems. Either:</p>
+<ol>
+ <li>the <code>lots_of_stuff</code> rule somehow doesn&#8217;t place <code>simple_rule</code>
+ in the right context or</li>
+ <li>the <code>simple_rule</code> simply (hah!) fails to match its input.</li>
+</ol>
+<p>I find it very useful in this situation to eliminate 2. from our options:</p>
+<pre class="sh_ruby"><code title="rspec">
+ require 'rspec'
+ require 'parslet/rig/rspec'
+
+ class ComplexParser &lt; Parslet::Parser
+ rule(:simple_rule) { str('a') }
+ end
+
+ describe ComplexParser do
+ let(:parser) { ComplexParser.new }
+ context "simple_rule" do
+ it "should consume 'a'" do
+ parser.simple_rule.should parse('a')
+ end
+ end
+ end
+
+ RSpec::Core::Runner.run([])
+</code></pre>
+<p>Output is:</p>
+<pre class="output">
+Example::ComplexParser
+  simple_rule
+    should consume 'a'
+
+Finished in 0.00115 seconds
+1 example, 0 failures
+</pre>
+<p>Parslet parsers have one method per rule. These methods return valid parsers
+for a subset of your grammar.</p>
+<h2>Error reports</h2>
+<p>If your grammar fails and you&#8217;re aching to know why, here&#8217;s a bit of exception
+handling code that will help you out:</p>
+<pre class="sh_ruby"><code title="exception handling">
+ parser = str('foo')
+ begin
+ parser.parse('bar')
+ rescue Parslet::ParseFailed =&gt; error
+ puts error.cause.ascii_tree
+ end
+</code></pre>
+<p>This should print something akin to:</p>
+<pre class="output">
+Expected "foo", but got "bar" at line 1 char 1.
+</pre>
+<p>These error reports are probably the fastest way to know exactly where you
+went wrong (or where your input is wrong, which is equivalent).</p>
+<p>And since this is such a common idiom, we provide you with a shortcut: to
+get the above, just:</p>
+<pre class="sh_ruby"><code>
+require 'parslet/convenience'
+parser.parse_with_debug(input)
+</code></pre>
+<h3>Reporter engines</h3>
+<p>Note that there is currently not one, but two error reporting engines! The
+default engine will report errors in a structure that looks exactly like the
+grammar structure:</p>
+<pre class="sh_ruby"><code title="error reporter 1">
+ class P &lt; Parslet::Parser
+ root(:body)
+ rule(:body) { elements }
+ rule(:elements) { (call | element).repeat(2) }
+ rule(:element) { str('bar') }
+ rule(:call) { str('baz') &gt;&gt; str('()') }
+ end
+
+ begin
+ P.new.parse('barbaz')
+ rescue Parslet::ParseFailed =&gt; error
+ puts error.cause.ascii_tree
+ end
+</code></pre>
+<p>Outputs:</p>
+<pre class="output">
+Expected at least 2 of CALL / ELEMENT at line 1 char 1.
+`- Expected one of [CALL, ELEMENT] at line 1 char 4.
+ |- Failed to match sequence ('baz' '()') at line 1 char 7.
+ | `- Premature end of input at line 1 char 7.
+ `- Expected "bar", but got "baz" at line 1 char 4.
+</pre>
+<p>Let&#8217;s switch out the &#8216;grammar structure&#8217; engine (called &#8216;<code>Tree</code>&#8217;)
+with the &#8216;deepest error position&#8217; engine:</p>
+<pre class="sh_ruby"><code title="error reporter 2">
+ class P &lt; Parslet::Parser
+ root(:body)
+ rule(:body) { elements }
+ rule(:elements) { (call | element).repeat(2) }
+ rule(:element) { str('bar') }
+ rule(:call) { str('baz') &gt;&gt; str('()') }
+ end
+
+ begin
+ P.new.parse('barbaz', reporter: Parslet::ErrorReporter::Deepest.new)
+ rescue Parslet::ParseFailed =&gt; error
+ puts error.cause.ascii_tree
+ end
+</code></pre>
+<p>Outputs:</p>
+<pre class="output">
+Expected at least 2 of CALL / ELEMENT at line 1 char 1.
+`- Expected one of [CALL, ELEMENT] at line 1 char 4.
+ |- Failed to match sequence ('baz' '()') at line 1 char 7.
+ | `- Premature end of input at line 1 char 7.
+ `- Premature end of input at line 1 char 7.
+</pre>
+<p>The <code>'Deepest'</code> position engine will store errors that are the
+farthest into the input. In some examples, this produces more readable output
+for the end user.</p>
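+<p>The idea behind a &#8216;deepest position&#8217; strategy can be sketched in a few lines of plain Ruby. <code>DeepestSketch</code> and its <code>record</code> method are hypothetical and only illustrate the bookkeeping; they are not the interface of <code>Parslet::ErrorReporter::Deepest</code>, and the handling of ties (a later error at the same position wins) is an assumption of this sketch.</p>

```ruby
# Hypothetical reporter that remembers only the error farthest into the input.
class DeepestSketch
  attr_reader :deepest

  # message: description of the failure, pos: character offset where it happened.
  def record(message, pos)
    # Assumption of this sketch: on a tie, the later error replaces the earlier one.
    @deepest = [message, pos] if @deepest.nil? || pos >= @deepest.last
  end
end

reporter = DeepestSketch.new
reporter.record('Expected "bar", but got "baz"', 3)
reporter.record('Premature end of input', 6)
reporter.record('Expected at least 2 of CALL / ELEMENT', 0)
message, pos = reporter.deepest
```

+<p>Whatever failed closest to the end of the input is what survives, regardless of the order in which the shallower errors were reported.</p>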
+<h2>Line numbers from parser output</h2>
+<p>A traditional parser would parse and then perform several checking phases,
+like for example verifying all type constraints are respected in the input.
+During this checking phase, you will most likely want to report screens full
+of type errors back to the user (&#8216;cause that&#8217;s what types are for, right?).
+Now where did that &#8216;int&#8217; come from?</p>
+<p>Parslet gives you slices (Parslet::Slice) of input as part of your tree. These
+are essentially strings with line numbers. Here&#8217;s how to print that error
+message:</p>
+<pre class="sh_ruby"><code>
+ # assume that type == "int"@0 - a piece from your parser output
+ line, col = type.line_and_column
+ puts "Sorry. Can't have #{type} at #{line}:#{col}!"
+</code></pre></div>
+ <div class="copyright"><p><span class="caps">MIT</span> License, 2010-2012, &#169; <a href="http://absurd.li">Kaspar Schiess</a><br/>
+ Logo by <a href="http://floere.github.com">Florian Hanke</a>, <a href="http://creativecommons.org/licenses/by/1.0/">CC Attribution</a> license</p></div>
+ <script type="text/javascript">var _gaq = _gaq || [];
+ _gaq.push(['_setAccount', 'UA-16365074-2']);
+ _gaq.push(['_trackPageview']);
+ (function() {
+ var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+ ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+ var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+ })();</script>
+ </div>
+ </body>
+</html>