INTRODUCTION

A small library for writing parsers based on PEGs. PEG stands for Parsing
Expression Grammar [1]. PEGs are a kind of grammar that recognizes almost the
same languages as a conventional LR parser, but they are easier to work with
because they were conceived for recognizing languages rather than generating
them. You can read the founding paper of the field by Bryan Ford here [2].
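
To make the "recognition" point concrete, here is a minimal sketch of PEG-style
ordered choice in plain Ruby. The helpers lit, seq and alt are hypothetical
names invented for this illustration and are not parslet's API. A parser here
is assumed to be a lambda that takes an input and a position and returns the
new position on success or nil on failure.

```ruby
# Hypothetical mini-combinators for illustration only; not parslet's API.
# A parser is a lambda (input, pos) -> new position, or nil on failure.

# Match the literal string s at the current position.
def lit(s)
  lambda { |input, pos| input[pos, s.length] == s ? pos + s.length : nil }
end

# Sequence: each parser must match where the previous one stopped.
def seq(*parsers)
  lambda do |input, pos|
    parsers.reduce(pos) { |at, parser| at && parser.call(input, at) }
  end
end

# Ordered choice: try alternatives left to right; the first success wins
# and later alternatives are never reconsidered.
def alt(*parsers)
  lambda do |input, pos|
    result = nil
    parsers.each do |parser|
      result = parser.call(input, pos)
      break if result
    end
    result
  end
end

# ("a" / "ab") "c": the choice commits to "a", then "c" fails on "b".
first_wins = seq(alt(lit("a"), lit("ab")), lit("c"))

# ("ab" / "a") "c": putting the longer alternative first lets it match.
longest_first = seq(alt(lit("ab"), lit("a")), lit("c"))
```

With the input "abc", first_wins fails while longest_first consumes all three
characters. The order of alternatives is part of the grammar's meaning, which
is why a PEG is never ambiguous.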

Other Ruby projects that work on the same topic are:
http://wiki.github.com/luikore/rsec/
http://github.com/mjijackson/citrus
http://github.com/nathansobo/treetop

My goal here was to see how a parser/parser generator should be constructed to
allow clean AST construction and good error handling. It seems to me that most
parser generators only handle the success case and forget about debugging and
error generation.

More specifically, this library is motivated by one of my compiler projects. I
started out using 'treetop' (see the link above), but found it unusable. It
was lacking in

  * error reporting: Hard to see where a grammar fails.
  
  * stability of generated trees: Intermediary trees were dictated by the
    grammar. It was hard to define invariants in that system - what was
    convenient when writing the grammar often wasn't in subsequent stages.
    
  * clarity of parser code: The parser code is generated and is very hard
    to read. Add that to the first point to understand my pain.
    
So parslet tries to be different. It doesn't generate the parser, but instead
defines it in a DSL which is very close to what you find in [2]. A successful
parse then generates a parse tree consisting entirely of hashes, arrays and
strings (read: unstable). This parse tree can then be converted to a real
AST (read: stable) using a pattern matcher that is also part of this library.
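
The idea of turning the intermediate tree into a stable AST can be sketched in
plain Ruby, without the library. This is only an illustration of the concept,
not how parslet implements its pattern matcher. StringLiteral and to_ast are
hypothetical names, and the :list key in the sample tree is made up for the
example.

```ruby
# Hypothetical AST node class; not part of parslet.
StringLiteral = Struct.new(:value)

# Walk the intermediate tree (hashes, arrays, strings) bottom-up and replace
# every hash of the exact shape {:string => "..."} with a StringLiteral.
# Everything else keeps its structure.
def to_ast(tree)
  case tree
  when Hash
    converted = tree.each_with_object({}) { |(k, v), h| h[k] = to_ast(v) }
    if converted.keys == [:string] && converted[:string].is_a?(String)
      StringLiteral.new(converted[:string])
    else
      converted
    end
  when Array
    tree.map { |child| to_ast(child) }
  else
    tree
  end
end

# A made-up intermediate tree of the kind a parse could produce.
tree = { :list => [{ :string => "a" }, { :string => "b" }] }
ast  = to_ast(tree)
```

Later compiler stages can then depend on StringLiteral objects instead of on
whatever hash layout the grammar happened to produce.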

Error reporting is another area where parslet excels: It prints not only the
error you are used to seeing ('Parse failed because of REASON at line 1 and
char 2'), but also what led to that failure, in the form of a tree
(#error_tree method).
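
What "a failure in the form of a tree" could look like is sketched below in
plain Ruby. The Failure struct, its ascii_tree method and the message wording
are assumptions made for this illustration. They are not the output format or
the classes behind #error_tree.

```ruby
# Hypothetical failure record: a message plus the failures that caused it.
Failure = Struct.new(:message, :children) do
  # Render this failure and its causes as an indented tree, one per line.
  def ascii_tree(depth = 0)
    lines = ["#{'  ' * depth}#{message}"]
    children.each { |child| lines << child.ascii_tree(depth + 1) }
    lines.join("\n")
  end
end

# A made-up failure: a sequence failed because its second element failed.
cause = Failure.new("Failed to match sequence at line 1 char 2.", [
  Failure.new("Expected \"b\", but got \"x\" at line 1 char 2.", [])
])

report = cause.ascii_tree
puts report
```

Each level of indentation is one step closer to the part of the grammar that
actually rejected the input.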

[1] http://en.wikipedia.org/wiki/Parsing_expression_grammar
[2] http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf

SYNOPSIS

  require 'parslet'
  include Parslet

  # Constructs a parser using a Parsing Expression Grammar-like DSL:
  parser = str('"') >>
            (
              str('\\') >> any |
              str('"').absnt? >> any
            ).repeat.as(:string) >>
            str('"')
  
  # Parse the string and capture parts of the interpretation (:string above)
  tree = parser.parse(%Q{
    "This is a \\"String\\" in which you can escape stuff"
  }.strip)

  tree # => {:string=>"This is a \\\"String\\\" in which you can escape stuff"}

  # Here's how you can grab results from that tree, two methods:

  # 1)
  Pattern.new(:string => simple(:x)).each_match(tree) do |dictionary|
    puts "String contents (method 1): #{dictionary[:x]}"
  end

  # 2)
  transform = Parslet::Transform.new do
    rule(:string => simple(:x)) {
      puts "String contents (method 2): #{x}" }
  end
  transform.apply(tree)

COMPATIBILITY

This library should work with both Ruby 1.8 and Ruby 1.9.

AUTHORS

My gigantic thanks go to the following cool guys and gals who help make this
rock:

Florian Hanke (http://floere.github.com)

STATUS

On the road to 1.0; improving documentation, packaging and upgrading to rspec2.

(c) 2010 Kaspar Schiess