Initial commit.
aarongough committed Oct 15, 2010
0 parents commit 2e581b7
Showing 4 changed files with 153 additions and 0 deletions.
55 changes: 55 additions & 0 deletions Readme.rdoc
@@ -0,0 +1,55 @@
= Treetop S-Expression Parser

This is a very small & simple demonstration parser built using {Treetop}[http://treetop.rubyforge.org/]. It is meant solely to teach the very basics of implementing a grammar in Treetop, not as an actual parser for S-Expressions (which is why it's not packaged as a gem, and why it has no tests).

A real parser *should* improve in several main areas:

* Test coverage: Unit tests are vital for parsers.
* Performance: This example is extremely slow compared to simpler parsing methods; a real parser should be performant.
* Error reporting: Even the best-written parser is a complete pain if it doesn't have intelligent error reporting.

=== The Core Grammar

For easy reference the core grammar of the parser is reproduced here:

  grammar Sexp

    rule expression
      space? '(' body ')' space? <Expression>
    end

    rule body
      (expression / identifier / float / integer / string / space )* <Body>
    end

    rule integer
      ('+' / '-')? [0-9]+ <IntegerLiteral>
    end

    rule float
      ('+' / '-')? [0-9]+ (('.' [0-9]+) / ('e' [0-9]+)) <FloatLiteral>
    end

    rule string
      '"' ([^"\\] / "\\" . )* '"' <StringLiteral>
    end

    rule identifier
      [a-zA-Z\=\*] [a-zA-Z0-9_\=\*]* <Identifier>
    end

    rule space
      [\s]+
    end

  end
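
In the +body+ rule, <tt>/</tt> is an ordered choice: alternatives are tried left to right and the first one that matches wins, which is why +float+ is listed before +integer+. The following is only a rough sketch of that behaviour in plain Ruby, without Treetop; the regular expressions and the +first_match+ helper are hypothetical stand-ins, not part of this parser.

```ruby
require 'strscan'

# Hypothetical regex stand-ins for the grammar's `float` and `integer` rules.
FLOAT   = /[+-]?[0-9]+(?:\.[0-9]+|e[0-9]+)/
INTEGER = /[+-]?[0-9]+/

# Two orderings of the same alternatives.
FLOAT_FIRST   = { float: FLOAT, integer: INTEGER }.freeze
INTEGER_FIRST = { integer: INTEGER, float: FLOAT }.freeze

# Try each pattern in order at the start of the input and return the name and
# text of the first one that matches, mimicking PEG ordered choice.
def first_match(input, ordered_rules)
  scanner = StringScanner.new(input)
  ordered_rules.each do |name, pattern|
    text = scanner.scan(pattern)
    return [name, text] if text
  end
  nil
end

p first_match('1.5', FLOAT_FIRST)    # the float pattern consumes the whole number
p first_match('1.5', INTEGER_FIRST)  # the integer pattern stops before the '.'
```

With +integer+ first, only the leading digits are consumed and the remaining <tt>.5</tt> matches none of the alternatives, so the enclosing expression would fail to parse.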

=== More Info

For more info visit the {Treetop website}[http://treetop.rubyforge.org/], and have a read through the source code of this parser (don't worry, there isn't much of it!).

=== Author & Credits

Author:: {Aaron Gough}[mailto:aaron@aarongough.com]

Copyright (c) 2010 {Aaron Gough}[http://thingsaaronmade.com/] ({thingsaaronmade.com}[http://thingsaaronmade.com/]), released under the MIT license
37 changes: 37 additions & 0 deletions node_extensions.rb
@@ -0,0 +1,37 @@
module Sexp
  class IntegerLiteral < Treetop::Runtime::SyntaxNode
    def to_array
      return self.text_value.to_i
    end
  end

  class StringLiteral < Treetop::Runtime::SyntaxNode
    def to_array
      return eval self.text_value
    end
  end

  class FloatLiteral < Treetop::Runtime::SyntaxNode
    def to_array
      return self.text_value.to_f
    end
  end

  class Identifier < Treetop::Runtime::SyntaxNode
    def to_array
      return self.text_value.to_sym
    end
  end

  class Expression < Treetop::Runtime::SyntaxNode
    def to_array
      return self.elements[0].to_array
    end
  end

  class Body < Treetop::Runtime::SyntaxNode
    def to_array
      return self.elements.map {|x| x.to_array}
    end
  end
end
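
`StringLiteral#to_array` relies on `eval`, which also executes any `#{...}` interpolation that appears inside the quoted text, so it is only appropriate for trusted input. As a possible alternative, here is a minimal sketch that decodes the literal without `eval`. The `unquote` helper and the `ESCAPES` table are hypothetical and not part of this commit, and only a couple of escape sequences are handled.

```ruby
# Hypothetical escape table: only \n and \t are translated specially here;
# any other escaped character (such as \" or \\) stands for itself.
ESCAPES = { 'n' => "\n", 't' => "\t" }.freeze

# Turn the source text of a double-quoted literal (quotes included)
# into the string it denotes, without evaluating it as Ruby code.
def unquote(literal)
  inner = literal[1..-2] # drop the surrounding double quotes
  inner.gsub(/\\(.)/m) { ESCAPES.fetch($1, $1) }
end

puts unquote('"say \\"hi\\""')
puts unquote('"#{1 + 1}"') # printed as-is: no interpolation takes place
```

Under this assumption, `StringLiteral#to_array` could return `unquote(text_value)` instead of calling `eval`.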
30 changes: 30 additions & 0 deletions parser.rb
@@ -0,0 +1,30 @@
require 'treetop'

require File.expand_path(File.join(File.dirname(__FILE__), 'node_extensions.rb'))

class Parser

  Treetop.load(File.expand_path(File.join(File.dirname(__FILE__), 'sexp_parser.treetop')))
  @@parser = SexpParser.new

  def self.parse(data)
    tree = @@parser.parse(data)

    if(tree.nil?)
      raise Exception, "Parse error at offset: #{@@parser.index}"
    end

    self.clean_tree(tree)

    return tree.to_array
  end

  private

    def self.clean_tree(root_node)
      return if(root_node.elements.nil?)
      root_node.elements.delete_if{|node| node.class.name == "Treetop::Runtime::SyntaxNode" }
      root_node.elements.each {|node| self.clean_tree(node) }
    end

end
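
One Ruby detail in `parser.rb`: the bare `private` keyword only affects instance methods defined after it, so a method defined with `def self.` (like `clean_tree`) remains publicly callable. A small sketch of the difference, using a hypothetical `VisibilityDemo` class that is not part of this commit:

```ruby
class VisibilityDemo
  private

  # `private` above does not apply here, because this is a singleton method.
  def self.still_public
    :reachable
  end

  def self.hidden
    :internal
  end
  # This is what actually hides a class-level method from outside callers.
  private_class_method :hidden

  # Private class methods can still be called with an implicit receiver.
  def self.call_hidden
    hidden
  end
end

p VisibilityDemo.still_public
p VisibilityDemo.call_hidden
```

If hiding `clean_tree` is the intent, `private_class_method :clean_tree` would be one way to do it; the call inside `parse` would then need an implicit receiver (or `self.` on Ruby 2.7 and later).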
31 changes: 31 additions & 0 deletions sexp_parser.treetop
@@ -0,0 +1,31 @@
grammar Sexp

  rule expression
    space? '(' body ')' space? <Expression>
  end

  rule body
    (expression / identifier / float / integer / string / space )* <Body>
  end

  rule integer
    ('+' / '-')? [0-9]+ <IntegerLiteral>
  end

  rule float
    ('+' / '-')? [0-9]+ (('.' [0-9]+) / ('e' [0-9]+)) <FloatLiteral>
  end

timcharper Jul 5, 2012

Hi Aaron,

Thank you for your blog post tutorial on how to use treetop.

I wasn't able to get this code to work as is... I wonder if there have been updates to Treetop since? Parsing integers works, but when I use a decimal, it's as if the integer rule takes precedence, parses the number, and then errors out on reaching the decimal point.

Transposing the integer and float rules fixed it and allowed me to parse the sexp example on your blog post.

I'll send a pull request if you like.

aarongough via email Jul 16, 2012

timcharper Jul 16, 2012

Now that I've learned Treetop and revisit this, I question why I wasn't able to get it to work as is. The order of float / integer is clearly specified in the rule above. I'll take a second look and see why I ran into it.

The pull request is unrelated. I found that Treetop provides its own failure reason, and it turns out to be pretty helpful. I can understand the hesitation to pull something in without testing it yourself. However, I think if you look at the diff you'll see that it's a very small change.

timcharper Jul 16, 2012

I went back to try to reproduce my error, and I think I was deceived into thinking it was a float parse error. I understand how it happened and what tripped me up, but it's probably too lengthy to describe.

The code here works just fine.

aarongough via email Jul 17, 2012


  rule string
    '"' ([^"\\] / "\\" . )* '"' <StringLiteral>
  end

  rule identifier
    [a-zA-Z\=\*] [a-zA-Z0-9_\=\*]* <Identifier>
  end

  rule space
    [\s]+
  end

end
