
Added support for occurrence ranges

cjheath committed Apr 16, 2010
1 parent 4d36d24 commit 3aaf1dfed606ec6dd08914f4ecec6bcf9d152dd3
@@ -112,7 +112,16 @@ Any item in a rule may be followed by a '+' or a '*' character, signifying one-o
end
end
-The 'a'* will always eat up any 'a's that follow, and the subsequent 'a' will find none there, so the whole rule will fail. You might need to use lookahead to avoid matching too much.
+The 'a'* will always eat up any 'a's that follow, and the subsequent 'a' will find none there, so the whole rule will fail. You might need to use lookahead to avoid matching too much. Alternatively, you can use an occurrence range:
+
+ # toogreedy.treetop
+ grammar TooGreedy
+ rule two_to_four_as
+ 'a' 2..4
+ end
+ end
+
+In an occurrence range, you may omit either the minimum count or the maximum count, so that "0.. " works like "*" and "1.. " works like '+'.
Negative Lookahead
------------------
@@ -141,24 +150,33 @@ Positive lookahead
Sometimes you want an item to match, but only if the *following* text would match some pattern. You don't want to consume that following text, but if it's not there, you want this rule to fail. You can add a positive lookahead to a rule by appending the lookahead rule preceded by an & character.
+Semantic predicates
+-------------------
+
+Warning: This is an advanced feature. You need to understand the way a packrat parser operates to use it correctly. The result of computing a rule containing a semantic predicate will be memoized, even if the same rule, applied later at the same location in the input, would work differently due to a semantic predicate returning a different value. If you don't understand the previous sentence yet still use this feature, you're on your own: test carefully!
+
+Sometimes, you need to run external Ruby code to decide whether this syntax rule should continue or should fail. You can do this using either positive or negative semantic predicates. These are Ruby code blocks (lambdas) which are called when the parser reaches that location. For this rule to succeed, the value must be true for a positive predicate (a block like &{ ... }), or false for a negative predicate (a block like !{ ... }).
+The block is called with one argument, the array containing the preceding syntax nodes in the current sequence. Within the block, you cannot use node names or labels for the preceding nodes, as the node for the current rule does not yet exist. You must refer to preceding nodes using their position in the sequence.
+
+ grammar Keywords
+ rule sequence_of_reserved_and_nonreserved_words
+ ( reserved / word )*
+ end
+
+ rule reserved
+ word &{ |s| symbol_reserved?(s[0].text_value) }
+ end
+
+ rule word
+ ([a-zA-Z]+ [ \t]+)
+ end
+ end
+
+One case where it is always safe to use a semantic predicate is to invoke the Ruby debugger, but don't forget to return true so the rule succeeds! Assuming you have required the 'ruby-debug' module somewhere, it looks like this:
+
+ rule problems
+ word &{ |s| debugger; true }
+ end
-Features to cover in the talk
-=============================
-
-* Treetop files
-* Grammar definition
-* Rules
-* Loading a grammar
-* Compiling a grammar with the `tt` command
-* Accessing a parser for the grammar from Ruby
-* Parsing Expressions of all kinds
-? Left recursion and factorization
- - Here I can talk about function application, discussing how the operator
- could be an arbitrary expression
-* Inline node class eval blocks
-* Node class declarations
-* Labels
-* Use of super within within labels
-* Grammar composition with include
-* Use of super with grammar composition
+When the debugger stops here, you can inspect the contents of the SyntaxNode for "word" by looking at s[0], and the stack trace will show how you got there.
@@ -575,7 +575,7 @@ module DeclarationSequence2
def declarations
[head] + tail
end
-
+
def tail
super.elements.map { |elt| elt.declaration }
end
@@ -955,11 +955,11 @@ module Choice2
def alternatives
[head] + tail
end
-
+
def tail
super.elements.map {|elt| elt.alternative}
end
-
+
def inline_modules
(alternatives.map {|alt| alt.inline_modules }).flatten
end
@@ -1076,17 +1076,17 @@ module Sequence2
def sequence_elements
[head] + tail
end
-
+
def tail
super.elements.map {|elt| elt.labeled_sequence_primary }
end
-
+
def inline_modules
(sequence_elements.map {|elt| elt.inline_modules}).flatten +
[sequence_element_accessor_module] +
node_class_declarations.inline_modules
end
-
+
def inline_module_name
node_class_declarations.inline_module_name
end
@@ -1199,15 +1199,15 @@ module Primary1
def compile(address, builder, parent_expression=nil)
prefix.compile(address, builder, self)
end
-
+
def prefixed_expression
atomic
end
-
+
def inline_modules
atomic.inline_modules
end
-
+
def inline_module_name
nil
end
@@ -1253,19 +1253,19 @@ module Primary5
def compile(address, builder, parent_expression=nil)
suffix.compile(address, builder, self)
end
-
+
def optional_expression
atomic
end
-
+
def node_class_name
node_class_declarations.node_class_name
end
-
+
def inline_modules
atomic.inline_modules + node_class_declarations.inline_modules
end
-
+
def inline_module_name
node_class_declarations.inline_module_name
end
@@ -1285,15 +1285,15 @@ module Primary7
def compile(address, builder, parent_expression=nil)
atomic.compile(address, builder, self)
end
-
+
def node_class_name
node_class_declarations.node_class_name
end
-
+
def inline_modules
atomic.inline_modules + node_class_declarations.inline_modules
end
-
+
def inline_module_name
node_class_declarations.inline_module_name
end
@@ -1422,11 +1422,11 @@ module LabeledSequencePrimary1
def compile(lexical_address, builder)
sequence_primary.compile(lexical_address, builder)
end
-
+
def inline_modules
sequence_primary.inline_modules
end
-
+
def label_name
if label.name
label.name
@@ -1585,15 +1585,15 @@ module SequencePrimary1
def compile(lexical_address, builder)
prefix.compile(lexical_address, builder, self)
end
-
+
def prefixed_expression
elements[1]
end
-
+
def inline_modules
atomic.inline_modules
end
-
+
def inline_module_name
nil
end
@@ -1635,15 +1635,15 @@ module SequencePrimary5
def compile(lexical_address, builder)
suffix.compile(lexical_address, builder, self)
end
-
+
def node_class_name
nil
end
-
+
def inline_modules
atomic.inline_modules
end
-
+
def inline_module_name
nil
end
@@ -1808,15 +1808,15 @@ module NodeClassDeclarations1
def node_class_name
node_class_expression.node_class_name
end
-
+
def inline_modules
trailing_inline_module.inline_modules
end
-
+
def inline_module
trailing_inline_module.inline_module
end
-
+
def inline_module_name
inline_module.module_name if inline_module
end
@@ -1886,8 +1886,13 @@ def _nt_repetition_suffix
if r2
r0 = r2
else
- @index = i0
- r0 = nil
+ r3 = _nt_occurrence_range
+ if r3
+ r0 = r3
+ else
+ @index = i0
+ r0 = nil
+ end
end
end
@@ -1896,6 +1901,94 @@ def _nt_repetition_suffix
r0
end
+ module OccurrenceRange0
+ def min
+ elements[1]
+ end
+
+ def max
+ elements[3]
+ end
+ end
+
+ def _nt_occurrence_range
+ start_index = index
+ if node_cache[:occurrence_range].has_key?(index)
+ cached = node_cache[:occurrence_range][index]
+ if cached
+ cached = SyntaxNode.new(input, index...(index + 1)) if cached == true
+ @index = cached.interval.end
+ end
+ return cached
+ end
+
+ i0, s0 = index, []
+ r2 = _nt_space
+ if r2
+ r1 = r2
+ else
+ r1 = instantiate_node(SyntaxNode,input, index...index)
+ end
+ s0 << r1
+ if r1
+ s3, i3 = [], index
+ loop do
+ if has_terminal?('\G[0-9]', true, index)
+ r4 = true
+ @index += 1
+ else
+ r4 = nil
+ end
+ if r4
+ s3 << r4
+ else
+ break
+ end
+ end
+ r3 = instantiate_node(SyntaxNode,input, i3...index, s3)
+ s0 << r3
+ if r3
+ if has_terminal?('..', false, index)
+ r5 = instantiate_node(SyntaxNode,input, index...(index + 2))
+ @index += 2
+ else
+ terminal_parse_failure('..')
+ r5 = nil
+ end
+ s0 << r5
+ if r5
+ s6, i6 = [], index
+ loop do
+ if has_terminal?('\G[0-9]', true, index)
+ r7 = true
+ @index += 1
+ else
+ r7 = nil
+ end
+ if r7
+ s6 << r7
+ else
+ break
+ end
+ end
+ r6 = instantiate_node(SyntaxNode,input, i6...index, s6)
+ s0 << r6
+ end
+ end
+ end
+ if s0.last
+ r0 = instantiate_node(OccurrenceRange,input, i0...index, s0)
+ r0.extend(OccurrenceRange0)
+ else
+ @index = i0
+ r0 = nil
+ end
+
+ node_cache[:occurrence_range][start_index] = r0
+
+ r0
+ end
+
def _nt_prefix
start_index = index
if node_cache[:prefix].has_key?(index)
@@ -2816,7 +2909,7 @@ module TrailingInlineModule1
def inline_modules
[inline_module]
end
-
+
def inline_module_name
inline_module.module_name
end
@@ -2826,11 +2919,11 @@ module TrailingInlineModule2
def inline_modules
[]
end
-
+
def inline_module
- nil
+ nil
end
-
+
def inline_module_name
nil
end
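
The occurrence-range suffix documented in this commit (for example `'a' 2..4`) can be illustrated outside Treetop with a small stand-alone sketch. The helper `match_occurrences` below is hypothetical and is not part of Treetop or its generated parser. It only mimics the documented behaviour: greedy matching, an omitted minimum treated as 0, and an omitted maximum treated as unbounded. It assumes a non-empty terminal string.

```ruby
# Hypothetical helper, NOT Treetop's implementation: greedily match
# `terminal` between `min` and `max` times, starting at `index`.
# Returns the index just past the last match, or nil if fewer than
# `min` occurrences were found.
def match_occurrences(input, index, terminal, min: nil, max: nil)
  min ||= 0            # an omitted minimum behaves like 0
  count = 0
  pos = index
  # An omitted maximum (nil) means "no upper bound".
  while (max.nil? || count < max) && input[pos, terminal.length] == terminal
    pos += terminal.length
    count += 1
  end
  count >= min ? pos : nil
end

# 'a' 2..4 against "aaaaa": stops after four 'a's.
p match_occurrences("aaaaa", 0, "a", min: 2, max: 4)
# 'a' 2..4 against "a": too few occurrences, so the match fails.
p match_occurrences("a", 0, "a", min: 2, max: 4)
# "0.." behaves like '*': zero occurrences still succeed.
p match_occurrences("bbb", 0, "a", min: 0)
# "1.." behaves like '+': at least one occurrence is required.
p match_occurrences("bbb", 0, "a", min: 1)
```

Because the loop stops as soon as the maximum is reached, a range with an upper bound leaves any remaining input for the items that follow, which is why the documentation offers it as an alternative to lookahead in the too-greedy example.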