
Accelerator: Implements str/re and sequences

Please see qed/optimizers.md for details.
1 parent 8726113 commit 3c8bd504f7cfe6c6ffb495262fd612b0bf01e9ac @kschiess committed Jul 15, 2013
Showing with 173 additions and 1 deletion.
  1. +2 −1 Gemfile
  2. +38 −0 lib/parslet/accelerator.rb
  3. +78 −0 lib/parslet/accelerator/engine.rb
  4. +1 −0 qed/applique/ae.rb
  5. +4 −0 qed/applique/parslet.rb
  6. +50 −0 qed/optimizers.md
@@ -5,7 +5,8 @@ gemspec
group :development do
%w(rspec flexmock rdoc sdoc
guard guard-rspec guard-simple_shell
- rb-fsevent growl rake).
+ rb-fsevent growl rake
+ qed ae).
each { |gem_name|
gem gem_name }
end
@@ -0,0 +1,38 @@
+module Parslet::Accelerator
+
+ class Expression
+ attr_reader :type
+ attr_reader :args
+
+ def initialize(type, *args)
+ @type = type
+ @args = args
+ end
+
+ def >> other_expr
+ if type == :seq
+ @args << other_expr
+ else
+ Expression.new(:seq, self, other_expr)
+ end
+ end
+ end
+
+module_function
+ def str variable
+ Expression.new(:str, variable)
+ end
+
+ def re variable
+ Expression.new(:re, variable)
+ end
+
+ def match atom, expr
+ engine = Engine.new
+
+ return engine.bindings if engine.match(atom, expr)
+ return false
+ end
+end
+
+require 'parslet/accelerator/engine'
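
A note on chaining: in the `>>` method above, the `:seq` branch ends in `@args << other_expr`. `Array#<<` returns the array, so a chain of three or more patterns would appear to evaluate to an Array rather than an Expression. The sketch below shows one possible variant, not the committed code. `PatternExpr` is a hypothetical stand-in for `Parslet::Accelerator::Expression` that needs no parslet install; it builds a fresh `:seq` node at every step, so the result is always an expression and the receiver is never mutated.

```ruby
# Standalone sketch; PatternExpr is a hypothetical stand-in for
# Parslet::Accelerator::Expression and does not require parslet.
class PatternExpr
  attr_reader :type, :args

  def initialize(type, *args)
    @type = type
    @args = args
  end

  # Chains patterns into one flat :seq node and always returns a PatternExpr.
  def >>(other)
    if type == :seq
      # Build a new node instead of appending to the receiver's args.
      PatternExpr.new(:seq, *args, other)
    else
      PatternExpr.new(:seq, self, other)
    end
  end
end

a = PatternExpr.new(:str, :a)
b = PatternExpr.new(:str, :b)
c = PatternExpr.new(:str, :c)

chain = a >> b >> c
chain.type                           # :seq
chain.args.map { |e| e.args.first }  # [:a, :b, :c]
```

Returning a new node costs one allocation per `>>`, but it keeps a pattern reusable after it has been extended elsewhere.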
@@ -0,0 +1,78 @@
+
+require 'parslet/atoms/visitor'
+
+module Parslet::Accelerator
+ class Apply
+ def initialize(engine, expr)
+ @engine = engine
+ @expr = expr
+ end
+
+ def visit_parser(root)
+ false
+ end
+ def visit_entity(name, block)
+ false
+ end
+ def visit_named(name, atom)
+ false
+ end
+ def visit_repetition(tag, min, max, atom)
+ false
+ end
+ def visit_alternative(alternatives)
+ false
+ end
+ def visit_sequence(sequence)
+ match(:seq) do |*expressions|
+ return false if sequence.size != expressions.size
+
+ sequence.zip(expressions).all? do |atom, expr|
+ @engine.match(atom, expr)
+ end
+ end
+ end
+ def visit_lookahead(positive, atom)
+ false
+ end
+ def visit_re(regexp)
+ match(:re) do |variable|
+ @engine.try_bind(variable, regexp)
+ end
+ end
+ def visit_str(str)
+ match(:str) do |variable|
+ @engine.try_bind(variable, str)
+ end
+ end
+
+ def match(type_tag)
+ expr_tag = @expr.type
+ if expr_tag == type_tag
+ yield *@expr.args
+ end
+ end
+ end
+
+ class Engine
+ attr_reader :bindings
+
+ def initialize
+ @bindings = {}
+ end
+
+ def match(atom, expr)
+ atom.accept(
+ Apply.new(self, expr))
+ end
+
+ def try_bind(variable, value)
+ if @bindings.has_key? variable
+ return value == @bindings[variable]
+ else
+ @bindings[variable] = value
+ return true
+ end
+ end
+ end
+end
@@ -0,0 +1 @@
+require 'ae'
@@ -0,0 +1,4 @@
+require 'parslet'
+require 'parslet/accelerator'
+
+include Parslet
@@ -0,0 +1,50 @@
+
+# Parslet Optimizers
+
+* Detect patterns in parsers: see section on 'Parser Pattern Detection'
+* Replace with optimized parsers: see section on 'Binding and Actions'
+
+## Parser Pattern Detection
+
+We'll demonstrate how pattern detection is constructed by showing what the smallest parts do first. Let's require the needed libraries.
+
+ require 'parslet'
+ require 'parslet/accelerator'
+
+ include Parslet
+
+The whole optimizer code is inside the `Parslet::Accelerator` module. When you read that name, think 'particle accelerator', not 'will make my code fast'. It is very possible to make things worse using `Parslet::Accelerator`.
+
+The simplest parser I can think of would be the one matching a simple string.
+
+ atom = str('foo')
+ expression = Accelerator.str(:x)
+ binding = Accelerator.match(atom, expression)
+
+ binding[:x].assert == 'foo'
+
+Note that the above was somewhat verbose, with all these variables and all that. We'll write shorter examples from now on.
+
+Another simple parser is the one that matches against variants of the `match(...)` atom. Multiple forms of this exist, so we'll go through all of them. First comes a simple character range match.
+
+ binding = Accelerator.match(
+ match['a-z'],
+ Accelerator.re(:x))
+
+ binding[:x].assert == '[a-z]'
+
+Note how the internal regular expression value used for the match atom is really bound to :x; we'll keep this as a convention. This also means that some parslet internals leak through the Accelerator API here. We consider that a feature, since working with accelerators will bring you into close contact with the atoms' internals.
+
+Also, the Accelerator matcher for `Parslet.match` is called `Accelerator.re`; the reason for this should be obvious from the above.
+
+## Composite Parsers
+
+Let's start assembling these simple parsers into more complex patterns and match those. As our first pattern, we'll consider sequences.
+
+ binding = Accelerator.match(
+ str('a') >> str('b'),
+ Accelerator.str(:x) >> Accelerator.str(:y))
+
+ binding.values_at(:x, :y).assert == %w(a b)
+
+Matching of 'foo' against 'foo'.
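
The sequence example above can be traced without parslet installed. What follows is a minimal standalone sketch of the same idea, under stated assumptions: `StrAtom`, `SeqAtom`, `Pattern`, `MiniMatcher` and `mini_match` are hypothetical names invented for illustration, not parslet API. It mirrors the visitor dispatch and the `try_bind` rule from `engine.rb`: a variable is bound the first time it is seen, and every later occurrence must meet an equal value.

```ruby
# Hypothetical stand-ins for parslet atoms; the real ones live in Parslet::Atoms.
StrAtom = Struct.new(:str) do
  def accept(visitor)
    visitor.visit_str(str)
  end
end

SeqAtom = Struct.new(:parslets) do
  def accept(visitor)
    visitor.visit_sequence(parslets)
  end
end

# A pattern node: a type tag (:str or :seq) plus its arguments.
Pattern = Struct.new(:type, :args)

class MiniMatcher
  attr_reader :bindings

  def initialize
    @bindings = {}
  end

  # Double dispatch: the atom decides which visit_* method gets called.
  def match(atom, pattern)
    atom.accept(Visitor.new(self, pattern))
  end

  # First sight binds the variable; afterwards only an equal value is accepted.
  def try_bind(variable, value)
    return value == @bindings[variable] if @bindings.key?(variable)

    @bindings[variable] = value
    true
  end

  class Visitor
    def initialize(matcher, pattern)
      @matcher = matcher
      @pattern = pattern
    end

    def visit_str(str)
      return false unless @pattern.type == :str

      @matcher.try_bind(@pattern.args.first, str)
    end

    def visit_sequence(parslets)
      return false unless @pattern.type == :seq
      return false unless parslets.size == @pattern.args.size

      # Every element must match its counterpart, in order.
      parslets.zip(@pattern.args).all? { |atom, sub| @matcher.match(atom, sub) }
    end
  end
end

# Returns the bindings hash on success and false otherwise.
def mini_match(atom, pattern)
  matcher = MiniMatcher.new
  matcher.match(atom, pattern) ? matcher.bindings : false
end

ab = SeqAtom.new([StrAtom.new('a'), StrAtom.new('b')])
xy = Pattern.new(:seq, [Pattern.new(:str, [:x]), Pattern.new(:str, [:y])])
xx = Pattern.new(:seq, [Pattern.new(:str, [:x]), Pattern.new(:str, [:x])])

mini_match(ab, xy)  # binds :x to "a" and :y to "b"
mini_match(ab, xx)  # false, because :x cannot be both "a" and "b"
```

One consequence of the binding rule is that reusing a variable name within a pattern acts as an equality constraint between the corresponding atoms.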
