Add roadmap
jsalzbergedu committed Jul 23, 2020
1 parent a6485f1 commit 526101d
Showing 2 changed files with 41 additions and 0 deletions.
37 changes: 37 additions & 0 deletions README.md
@@ -6,3 +6,40 @@
# PEGREG
A Lua library for compiling a subset of PEGs to FSTs.
Requires fst-fast-system (an NFST interpreter) and fst-fast (a Lua library wrapping fst-fast-system).


# Roadmap
For arbitrary regular expressions or string literals A and B, this library
turns A/B, AB, and A* into DFAs quite well.
However, this library also aims to make A/B and A* _possessive_ and to allow
linear-time matching and capture.
Therefore, there are still two major items on the roadmap before this library can
be made a part of Rosie:

1. Possessiveness.
This will require queries that can find the following:
- [X] The DFA of an NFA
- [X] The states that ought to be demoted (their outgoing arrows ignored)
- [ ] The arrows that ought to be ignored
- [ ] The NFA resulting from removing those arrows
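
The first query above, finding the DFA of an NFA, is classically done with the subset construction. The following is a minimal sketch of that technique, not pegreg's actual query code: the NFA table layout (`start`, `accepting`, `arrows` with `from`/`input`/`to` fields) and the names `set_key`, `determinize`, and `accepts` are assumptions made for illustration, and epsilon arrows are left out to keep it short.

```lua
-- Hypothetical NFA layout (an assumption, not pegreg's real data structure).
-- Turn a set of state numbers (the keys of a table) into a canonical key such as "1,3".
local function set_key(set)
  local numbers = {}
  for state in pairs(set) do numbers[#numbers + 1] = state end
  table.sort(numbers)
  return table.concat(numbers, ",")
end

-- Subset construction: every DFA state is a set of NFA states.
local function determinize(nfa)
  local start = { [nfa.start] = true }
  local dfa = { start = set_key(start), accepting = {}, arrows = {} }
  local seen = { [dfa.start] = true }
  local worklist = { start }
  while #worklist > 0 do
    local current = table.remove(worklist)
    local key = set_key(current)
    -- Group the targets of all outgoing arrows by input symbol.
    local targets = {}
    for _, arrow in ipairs(nfa.arrows) do
      if current[arrow.from] then
        targets[arrow.input] = targets[arrow.input] or {}
        targets[arrow.input][arrow.to] = true
      end
    end
    dfa.arrows[key] = {}
    for input, target in pairs(targets) do
      local target_key = set_key(target)
      dfa.arrows[key][input] = target_key
      if not seen[target_key] then
        seen[target_key] = true
        worklist[#worklist + 1] = target
      end
    end
    -- A DFA state accepts if any NFA state inside it accepts.
    for state in pairs(current) do
      if nfa.accepting[state] then dfa.accepting[key] = true end
    end
  end
  return dfa
end

-- Run the DFA over a string, one character at a time.
local function accepts(dfa, input)
  local state = dfa.start
  for i = 1, #input do
    state = dfa.arrows[state][input:sub(i, i)]
    if state == nil then return false end
  end
  return dfa.accepting[state] == true
end

-- An NFA for (a/b)*ab, read as a plain regular expression.
local nfa = {
  start = 1,
  accepting = { [3] = true },
  arrows = {
    { from = 1, input = "a", to = 1 },
    { from = 1, input = "b", to = 1 },
    { from = 1, input = "a", to = 2 },
    { from = 2, input = "b", to = 3 },
  },
}
local dfa = determinize(nfa)
```

With this example, `accepts(dfa, "aab")` is true and `accepts(dfa, "aba")` is false. The remaining queries (which states to demote, which arrows to ignore) would operate on a structure like `dfa.arrows`, but their exact form is not specified here.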

And interpreters (AST transformers) that can generate
- [X] NFAs from each subexpression of the AST
- [ ] DFAs from each sub-NFA
- [ ] B substates from each subexpression
- [ ] B states from each subexpression
- [ ] B subarrows from each subexpression
- [ ] B arrows from each subexpression
- [ ] The possessive NFA that results from removing the forbidden B arrows.
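
To make the goal of item 1 concrete, here is a small sketch of what possessive A/B and A* mean, written as plain recursive matchers rather than automata. It does not use pegreg's API; the combinator names `lit`, `seq`, `choice`, and `star` are hypothetical and exist only for this illustration.

```lua
-- Each parser is a function (subject, position) -> next position, or nil on failure.
local function lit(text)
  return function(s, i)
    if s:sub(i, i + #text - 1) == text then return i + #text end
    return nil
  end
end

local function seq(a, b)
  return function(s, i)
    local j = a(s, i)
    if j == nil then return nil end
    return b(s, j)
  end
end

-- Possessive ordered choice A/B: once A succeeds, B is never tried,
-- even if whatever follows the choice fails later.
local function choice(a, b)
  return function(s, i)
    local j = a(s, i)
    if j ~= nil then return j end
    return b(s, i)
  end
end

-- Possessive repetition A*: consume as many A as possible and give nothing back.
local function star(a)
  return function(s, i)
    while true do
      local j = a(s, i)
      if j == nil or j == i then return i end
      i = j
    end
  end
end

-- ("a" / "ab") "c": a backtracking regex would match "abc"; the possessive version does not.
local a_or_ab_then_c = seq(choice(lit("a"), lit("ab")), lit("c"))
-- "a"* "a": the star swallows every "a", so the trailing "a" can never match.
local as_then_a = seq(star(lit("a")), lit("a"))
```

Here `a_or_ab_then_c("abc", 1)` fails because the first alternative already matched `"a"` and the second is never revisited, while `a_or_ab_then_c("ac", 1)` succeeds. The roadmap items above aim at the same behavior inside an NFA by removing the arrows that a committed choice would never take.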

2. Matching subexpressions
This library intends to use Danny Dubé and Marc Feeley's method of extracting
matches from DFAs, described in these two papers:
- [Efficiently building a parse tree from a regular expression](https://www.iro.umontreal.ca/~feeley/papers/DubeFeeleyACTAINFORMATICA00.pdf)
- [Automatic construction of parse trees for lexemes](http://www.schemeworkshop.org/2006/14-dube.pdf)

and implemented in [SILex](https://code.call-cc.org/svn/chicken-eggs/release/5/silex/trunk/silex.scm).

To do this, we need:

- [ ] A backend that can accept `push`, `snoc`, and `sel`, instead of DFSTs operating from char to char
- [ ] A frontend that emits those instructions instead of DFSTs from char to char
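
As a rough idea of what such a backend might look like, the sketch below interprets a flat list of `push`, `snoc`, and `sel` instructions on a stack. The semantics are an assumption based on one reading of the papers above (`push` pushes a constant, `snoc` appends the top of the stack to the list beneath it, `sel` tags the top of the stack with the index of the alternative taken). The instruction encoding and the name `run_tree_program` are hypothetical and are not part of pegreg or fst-fast.

```lua
-- Hypothetical instruction encoding (an assumption for illustration):
--   { op = "push", value = v }   push the constant v
--   { op = "snoc" }              pop an item, pop a list, push the list with the item appended
--   { op = "sel", index = n }    pop an item, push it tagged with alternative number n
local function run_tree_program(program)
  local stack = {}
  for _, ins in ipairs(program) do
    if ins.op == "push" then
      stack[#stack + 1] = ins.value
    elseif ins.op == "snoc" then
      local item = table.remove(stack)
      local list = table.remove(stack)
      -- Copy the list so that pushed constants are never mutated.
      local extended = {}
      for i, v in ipairs(list) do extended[i] = v end
      extended[#extended + 1] = item
      stack[#stack + 1] = extended
    elseif ins.op == "sel" then
      local item = table.remove(stack)
      stack[#stack + 1] = { sel = ins.index, value = item }
    else
      error("unknown instruction: " .. tostring(ins.op))
    end
  end
  return stack
end

-- Instructions a frontend might emit while matching "ab" against ("a" / "b")*.
local program = {
  { op = "push", value = {} },   -- empty list of repetitions
  { op = "push", value = "a" },
  { op = "sel", index = 1 },     -- first alternative matched
  { op = "snoc" },
  { op = "push", value = "b" },
  { op = "sel", index = 2 },     -- second alternative matched
  { op = "snoc" },
}
local stack = run_tree_program(program)
```

After this run the stack holds a single list with two entries, one per repetition, each recording which alternative matched and the character it consumed. In the real design the instructions would be attached to automaton transitions rather than collected in a flat list.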
4 changes: 4 additions & 0 deletions src/pegreg/interpreters/reify.lua
@@ -76,6 +76,10 @@ local function nub(lst)
end


--- Create an NFA from states and arrows
--- @param states any a list of states
--- @param arrows any a list of arrows
--- @return NFA
function reify.create(states, arrows)
-- First, nub and sort
table.sort(states, function (a, b) return a.number < b.number end)