public
Description: A system for creating fast, reusable parsers
Homepage: http://www.reverberate.org/gazelle/
Clone URL: git://github.com/haberman/gazelle.git
gazelle / ReleaseNotes
100644 98 lines (78 sloc) 4.65 kb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
 
Gazelle 0.3, released November 30, 2008 =======================================
 
  Overview: This release represents a lot of work on Gazelle's grammar analysis
  and lookahead generation. What this means for users is that many many cases
  where Gazelle would previously hang or generate incorrect grammar have been
  fixed, and there are lots of unit tests to ensure these don't break in the
  future. There is also one new feature: explicit ambiguity resolution.
 
  Core Parsing Algorithm:
  * Gazelle now detects many errors in grammars:
    - left-recursion
    - rules that have no non-recursive alternative
    - grammars that are probably not LL(*) (using a heuristic)
  * Gazelle now supports many LL(*) grammars (that have cyclic GLA's) in
    addition to LL(k). This puts us on par with ANTLR.
  * It is no longer possible to make Gazelle hang by feeding it the wrong
    grammar. Since the problem of whether a grammar is LL(*) is undecidable
    in general, Gazelle gives you two options for how to bound the amount of
    work. You can either use a heuristic to detect cases where the grammar
    is probably not LL(*), or you can specify a bound on k.
  * The generated lookahead can now take EOF into account, which allows us to
    support grammars where we don't know what alternative to take until we see
    EOF.
 
  New Features:
  * Grammar files now support C-style comments: // and /* */.
  * Explicit ambiguity resolution allows grammar authors to specify priorities
    between alternatives that would otherwise be ambiguous. The need for this
    arises in cases like the famous "if-then-else" problem, which is inherently
    ambigous but has a rule for how this ambiguity should be resolved.
 
  Backward Incompatibilities:
  * All regular expressions that are defined within rules (as opposed to being
    named beforehand) must now be named within the rule. For example:
 
      a -> /foo/;
 
    ...is no longer allowed, you must write:
 
      a -> .foo=/foo/;
 
    It's important that regular expressions get a reasonable name. Also,
    unnamed regular expressions conflict with the new syntax for prioritized
    choice.
 
  Caveats:
  * Although the compiler supports EOF in lookahead, proper support has not
    been added to the runtime yet.
  * There's still no way to ignore whitespace yet. This will hopefully come
    in the next release.
 
Gazelle 0.2, released June 29, 2008 ===========================================
 
  Overview: This is a major overhaul of Gazelle's guts. It also has
  significant usability improvements in its documentation and command-line
  tools. It is the first release of Gazelle I would actually recommend
  that people try out.
 
  Core Parsing Algorithm:
  * The compiler now calculates true Strong-LL(k) lookahead. As a result,
    it supports far more grammars than Gazelle 0.1 did.
  * The runtime component now represents all of its state explicitly in its
    stack. As a result, parsing can be interrupted and resumed on arbitrary
    byte boundaries.
  * The "ignore" feature was completely removed, because experience and deeper
    thought revealed that it was not the right abstraction. As a result, this
    version of Gazelle has no way to deal with arbitrary whitespace. A
    replacement feature will come in the future to fill this gap.
 
  Usability:
  * There is a manual (though very much in-progress).
  * The compiler is easier to invoke and has useful --help.
  * The compiler is capable of dumping a visual representation of the grammar
    to HTML if graphviz is installed.
  * gzlparse can dump the parse tree into JSON format.
 
  Caveats (all of these will be fixed sooner or later):
  * The language and all APIs are still very much subject to change.
  * The current release cannot handle whitespace/comments/etc. in languages.
  * Parse errors will exit the program, instead of being reported to the API.
  * There are lots of edge cases the compiler doesn't properly deal with yet.
    Some will make the compiler go into an infinite loop, others will cause
    it to generate incorrect output.
 
Gazelle 0.1, released December 16, 2007 =======================================
 
  Overview: This is a preliminary release of Gazelle, and is not very usable.
  It can parse only a very small class of languages, and all tools/APIs are
  very rough around the edges.
 
  Core Parsing Algorithm:
  * The supported lookahead is weak: only Simple-LL(1) parsing.
  * The runtime is reasonably fast and has a very low memory footprint.
  * The compiler and runtime support loading and storing the compiled version
    of the grammar to bytecode.