Gazelle 0.4, released January 21, 2009
=========================================
Overview: This release addresses all of the most pressing issues that were
preventing Gazelle from being usable. The biggest gap in Gazelle's
capabilities was a lack of whitespace handling; this feature has been added
in this release. A number of other capabilities have been added; notably, the
API can now return parse errors gracefully to the client application instead
of doing assert(false) or exit(1).
New Features:
* Gazelle can now handle languages with whitespace that can appear in many
places in the grammar. You create a terse specification of where the
whitespace can appear using the @allow command. The whitespace becomes
part of the parse tree, and therefore a round-trip to and from the parse
tree is lossless.
* Gazelle can return parse errors to the client application.
* Line/column information is now available, in addition to the raw byte
offset that was previously available. This makes it easier to print
useful messages for things like parse errors.
* There is support for resource limits on the parse stack (memory) and the
lookahead buffer (memory). Input that attempts to exceed these limits
will cause Gazelle to report the resource overage to the client.
* The runtime properly supports both hard EOF (grammar reaches a state where
it cannot accept any more characters) and soft EOF (the input reaches EOF,
and the grammar was in a state where EOF was valid).
* There is a buffering layer that can read a file sequentially but preserve
data for currently unfinished terminals.
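
To illustrate how line/column information relates to the raw byte offset, here is a minimal sketch. It is not Gazelle's API: the class name PositionTracker and its feed method are hypothetical, and it assumes lines end with "\n" and that columns count bytes rather than characters.

```python
class PositionTracker:
    """Hypothetical helper (not part of Gazelle): tracks the byte offset
    together with 1-based line and column numbers."""

    def __init__(self):
        self.offset = 0   # total bytes consumed so far
        self.line = 1     # 1-based line number
        self.column = 1   # 1-based column, counted in bytes

    def feed(self, chunk: bytes) -> None:
        """Advance the position over one chunk of input."""
        for byte in chunk:
            self.offset += 1
            if byte == 0x0A:  # "\n" starts a new line
                self.line += 1
                self.column = 1
            else:
                self.column += 1


# Input may arrive in arbitrary pieces; the position carries over.
tracker = PositionTracker()
tracker.feed(b"a -> b;\nc ->")
tracker.feed(b" d;")
```

Because the position is ordinary state carried between calls, an error message can report "line 2, column 8" regardless of how the input was chunked.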
Packaging:
* There is now a "make install" target. This will install:
  - gzlc (the Gazelle compiler) into $PREFIX/bin, in a standalone binary that
    has Lua linked and all the Lua source embedded. This means the binary
    isn't dependent on being able to find a bunch of Lua source files via
    LUA_PATH.
  - gzlparse into $PREFIX/bin
  - headers into $PREFIX/include
  - libraries into $PREFIX/lib
* The header files are now in a gazelle/ directory, so that they can be
sanely installed in /usr/include on a UNIX system.
* The C API now has its types, functions, and constants prefixed with gzl_
or GZL_, to keep from polluting the global namespace with types like
"terminal".
Gazelle 0.3, released November 30, 2008
=======================================
Overview: This release represents a lot of work on Gazelle's grammar analysis
and lookahead generation. What this means for users is that many cases
where Gazelle would previously hang or generate incorrect grammar have been
fixed, and there are lots of unit tests to ensure these don't break in the
future. There is also one new feature: explicit ambiguity resolution.
Core Parsing Algorithm:
* Gazelle now detects many errors in grammars:
  - left-recursion
  - rules that have no non-recursive alternative
  - grammars that are probably not LL(*) (using a heuristic)
* Gazelle now supports many LL(*) grammars (that have cyclic GLAs) in
addition to LL(k). This puts us on par with ANTLR.
* It is no longer possible to make Gazelle hang by feeding it the wrong
grammar. Since the problem of whether a grammar is LL(*) is undecidable
in general, Gazelle gives you two options for how to bound the amount of
work. You can either use a heuristic to detect cases where the grammar
is probably not LL(*), or you can specify a bound on k.
* The generated lookahead can now take EOF into account, which allows us to
support grammars where we don't know what alternative to take until we see
EOF.
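
As an illustration of the left-recursion check listed above, here is a minimal sketch. It is not Gazelle's implementation: the function name and the dictionary-based grammar representation are hypothetical, and it makes the simplifying assumption that no rule can match the empty string, so only the first symbol of each alternative matters.

```python
def left_recursive_rules(grammar):
    """Return the rules that can reach themselves through leftmost symbols.

    `grammar` maps each rule name to a list of alternatives, and each
    alternative is a list of symbols. Assumption: no rule is nullable.
    """
    # Edge rule -> other rule when the other rule starts an alternative.
    first_edges = {
        rule: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for rule, alts in grammar.items()
    }

    def reaches_itself(start):
        # Iterative depth-first search along the leftmost-symbol edges.
        seen, stack = set(), list(first_edges[start])
        while stack:
            rule = stack.pop()
            if rule == start:
                return True
            if rule not in seen:
                seen.add(rule)
                stack.extend(first_edges[rule])
        return False

    return {rule for rule in grammar if reaches_itself(rule)}


example = {
    "expr": [["expr", "+", "term"], ["term"]],  # direct left recursion
    "term": [["NUM"]],
    "a": [["b", "x"]],                          # indirect: a -> b -> a
    "b": [["a", "y"], ["z"]],
}
result = left_recursive_rules(example)
```

In this example, expr is directly left-recursive and a and b are left-recursive through each other, while term is not. A real checker would also have to account for rules that can match empty input.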
New Features:
* Grammar files now support C-style comments: // and /* */.
* Explicit ambiguity resolution allows grammar authors to specify priorities
between alternatives that would otherwise be ambiguous. The need for this
arises in cases like the famous "if-then-else" problem, which is inherently
ambiguous but has a rule for how this ambiguity should be resolved.
Backward Incompatibilities:
* All regular expressions that are defined within rules (as opposed to being
named beforehand) must now be named within the rule. For example:
    a -> /foo/;
...is no longer allowed; you must write:
    a -> .foo=/foo/;
It's important that regular expressions get a reasonable name. Also,
unnamed regular expressions conflict with the new syntax for prioritized
choice.
Caveats:
* Although the compiler supports EOF in lookahead, proper support has not
been added to the runtime yet.
* There's still no way to ignore whitespace. This will hopefully come
in the next release.
Gazelle 0.2, released June 29, 2008
===========================================
Overview: This is a major overhaul of Gazelle's guts. It also has
significant usability improvements in its documentation and command-line
tools. It is the first release of Gazelle I would actually recommend
that people try out.
Core Parsing Algorithm:
* The compiler now calculates true Strong-LL(k) lookahead. As a result,
it supports far more grammars than Gazelle 0.1 did.
* The runtime component now represents all of its state explicitly in its
stack. As a result, parsing can be interrupted and resumed on arbitrary
byte boundaries.
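
To illustrate why keeping all state in an explicit stack lets parsing stop and resume on any byte boundary, here is a minimal sketch. It is not Gazelle's runtime: BracketParser is a hypothetical toy that only checks bracket nesting, chosen because its entire state is a stack and an offset.

```python
class BracketParser:
    """Hypothetical toy parser: all state lives in an explicit stack, so
    input can be fed in pieces split at any byte boundary."""

    PAIRS = {ord(")"): ord("("), ord("]"): ord("[")}
    OPENERS = set(PAIRS.values())

    def __init__(self):
        self.stack = []   # currently open brackets, innermost last
        self.offset = 0   # bytes consumed so far

    def feed(self, chunk: bytes) -> None:
        """Consume one chunk; nothing is kept on the call stack afterwards."""
        for byte in chunk:
            if byte in self.OPENERS:
                self.stack.append(byte)
            elif byte in self.PAIRS:
                if not self.stack or self.stack.pop() != self.PAIRS[byte]:
                    raise ValueError(f"unexpected byte at offset {self.offset}")
            self.offset += 1

    def finish(self) -> bool:
        """Return True if end of input is valid here (nothing left open)."""
        return not self.stack


# Feeding the input whole or in arbitrary pieces gives the same result.
whole = BracketParser()
whole.feed(b"([()])")

split = BracketParser()
for piece in (b"([", b"(", b")])"):
    split.feed(piece)
```

A recursive-descent parser would hold this nesting information in the host language's call stack, which cannot be suspended between chunks without extra machinery.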
* The "ignore" feature was completely removed, because experience and deeper
thought revealed that it was not the right abstraction. As a result, this
version of Gazelle has no way to deal with arbitrary whitespace. A
replacement feature will come in the future to fill this gap.
Usability:
* There is a manual (though very much in-progress).
* The compiler is easier to invoke and has useful --help.
* The compiler is capable of dumping a visual representation of the grammar
to HTML if graphviz is installed.
* gzlparse can dump the parse tree into JSON format.
Caveats (all of these will be fixed sooner or later):
* The language and all APIs are still very much subject to change.
* The current release cannot handle whitespace/comments/etc. in languages.
* Parse errors will exit the program, instead of being reported to the API.
* There are lots of edge cases the compiler doesn't properly deal with yet.
Some will make the compiler go into an infinite loop, others will cause
it to generate incorrect output.
Gazelle 0.1, released December 16, 2007
=======================================
Overview: This is a preliminary release of Gazelle, and is not very usable.
It can parse only a very small class of languages, and all tools/APIs are
very rough around the edges.
Core Parsing Algorithm:
* The supported lookahead is weak: only Simple-LL(1) parsing.
* The runtime is reasonably fast and has a very low memory footprint.
* The compiler and runtime support loading and storing the compiled version
of the grammar to bytecode.