More release notes, set today as the release day, cleanups, fixes, and documentation for 0.4!
1 parent c10143b commit c37c1072dea21594720fb92e4a9531e7d6299539 Joshua Haberman committed Jan 21, 2009
Showing with 63 additions and 30 deletions.
  1. +9 −1 ReleaseNotes
  2. +0 −1 compiler/grammar.lua
  3. +1 −1 compiler/gzlc
  4. +42 −18 docs/manual.txt
  5. +11 −9 runtime/include/gazelle/parse.h
ReleaseNotes
@@ -27,7 +27,15 @@ Gazelle 0.4, released January XX, 2009 =========================================
* There is a buffering layer that can read a file sequentially but preserve
data for currently unfinished terminals.
- C API packaging:
+ Packaging:
+ * There is now a "make install" target. This will install:
+ - gzlc (the Gazelle compiler) into $PREFIX/bin, in a standalone binary that
+ has Lua linked and all the Lua source embedded. This means the binary
+ isn't dependent on being able to find a bunch of Lua source files via
+ LUA_PATH.
+ - gzlparse into $PREFIX/bin
+ - headers into $PREFIX/include
+ - libraries into $PREFIX/lib
* The header files are now in a gazelle/ directory, so that they can be
sanely installed in /usr/include on a UNIX system.
* The C API now has its types, functions, and constants prefixed with gzl_
compiler/grammar.lua
@@ -64,7 +64,6 @@ function Grammar:add_allow(what_to_allow, start_nonterm, end_nonterms)
local children_func = function(rule_name)
if not end_nonterms:contains(rule_name) then
- print(rule_name)
local rtn = self.rtns:get(rule_name)
if not rtn then
error(string.format("Error computing ignore: rule %s does not exist", rule_name))
compiler/gzlc
@@ -19,7 +19,7 @@ require "ll"
require "pp"
-version = "Gazelle v0.3"
+version = "Gazelle v0.4"
usage = string.format([[
gzlc -- Gazelle grammar compiler.
%s http://www.reverberate.org/gazelle/
docs/manual.txt
@@ -1,7 +1,7 @@
Gazelle Manual
==============
Joshua Haberman <joshua@reverberate.org>
-v0.3, October 2008
+v0.4, January 2009
:toc:
Gazelle is a system for parsing formal languages. A formal language is any
@@ -39,8 +39,8 @@ An Introductory Tour
--------------------
This section offers a quick tour of Gazelle and its capabilities. It assumes
-you have already compiled Gazelle successfully; the instructions for doing so
-are in the README.
+you have already compiled and installed Gazelle successfully; the instructions
+for doing so are in the README.
Hello, World
~~~~~~~~~~~~
@@ -79,8 +79,7 @@ finite automata (state machines) that can do the real parsing. It writes
this compiled version of the grammar to bytecode.
------------------------------------------------
-$ . lua_path
-$ ./compiler/gzlc --help
+$ gzlc --help
gzlc -- Gazelle grammar compiler.
Gazelle v0.3 http://www.reverberate.org/gazelle/
@@ -108,7 +107,7 @@ Usage: gzlc [options] input-file
output statistics.
--version dump Gazelle version
-$ ./compiler/gzlc hello.gzl
+$ gzlc hello.gzl
$
------------------------------------------------
@@ -125,7 +124,7 @@ Tallis:gazelle joshua$ ls -l hello*
If you want to see a bit more verbose output, run `gzlc` with `-v`.
------------------------------------------------
-$ ./compiler/gzlc -v hello.gzl
+$ gzlc -v hello.gzl
Gazelle v0.3
Opening input file 'hello.gzl'...
Parsing grammar...
@@ -147,7 +146,7 @@ valid input text for this language, then use `gzlparse` to actually parse it.
------------------------------------------------
$ echo -n '((((500))))' > hello_text
-$ ./utilities/gzlparse --help
+$ gzlparse --help
gzlparse -- A command-line tool for parsing input text.
Gazelle 0.3 http://www.reverberate.org/gazelle/.
@@ -158,7 +157,7 @@ Input file can be '-' for stdin.
--dump-total When parsing finishes, print the number of bytes parsed.
--help You're looking at it.
-$ ./utilities/gzlparse hello.gzc hello_text
+$ gzlparse hello.gzc hello_text
$
------------------------------------------------
@@ -168,7 +167,7 @@ If you want to see more output, the flags `--dump-json` and `--dump-total` will
print a parse tree in JSON format and a total byte count, respectively.
------------------------------------------------
-$ ./utilities/gzlparse --dump-json --dump-total hello.gzc hello_text
+$ gzlparse --dump-json --dump-total hello.gzc hello_text
{"parse_tree":
{"rule":"hello", "start": 0, "children": [
{"terminal": "(", "slotname": "(", "slotnum": 0, "offset": 0, "len": 1},
@@ -204,7 +203,7 @@ Doing this HTML dump requires that Graphviz is installed. If ImageMagick is
installed, this will also be used to create thumbnails.
------------------------------------------------
-$ ./compiler/gzlc -d hello.gzl
+$ gzlc -d hello.gzl
------------------------------------------------
Now open `html/index.html` in a web browser to view the HTML dump. You will
@@ -572,6 +571,31 @@ in the grammar is the start symbol. The syntax of this command is simply:
`@start` is not required, and may not appear more than once per grammar.
+`@allow`
+^^^^^^^^
+
+The `@allow` command lets you tell Gazelle about syntax constructs that can
+appear in many different places throughout your grammar. Its primary use case
+is whitespace -- it lets you say that whitespace can appear between any two
+components of certain rules in your grammar. Its syntax is:
+
+-------------------------------------------
+@allow nonterm_to_allow in start_nonterm...end_nonterm[, other_end_nonterm]*;
+-------------------------------------------
+
+For example:
+
+-------------------------------------------
+@allow whitespace in program...string, number;
+-------------------------------------------
+
+`start_nonterm` is very likely to be the top-level rule in your grammar -- the
+rule you specified to `@start`. `end_nonterm` specifies what rule or rules
+do 'not' allow whitespace (their child rules are automatically excluded also).
+All rules that are sub-rules of `start_nonterm` (either directly or indirectly),
+but are 'not' sub-rules of `end_nonterm`, will allow `nonterm_to_allow` in
+between any two components of the rule.
+
[[X3]]
Ambiguity Resolution
~~~~~~~~~~~~~~~~~~~~
@@ -895,13 +919,13 @@ them will be in future versions of the manual.
The C Runtime
-----------
-This section will document the API for the parsing runtime, and explained
-how it can be used for event-based parsing, syntax trees, ASTs, and
-whitespace-preserving transformations. It will also discuss best
-practices for wrapping the C runtime in other languages. It will
-'not' offer documentation about each language's wrappers -- I think
-that belongs in separate manuals, but I may change my mind about that.
-
+The C Runtime is documented in the header files in `runtime/include/gazelle`.
+You can browse them online at GitHub:
+http://github.com/haberman/gazelle/tree/v0.4/runtime/include/gazelle[the
+header files for the latest version of Gazelle]. For an example of
+how to use the runtime, see `utilities/gzlparse.c`,
+http://github.com/haberman/gazelle/tree/v0.4/utilities/gzlparse.c[which
+you can also view online at GitHub].
The Gazelle Algorithm
---------------------
runtime/include/gazelle/parse.h
@@ -21,7 +21,7 @@
#include "gazelle/dynarray.h"
#include "gazelle/grammar.h"
-#define GAZELLE_VERSION "0.3"
+#define GAZELLE_VERSION "0.4"
#define GAZELLE_WEBPAGE "http://www.reverberate.org/gazelle/"
#ifdef __cplusplus
@@ -194,24 +194,26 @@ struct gzl_parse_state
* input file or stream at offset s->offset.
*
* Return values:
- * - GZL_PARSE_STATUS_OK: the entire buffer has been consumed successfully,
- * and "state" represents the state of the parse as of the last byte of the
+ * - GZL_STATUS_OK: the entire buffer has been consumed successfully, and
+ * "state" represents the state of the parse as of the last byte of the
* buffer. You may continue parsing this file by calling gzl_parse() again
* with more data, or you may call gzl_finish_parse() if the input has
* reached EOF.
- * - GZL_PARSE_STATUS_ERROR: there was a parse error in the input. The parse
- * state is as it immediately before the erroneous character or token was
+ * - GZL_STATUS_ERROR: there was a parse error in the input. The parse state
+ * is as it was immediately before the erroneous character or token was
* encountered, and can therefore be used again if desired to continue the
* parse from that point. state->offset will reflect how far the parse
* proceeded before encountering the error.
- * - GZL_PARSE_STATUS_CANCELLED: a callback that was called inside of
- * gzl_parse() requested that parsing halt. state is now invalid (this may
- * change for the better in the future).
- * - GZL_PARSE_STATUS_EOF: all or part of the buffer was parsed successfully,
+ * - GZL_STATUS_CANCELLED: a callback that was called inside of gzl_parse()
+ * requested that parsing halt. state is now invalid (this may change for
+ * the better in the future).
+ * - GZL_STATUS_HARD_EOF: all or part of the buffer was parsed successfully,
* but a state was reached where no more characters could be accepted
* according to the grammar. state->offset reflects how many characters
* were read before parsing reached this state. The client should call
* gzl_finish_parse() if it wants to receive final callbacks.
+ * - GZL_STATUS_RESOURCE_LIMIT_EXCEEDED: a resource limit like maximum stack
+ * depth or maximum lookahead limit was exceeded.
*/
enum gzl_status {
GZL_STATUS_OK,
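
The return values documented in this hunk suggest a simple dispatch in calling code. The sketch below is not taken from the Gazelle sources. The enum is a local stand-in that reuses the status names from the comment (the real definition lives in `gazelle/parse.h`, and the order of the constants after `GZL_STATUS_OK` is an assumption), and `enum next_step` and `next_step_for()` are hypothetical names introduced only for illustration. It shows which follow-up each status calls for according to the comment, nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the enum in gazelle/parse.h. The names come from the
 * documentation comment; the ordering after GZL_STATUS_OK is assumed.
 * Drop this definition when including the real header. */
enum gzl_status {
  GZL_STATUS_OK,
  GZL_STATUS_ERROR,
  GZL_STATUS_CANCELLED,
  GZL_STATUS_HARD_EOF,
  GZL_STATUS_RESOURCE_LIMIT_EXCEEDED
};

/* Hypothetical: what the caller should do after gzl_parse() returns. */
enum next_step {
  STEP_FEED_MORE,     /* call gzl_parse() again with the next buffer */
  STEP_FINISH,        /* call gzl_finish_parse() for final callbacks */
  STEP_REPORT_ERROR,  /* parse error; state->offset says how far we got */
  STEP_ABANDON        /* stop and do not reuse the parse state */
};

/* Hypothetical helper: map a status (plus whether the input is exhausted)
 * to the follow-up described in the header comment. */
enum next_step next_step_for(enum gzl_status status, bool input_at_eof)
{
  switch (status) {
    case GZL_STATUS_OK:
      /* Whole buffer consumed: continue, or finish if input hit EOF. */
      return input_at_eof ? STEP_FINISH : STEP_FEED_MORE;
    case GZL_STATUS_ERROR:
      /* State is still valid as of just before the bad token. */
      return STEP_REPORT_ERROR;
    case GZL_STATUS_CANCELLED:
      /* A callback asked to halt; the comment says state is invalid. */
      return STEP_ABANDON;
    case GZL_STATUS_HARD_EOF:
      /* Grammar accepts no more characters; finish to get callbacks. */
      return STEP_FINISH;
    case GZL_STATUS_RESOURCE_LIMIT_EXCEEDED:
      /* Assumption: the comment does not say whether state is reusable,
       * so this sketch conservatively stops. */
      return STEP_ABANDON;
  }
  assert(0 && "unknown gzl_status value");
  return STEP_ABANDON;
}
```

A caller would typically run this after each `gzl_parse()` call inside its read loop and leave the loop on anything other than `STEP_FEED_MORE`. Treating the resource-limit status as fatal is a conservative choice of this sketch, not documented behavior.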
