Updated docs (not yet complete)

1 parent 86aea40 commit d31ff7468e9f4caf8b004d15a30be007a18aae78 @mjackson committed Jun 9, 2010
Showing with 182 additions and 18 deletions.
  1. +70 −9 README
  2. +4 −5 doc/background.rdoc
  3. +1 −3 doc/index.rdoc
  4. +1 −1 doc/license.rdoc
  5. +106 −0 doc/usage.rdoc
README
@@ -49,7 +49,7 @@ you can write powerful parsers that are simple to understand and easy to create.
In Citrus, there are three main types of objects: rules, grammars, and matches.
-=== Rules
+== Rules
Rules are objects that specify some matching behavior on a string. There are
two types of rules: terminals and non-terminals. Terminals can be either Ruby
@@ -68,7 +68,7 @@ Rule objects may also have semantic information associated with them in the form
of Ruby modules. These modules contain methods that will be used to extend any
matches created by the rule with which they are associated.
-=== Grammars
+== Grammars
A grammar is a container for rules. Usually the rules in a grammar collectively
form a complete specification for some language, or a well-defined subset
@@ -78,7 +78,7 @@ normally used to create more complex grammars. Any grammar rule with the same
name as a rule in an included grammar may access that rule with a mechanism
similar to Ruby's +super+ keyword.
-=== Matches
+== Matches
Matches are created by Rule objects when they match on the input. Matches
contain the string of text that made up the match as well as its offset in the
@@ -102,8 +102,72 @@ and any submatches.
The most straightforward way to compose a Citrus grammar is to use Citrus' own
custom grammar syntax. The syntax borrows heavily from Ruby, so it should
-already be familiar to most Ruby programmers. Below is an example of a simple
-calculator that respects operator precedence.
+already be familiar to most Ruby programmers.
+
+== Terminals
+
+Terminals may be represented by a string or a regular expression. Both follow
+the same rules as Ruby string and regular expression literals.
+
+ 'abc'
+ "abc\n"
+ /\xFF/
+
+Character classes and the dot (match anything) symbol are supported as well for
+compatibility with other parsing expression implementations.
+
+ [a-z0-9] # match any lowercase letter or digit
+ [\x00-\xFF] # match any octet
+ . # match anything, even newlines
+
+== Repetition
+
+Quantifiers may be used with any expression to specify a number of times it must
+match. The universal form of a quantifier is N*M where N is the minimum and M is
+the maximum number of times the expression may match. The + and ? operators are
+supported as well for the common cases of 1* and *1 respectively.
+
+ 'abc'1*2 # match "abc" a minimum of one, maximum
+ # of two times
+ 'abc'1* # match "abc" at least once
+ 'abc'+ # same
+ 'abc'*1 # match "abc" a maximum of once
+ 'abc'? # same
+
+== Lookahead
+
+Both positive and negative lookahead are supported in Citrus. Use the & and !
+operators to indicate that an expression either should or should not match. In
+neither case is any input consumed.
+
+ 'a' &'b' # Match an "a" that is followed by a "b"
+ 'a' !'b' # Match an "a" that is not followed by a "b"
+ !'a' . # Match any character except for "a"
+
+== Sequences
+
+Sequences of expressions may be separated by a space to indicate that the rules
+should match in that order.
+
+ 'a' 'b' 'c' # match "a", then "b", then "c"
+ 'a' [0-9] # match "a", then a numeric digit
+
+== Choices
+
+Ordered choice is indicated by a vertical bar that separates two expressions.
+Note that any operator binds more tightly than the bar.
+
+ 'a' | 'b' # match "a" or "b"
+ 'a' 'b' | 'c' # match "a" then "b" (in sequence), or "c"
+
+== Super
+
+When including a grammar inside another, all rules in the child that have the
+same name as a rule in the parent also have access to the super keyword.
+
+== An Example
+
+Below is an example of a simple calculator that respects operator precedence.
grammar Calculator
rule additive
@@ -126,17 +190,14 @@ calculator that respects operator precedence.
Several things to note about the above example are:
* Grammar and rule declarations end with the "end" keyword
-
* Rules may refer to other rules in their own definitions by simply using the
other rule's name
-
* A sequence of rules is created by separating expressions with a space.
Likewise, ordered choice may be represented with a vertical bar
-
* Any expression may be followed by a quantifier which specifies the number
of times that expression should match
-=== Interpretation
+== Interpretation
This simple grammar is able to parse mathematical expressions such as "1+2" and
"4+5*(1+2)", but it does not yet have enough semantic information to be able to
doc/background.rdoc
@@ -1,5 +1,4 @@
-== Background
-
+= Background
In order to be able to use Citrus effectively, you must first understand the
difference between syntax and semantics. Syntax is a set of rules that govern
@@ -23,7 +22,7 @@ you can write powerful parsers that are simple to understand and easy to create.
In Citrus, there are three main types of objects: rules, grammars, and matches.
-=== Rules
+== Rules
Rules are objects that specify some matching behavior on a string. There are
two types of rules: terminals and non-terminals. Terminals can be either Ruby
@@ -42,7 +41,7 @@ Rule objects may also have semantic information associated with them in the form
of Ruby modules. These modules contain methods that will be used to extend any
matches created by the rule with which they are associated.
-=== Grammars
+== Grammars
A grammar is a container for rules. Usually the rules in a grammar collectively
form a complete specification for some language, or a well-defined subset
@@ -52,7 +51,7 @@ normally used to create more complex grammars. Any grammar rule with the same
name as a rule in an included grammar may access that rule with a mechanism
similar to Ruby's +super+ keyword.
-=== Matches
+== Matches
Matches are created by Rule objects when they match on the input. Matches
contain the string of text that made up the match as well as its offset in the
doc/index.rdoc
@@ -2,9 +2,7 @@ Citrus is a compact and powerful parsing library for Ruby that combines the
elegance and expressiveness of the language with the simplicity and power of
parsing expressions.
-
-== Installation
-
+= Installation
Via RubyGems:
doc/license.rdoc
@@ -1,4 +1,4 @@
-== License
+= License
Copyright 2010 Michael Jackson
doc/usage.rdoc
@@ -0,0 +1,106 @@
+= Usage
+
+The most straightforward way to compose a Citrus grammar is to use Citrus' own
+custom grammar syntax. The syntax borrows heavily from Ruby, so it should
+already be familiar to most Ruby programmers.
+
+== Terminals
+
+Terminals may be represented by a string or a regular expression. Both follow
+the same rules as Ruby string and regular expression literals.
+
+ 'abc'
+ "abc\n"
+ /\xFF/
+
+Character classes and the dot (match anything) symbol are supported as well for
+compatibility with other parsing expression implementations.
+
+ [a-z0-9] # match any lowercase letter or digit
+ [\x00-\xFF] # match any octet
+ . # match anything, even newlines
+
+== Repetition
+
+Quantifiers may be used with any expression to specify a number of times it must
+match. The universal form of a quantifier is N*M where N is the minimum and M is
+the maximum number of times the expression may match. The + and ? operators are
+supported as well for the common cases of 1* and *1 respectively.
+
+ 'abc'1*2 # match "abc" a minimum of one, maximum
+ # of two times
+ 'abc'1* # match "abc" at least once
+ 'abc'+ # same
+ 'abc'*1 # match "abc" a maximum of once
+ 'abc'? # same
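
To make the N*M bounds concrete, here is a minimal plain-Ruby sketch that maps them onto a Regexp interval quantifier. It is an analogy only, not how Citrus implements repetition; the helper name repeat_regexp is hypothetical, and unlike a parsing expression it checks a whole string rather than a prefix of the input.

```ruby
# Hypothetical helper (not part of Citrus): build a Regexp that accepts
# +literal+ repeated between +min+ and +max+ times, mirroring N*M.
# A nil +max+ means "no upper bound", like leaving M off.
def repeat_regexp(literal, min = 0, max = nil)
  /\A(?:#{Regexp.escape(literal)}){#{min},#{max}}\z/
end

one_or_two  = repeat_regexp('abc', 1, 2) # like 'abc'1*2
at_least_1  = repeat_regexp('abc', 1)    # like 'abc'1* and 'abc'+
at_most_1   = repeat_regexp('abc', 0, 1) # like 'abc'*1 and 'abc'?

puts one_or_two.match?('abcabc')    # true
puts one_or_two.match?('abcabcabc') # false
puts at_least_1.match?('')          # false
puts at_most_1.match?('')           # true
```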
+
+== Lookahead
+
+Both positive and negative lookahead are supported in Citrus. Use the & and !
+operators to indicate that an expression either should or should not match. In
+neither case is any input consumed.
+
+ 'a' &'b' # Match an "a" that is followed by a "b"
+ 'a' !'b' # Match an "a" that is not followed by a "b"
+ !'a' . # Match any character except for "a"
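
The claim that lookahead consumes no input can be checked with Ruby's own Regexp lookahead, which behaves the same way in this respect. The sketch below uses only the standard library's StringScanner; the method names are hypothetical and it does not use Citrus.

```ruby
require 'strscan'

# Hypothetical helper: scan an "a" only when a "b" follows it, and report
# what was consumed along with the scanner position afterwards.
def scan_a_before_b(input)
  scanner = StringScanner.new(input)
  matched = scanner.scan(/a(?=b)/) # (?=b) looks at the "b" without consuming it
  [matched, scanner.pos]
end

# Hypothetical helper: negative lookahead, the Regexp counterpart of !'a' .
def any_char_except_a?(input)
  input.match?(/\A(?!a)./m)
end

p scan_a_before_b('ab')    # ["a", 1], the "b" is still unread
p scan_a_before_b('ac')    # [nil, 0], nothing was consumed
p any_char_except_a?('b')  # true
p any_char_except_a?('a')  # false
```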
+
+== Sequences
+
+Sequences of expressions may be separated by a space to indicate that the rules
+should match in that order.
+
+ 'a' 'b' 'c' # match "a", then "b", then "c"
+ 'a' [0-9] # match "a", then a numeric digit
+
+== Choices
+
+Ordered choice is indicated by a vertical bar that separates two expressions.
+Note that any operator binds more tightly than the bar.
+
+ 'a' | 'b' # match "a" or "b"
+ 'a' 'b' | 'c' # match "a" then "b" (in sequence), or "c"
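
"Ordered" means the alternatives are tried from left to right and the first one that matches wins, even if a later one would match more input. A minimal plain-Ruby sketch of that behavior for string terminals, using a hypothetical ordered_choice helper rather than Citrus itself:

```ruby
# Hypothetical helper: return the first alternative that matches at the
# start of +input+, or nil when none of them does.
def ordered_choice(input, *alternatives)
  alternatives.find { |alternative| input.start_with?(alternative) }
end

p ordered_choice('ab', 'a', 'ab') # "a", the longer alternative is never tried
p ordered_choice('ab', 'ab', 'a') # "ab"
p ordered_choice('c', 'a', 'b')   # nil
```

Because of this, the order in which alternatives are written can change the result, so longer or more specific alternatives usually go first.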
+
+== Super
+
+When including a grammar inside another, all rules in the child that have the
+same name as a rule in the parent also have access to the super keyword.
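
For readers less familiar with the Ruby feature this is modeled on, the sketch below shows super with an included module in plain Ruby. It is only an analogy: the module, class, and method names are hypothetical, and it does not show Citrus grammar syntax.

```ruby
# Plays the role of the included (parent) grammar.
module UnsignedNumbers
  def number
    '[0-9]+'
  end
end

# Plays the role of the including (child) grammar. Its +number+ has the same
# name as the included one, so it can reach the original through super.
class SignedNumbers
  include UnsignedNumbers

  def number
    '-?' + super
  end
end

puts SignedNumbers.new.number # prints -?[0-9]+
```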
+
+== An Example
+
+Below is an example of a simple calculator that respects operator precedence.
+
+  grammar Calculator
+    rule additive
+      multiplicative '+' additive | multiplicative
+    end
+
+    rule multiplicative
+      primary '*' multiplicative | primary
+    end
+
+    rule primary
+      '(' additive ')' | number
+    end
+
+    rule number
+      [0-9]+
+    end
+  end
+
+Several things to note about the above example are:
+
+ * Grammar and rule declarations end with the "end" keyword
+ * Rules may refer to other rules in their own definitions by simply using the
+ other rule's name
+ * A sequence of rules is created by separating expressions with a space.
+ Likewise, ordered choice may be represented with a vertical bar
+ * Any expression may be followed by a quantifier which specifies the number
+ of times that expression should match
+
+== Interpretation
+
+This simple grammar is able to parse mathematical expressions such as "1+2" and
+"4+5*(1+2)", but it does not yet have enough semantic information to be able to
+actually interpret these expressions.
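
To illustrate what "interpreting" adds on top of parsing, here is a hand-written evaluator in plain Ruby with one method per rule of the Calculator grammar. It is a sketch under stated assumptions, not Citrus's mechanism (the documentation describes attaching Ruby modules to rules for that purpose); the class name CalculatorSketch and its methods are hypothetical.

```ruby
require 'strscan'

# Hand-written sketch: each private method mirrors one grammar rule and
# returns the integer value of what it matched, or nil when it fails.
class CalculatorSketch
  def initialize(input)
    @scanner = StringScanner.new(input)
  end

  # Returns the value of the expression, or nil unless all input is consumed.
  def evaluate
    value = additive
    value if value && @scanner.eos?
  end

  private

  # additive: multiplicative '+' additive | multiplicative
  def additive
    left = multiplicative
    return nil if left.nil?

    start = @scanner.pos
    if @scanner.scan(/\+/) && (right = additive)
      left + right
    else
      @scanner.pos = start # backtrack, fall through to the second alternative
      left
    end
  end

  # multiplicative: primary '*' multiplicative | primary
  def multiplicative
    left = primary
    return nil if left.nil?

    start = @scanner.pos
    if @scanner.scan(/\*/) && (right = multiplicative)
      left * right
    else
      @scanner.pos = start
      left
    end
  end

  # primary: '(' additive ')' | number
  def primary
    start = @scanner.pos
    if @scanner.scan(/\(/) && (value = additive) && @scanner.scan(/\)/)
      value
    else
      @scanner.pos = start
      number
    end
  end

  # number: [0-9]+
  def number
    @scanner.scan(/[0-9]+/)&.to_i
  end
end

puts CalculatorSketch.new('1+2').evaluate       # prints 3
puts CalculatorSketch.new('4+5*(1+2)').evaluate # prints 19
```

Operator precedence falls out of the rule nesting: additive is built from multiplicative, so products are evaluated before sums.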
+
+
