Autumn is a library for building recursive descent parsers using combinators. It provides a set of basic parsers for parsing individual characters and common groupings of characters, primitives to collect parsed values, and combinators for combining and modifying parsers.
A parser is an item that implements the Parser
trait. The most simple
way to implement the trait is with a function given the following signature:
fn parser(source: &str, location: Span) -> ParseResult<String> {
/* ... */
}
This is a function that takes an input string slice its location in some original source text
and will parse a String
value starting at that location.
The following is an example that implements a parser of C-like identifiers. These begin with an ASCII letter or underscore and are followed by any number of ASCII digits, letters, or underscores.
/// Parse a single alphabetic character or underscore
fn identifier_prefix(source: &str, location: Span) -> ParseResult<Span> {
alphabetic.or("_").parse(source, location)
}
/// Parse a zero or more letters, digits, and underscores
fn identifier_body(source: &str, location: Span) -> ParseResult<Span> {
identifier_prefix.or(digit).multiple().maybe().parse(source, location)
}
/// Accumulate the characters for a single identifier into a string
fn identifier(source: &str, location: Span) -> ParseResult<String> {
identifier_prefix
.and(identifier_body)
.copy_string()
.parse(source, location)
}
As the functions above each implement the Parser
trait,
combinators can be used directly on the functions.
A number of the most common combinators are shown in the example above.
-
or
will take a parser and additionally try an alternative parser at the same location. This can produce a result from both parsers if they ar successful. -
and
will take a parser that produces aList
and append the result of another parser that produces the same type of list. -
multiple
will take a parser that produces aList
and attempt to apply that parser one or more times in succession. -
maybe
will take a parser that produces aList
and attempt to apply that parser zero or one times. When using bothmultiple
andmaybe
to achieve zero or more repetitions,multiple().maybe()
must be used;maybe().multiple()
can find an infinite number of ways to apply any parser on even an empty string. -
map
can be used to transform the type produced by a particular parser. -
end
will only produce a successful parse if there is no input remaining after the parser.
A few of the provided parsers have also been used above.
-
alphabetic
will parse a single character that is an ASCII alphabetic character and produce aList<char>
. -
digit
will parse a single character that is an ASCII digit character and produce aList<char>
. -
Any
&str
orString
can be used to parse itself and produce aList<char>
.
When invoking a parser the source must be provided as a string slice and the current position
must be provided as a Span
. An intial span can be provided for the start
of any string using new_location
but a span associated with a path to
a file should be specified using path_location
.
The ParseResult
produced by a parser contains all valid parsings
of the input string.
If an ambiguous parser is constructed, every unique manner in which that parser could be used
to process the input will be produced;
single_parse
can be used to check that only a
single valid parse was produced.
Every successful parse is encapsulated in a Meta
that associates it with
a Span
that reflects the range within the source that contains the
characters parsed for that result. The meta
combinator can be used to
obtain the location associated with a value during parsing.
A parser may also need to produce errors while parsing. The parser will discard errors that are not associated with a valid parse so some value must be associated with the error and the error will be associated with a particular location in the parse.
If there are errors associated with a ParseResult
their type must be passed as the third
type argument to the type constructor.
If any valid parse of the source produced errors
is_success
will return false
.
The following example will either parse the first five letters of the alphabet in lower-case or will produce an error associated with the alphabetic characters starting at the same location.
/// Parses the first 5 letters of the alphabet in order
fn alphabet_parse() -> impl Parser<Span, &'static str> {
"abcde"
.on_none(
alphabetic
.multiple()
.maybe()
.and_then(|text| error(text, "Not in alphabetical order"))
)
}
The on_none
combinator can be used to
provide an alternative parser that is used only if the original parser was unable to find any
valid parse (including parsers that resulted in an error).
The on_failure
combinator can be used
to provide an alternative parser that is used when the original parser produced an error or if
no valid parse was found or if a valid parse associated with an error was found.
The error
parser will produce a valid parse with the provided value
but will also produce the provided error. This allows parsing to continue and find further
errors.
The ParseResult
produced by a parser contains all errors produced
by all valid parsings of the input. Each error
will be associated
with the location in the source where the error was produced.
Exceptions can be used to associate errors with an exact span of input rather than a single
location. The throw
parser can be used exactly like the
error
parser except it creates an exception, rather than an error,
at the same location. The catch
combinator
can then be used to convert the exception into an error and extend the start of the span
associated with the error all the way back to the start of the original parse.
The following example shows how the error produced by the alphabet_parse
function above can
be associated with the span of source code that was consumed to produce the error.
/// Parses the first 5 letters of the alphabet in order
fn alphabet_parse() -> impl Parser<Span, &'static str> {
"abcde"
.on_none(
alphabetic
.multiple()
.maybe()
.and_then(|text| throw(text, "Not in alphabetical order"))
)
.catch()
}
Current version: 0.4.3
License: ISC