picoparse
"Yo dawg, I heard you like parsing…"
Picoparse is a very small parser / scanner library for Python. It is built to make constructing parsers straightforward, without the complications regular expressions bring to the table. The design is inspired by the Parsec library from Haskell.
Picoparse will lazily scan input as needed and provides smart backtracking facilities. The core library is not specific to text and will work over any iterable type.
Contents
- The picoparse package is the core of the parser library.
- picoparse.text contains a few useful tools for building text oriented parsers.
- examples/xml.py is an example implementation of a parser for a reasonable subset of xml.
- examples/calculator.py is an example implementation of an infix arithmetic parser and evaluator.
- test.py runs all the test cases found in tests
Installing
You can install the latest release with easy_install or (presumably) pip; eg:
easy_install picoparse
Alternatively, you can get the source as a tarball or with git and install with:
python setup.py install
Using Picoparse
Parsers are functions that consume input by calling smaller parsers and return a value.
run_parser is used to call your top-level parser and provide it with an input stream eg:
run_parser(my_toplevel_parser, open(input_file).read())
It is recommended that you examine examples/xml.py to see a worked example.
An important idea with Picoparse is 'specialising' an existing parser by using functools.partial to generate a new parser function. Eg, to create a parser that consumes an 'a':
a = partial(one_of, 'a') # roughly equivalent to a = lambda: one_of('a')
and to make a parser that accepts many 'a's:
many_as = partial(many, a)
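To make the specialising idea concrete without depending on an installed copy of the library, here is a minimal, self-contained sketch. The one_of, many and run_parser defined in it are toy stand-ins written for this illustration: they share names with the functions above only so the two partial lines read the same, and their bodies, the module-level state and the (result, remaining) return value are assumptions, not picoparse's implementation.

```python
from functools import partial

# NOTE: everything in this block is a toy stand-in, not picoparse's code.
# The names mirror the README so the partial(...) lines look identical;
# the bodies and the module-level state are assumptions for illustration.

class NoMatch(Exception):
    """Raised when a toy parser cannot match at the current position."""

_input = []  # items being parsed
_pos = 0     # index of the next unread item

def one_of(expected):
    """Consume one item equal to `expected`, or raise NoMatch."""
    global _pos
    if _pos < len(_input) and _input[_pos] == expected:
        _pos += 1
        return expected
    raise NoMatch("expected %r at position %d" % (expected, _pos))

def many(parser):
    """Apply `parser` repeatedly and collect results until it fails."""
    global _pos
    results = []
    while True:
        start = _pos
        try:
            results.append(parser())
        except NoMatch:
            _pos = start  # undo anything the failed attempt consumed
            return results

def run_parser(parser, items):
    """Reset the toy state, run `parser`, return (result, unread items)."""
    global _input, _pos
    _input, _pos = list(items), 0
    result = parser()
    return result, _input[_pos:]

# Specialise the general parsers into zero-argument parser functions.
a = partial(one_of, 'a')
many_as = partial(many, a)

result, remaining = run_parser(many_as, "aaab")
print(result, remaining)
```

The point of the sketch is that a and many_as take no arguments, so either one can be handed to a combinator or to run_parser exactly like a hand-written parser function.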
How do I…
Here is a glossary of parsers and combinators, grouped by outcome, to help you learn to use Picoparse. Look at the doc comments for specifics.
Match input
Matching input is done by calling primitive parsers.
- picoparse.one_of: Match one item in the input stream that is equal to the argument. Requires that the item supports ==.
- picoparse.not_one_of: Match one item that is not equal to the argument.
- picoparse.satisfies: Matches one item that satisfies a guard function.
- picoparse.any_token: Matches one of any item.
- picoparse.eof: Matches the end of input.
- picoparse.text.whitespace_char: Matches a single whitespace character.
- picoparse.text.newline: Matches a single newline character.
- picoparse.text.quote: Match one single or double quote.
Match multiple items
Matching multiple items is achieved by passing a primitive parser to a combinator function. These parsers are all greedy.
- picoparse.many: Matches a parser zero or more times.
- picoparse.many1: Matches a parser one or more times.
- picoparse.many_until: Matches a parser zero or more times until the terminating parser matches.
- picoparse.many_until1: Matches a parser one or more times until the terminating parser matches.
- picoparse.sep: Matches a parser zero or more times, with a separator being matched between each pair.
- picoparse.sep1: Matches a parser one or more times, with a separator being matched between each pair.
- picoparse.optional: Matches a parser zero or one times.
- picoparse.text.whitespace: Match zero or more whitespace characters.
- picoparse.text.whitespace1: Match one or more whitespace characters.
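As an illustration of what "zero or more times, with a separator between each pair" can mean in practice, here is a small self-contained sketch. It does not use picoparse: Stream, match_if and separated are hypothetical helpers invented for this example, parsers here take the stream as an explicit argument, and leaving a trailing separator unconsumed is a choice made by the sketch rather than documented library behaviour.

```python
from functools import partial

# Toy illustration only: none of these names come from picoparse.

class NoMatch(Exception):
    """Raised when a toy parser cannot match at the current position."""

class Stream:
    """Hypothetical input wrapper: a list of items plus a read position."""
    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

def match_if(stream, predicate):
    """Consume one item for which predicate(item) is true, or raise NoMatch."""
    if stream.pos < len(stream.items) and predicate(stream.items[stream.pos]):
        item = stream.items[stream.pos]
        stream.pos += 1
        return item
    raise NoMatch("no match at position %d" % stream.pos)

def separated(stream, parser, separator):
    """Greedily match parser (separator parser)*, allowing zero matches."""
    results = []
    start = stream.pos
    try:
        results.append(parser(stream))
    except NoMatch:
        stream.pos = start
        return results
    while True:
        start = stream.pos
        try:
            separator(stream)
            results.append(parser(stream))
        except NoMatch:
            stream.pos = start  # leave a dangling separator unread
            return results

digit = partial(match_if, predicate=str.isdigit)
comma = partial(match_if, predicate=lambda item: item == ",")

stream = Stream("1,2,3;")
digits = separated(stream, digit, comma)
print(digits, stream.pos)
```

The loop is greedy in the sense used above: it keeps taking separator and item pairs for as long as they match, and stops at the first position where they do not.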
Make a choice
Choosing between possible parsers is achieved with the choice combinator, often assisted by tri.
- picoparse.choice: Match one of the parsers passed in, trying them in order. If none match, the choice fails to match as well.
- picoparse.tri should decorate a parser that consumes multiple pieces of input. You can picoparse.commit to the choice if you know that no other choice should succeed (see the calculator and xml examples for this in use). tri will automatically commit if it reaches the end of the decorated parser.
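The reason a multi-item alternative needs backtracking can be shown with a self-contained toy. The sketch below does not import picoparse and does not attempt to reproduce its tri or commit rules: Stream, match_sequence and first_match are hypothetical names, parsers take the stream explicitly, and this toy always rewinds after a failed alternative.

```python
from functools import partial

# Toy illustration only: none of these names come from picoparse.

class NoMatch(Exception):
    """Raised when a toy parser cannot match at the current position."""

class Stream:
    """Hypothetical input wrapper: a list of items plus a read position."""
    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

def match_sequence(stream, expected):
    """Consume the items of `expected` one by one, or raise NoMatch."""
    for item in expected:
        if stream.pos >= len(stream.items) or stream.items[stream.pos] != item:
            raise NoMatch("expected %r at position %d" % (item, stream.pos))
        stream.pos += 1
    return expected

def first_match(stream, *parsers):
    """Try each parser in order, rewinding the stream after every failure."""
    start = stream.pos
    for parser in parsers:
        try:
            return parser(stream)
        except NoMatch:
            stream.pos = start  # backtrack: discard partially consumed input
    raise NoMatch("no alternative matched")

true_word = partial(match_sequence, expected="true")
trap_word = partial(match_sequence, expected="trap")

stream = Stream("trap!")
word = first_match(stream, true_word, trap_word)
print(word, stream.pos)
```

Here the first alternative consumes "tr" before failing on the third character. Without the rewind, the second alternative would start in the middle of the word and fail too, which is the situation that decorating multi-item parsers is meant to avoid.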
Match a sequence
- picoparse.string will apply one_of for each item in the iterable argument. Note that this works for any sequence of items, not just characters.
- picoparse.text.caseless_string will match the given str without regard for upper or lower case (text only).
- picoparse.cue for something ignorable that cues you to what you really want to match, eg, matching "#x" in a hex parser.
- picoparse.follow for something ignorable that follows what you really want to match, eg, matching "l" after a long integer.
- picoparse.seq for matching a specific set of (possibly named) parsers.
Matching something wrapped
- picoparse.text.lexeme: Match a parser with optional whitespace on either side.
- picoparse.text.quoted: Match a parser within single or double quote characters until a matching quote is found.
Tracking development and reporting bugs
This project is tracked with Git via GitHub, and uses Bugs Everywhere for tracking bugs.
See also
PyParsing and Pysec implement similar ideas in Python.








