Haskell-Chart-Parser

A simple chart parsing library for Haskell. The parser is generic over what it parses into, so it can produce parse trees, evaluations, anything.

Chart Parsing

Chart parsing consists of iteratively adding more and more information to a parsing chart until the chart is "completed". Chart parsing is ideal for parsing ambiguous grammars, as it can produce all possible parses.

A chart is a graph where the edges are labeled by either unparsed tokens or by parsed items. The nodes of the chart represent the inter-token positions in the input, and an edge labeled l from node i to node j represents that the input from position i to j can be parsed as l.

For example, consider the BNF grammar for ambiguous arithmetic expressions

<exp> ::= "x" | "y" | "z" | <exp> "+" <exp> | <exp> "-" <exp>

Prior to adding parse edges, the expression

x + y - z

corresponds to the chart

We parse the expression by progressively adding labels to the chart according to the grammar. For instance, we can add a label from 0 to 1 because the token x can be parsed as an <exp>:

After adding some more edges, we will have recognized that both spans x + y and y - z can be parsed as <exp>s:

This sort of ambiguity is ok in a chart parser. In this particular chart, we're labeling with just non-terminal symbols (specifically, just <exp>), so the end result will not distinguish the two parses, but a richer label type could be chosen, for instance, an AST.

When we're done adding edges to the chart, it looks like this:

Because there's an edge from the beginning of the chart to the end of the chart, we know that the whole of the input parses. In this case, it's an <exp>.

In general, chart parsers can add edges to the chart however they like. The chart parser in this library, however, uses a left-corner technique to parse the input one word at a time. We start out in a parse state with an empty chart, and a list of tokens waiting to be added to the chart.

We then read a token by adding an edge to the graph labeled by that token.

We then add to the chart all the edges we can that end after the newly added token. In this library, this is called saturating the chart.

Saturating like this ensures that the chart has as much information as possible after reading each token, so that parsing happens incrementally. After reading a few more tokens, the chart looks like this:

And finally we end up with the final chart we got before. The evolution of the parse state looks like this:

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
examples		examples
img		img
src/ChartParser		src/ChartParser
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Haskell-Chart-Parser

Chart Parsing

About

Releases

Packages

Languages

BekaValentine/Haskell-Chart-Parser

Folders and files

Latest commit

History

Repository files navigation

Haskell-Chart-Parser

Chart Parsing

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages