Skip to content
Build recursive descent parsers
Go Shell
Branch: master
Clone or download
shivamMg Minor fixes to README
Also, use chroma.Iterator's Tokens method.
Latest commit 7b55400 Dec 20, 2018
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
examples Minor fixes to README Dec 20, 2018
.gitignore Add .idea to gitignore Jul 10, 2018
LICENSE Update README Dec 19, 2018
README.md Minor fixes to README Dec 20, 2018
builder.go Update README Dec 19, 2018
builder_test.go Improve PL/0 example Nov 12, 2018
go.mod Add whitebox tests Oct 19, 2018
go.sum Add whitebox tests Oct 19, 2018
internals.go Make generic improvements Nov 10, 2018
pre-commit.sh Update README Dec 19, 2018
types.go Minor improvements + bugfix Nov 16, 2018

README.md

rd godoc License: MIT

rd is a small library to build hand-written recursive descent parsers. Besides exposing convenient methods to parse tokens it features automatic parse tree generation and flow tracing for debugging.

Recursive descent parsers can imitate their grammar quite well. For instance, for the following grammar:

A → aB
B → b

(where A and B are non-terminals, and a and b are terminals), a recursive descent parser might look like:

func A() bool {
    if nextToken() == "a" {
        return B()
    }
    pushback("a")
    return false
}

func B() bool {
    if nextToken() == "b" {
        return true
    }
    pushback("b")
    return false
}

Parser for the same grammar written using rd:

import "github.com/shivamMg/rd"

func A(b *rd.Builder) (ok bool) {
    b.Enter("A")
    defer b.Exit(&ok)

    return b.Match("a") && B(b)
}

func B(b *rd.Builder) (ok bool) {
    defer b.Enter("B").Exit(&ok)

    return b.Match("b")
}

A builder object keeps track of the current token and exposes convenient methods to write a parser. Match method, for instance, matches a token to the current token, and resets the original state in case of an unsuccessful match; there's no need for a manual pushback. As non-terminal functions are called, terminal matches, and calls to other non-terminal functions are done. A parse tree is generated in the builder object using these matches and calls:

In case of a successful match the terminal is added to parse tree under the current non-terminal (the one in which Match was called). Same goes in case of a non-terminal function call: the non-terminal, if it exits successfully, is added to parse tree under the current non-terminal. You can imagine this process being repeated recursively.

Argument to Enter is what is added to parse tree as a symbol for the non-terminal. Argument to Exit determines if function call was successful or not.

ParseTree method returns the parse tree which is, well, a tree data structure. It can be pretty-printed.

tokens := []rd.Token{"a", "b"}
b := rd.NewBuilder(tokens)
if ok := A(b); ok {
    fmt.Print(b.ParseTree())
}

The above snippet will print:

A
├─ a
└─ B
   └─ b

A debug tree is also maintained which, unlike the parse tree, contains all matches and calls (not just the successful ones). It's helpful if you want to debug a parsing failure. It can be retrieved using the DebugTree method.

fmt.Print(b.DebugTree())

The above snippet will print:

A(true)
├─ a = a
└─ B(true)
   └─ b = b

Parsing errors can be retrieved using Err method. For tokens := []rd.Token{"a", "c"} the following statements:

fmt.Println(b.Err())
fmt.Print(b.DebugTree())

will print:

parsing error
A(false)
├─ a = a
└─ B(false)
   └─ c ≠ b

Examples

Arithmetic expression parser

go get github.com/shivamMg/rd/examples/arithmetic   # requires go modules support (go1.11+)
arithmetic -expr='3.14*4*(6/3)'  # hopefully $GOPATH/bin is in $PATH
arithmetic -expr='3.14*4*(6/3)' -backtrackingparser

Parser and grammar for it can be found inside examples/arithmetic/parser. There's another parser written for a different grammar that also parses arithmetic expressions. This parser can be found inside examples/arithmetic/backtrackingparser. It uses backtracking - notice the use of b.Backtrack().

This example lexer built using chroma.

PL/0 programming language parser

go get github.com/shivamMg/rd/examples/pl0
cd examples/pl0/
pl0 square.pl0
pl0 multiply.pl0
pl0 prime.pl0

Parser and grammar can be found inside examples/pl0/parser. Grammar has been taken from en.wikipedia.org/wiki/PL/0#Grammar. It also uses lexer built using chroma.

Domain name parser

go get github.com/shivamMg/rd/examples/domainname
domainname www.google.co.uk

Grammar has been taken from www.ietf.org/rfc/rfc1035.txt. The lexer is hand-written.

Licence

MIT

Contribute

Contribute through bug fixes, improvements, new examples, etc. Lucky PR submitters get to walk home with a brand new CRT and an Audi 1987.

You can’t perform that action at this time.