This repository has been archived by the owner. It is now read-only.


joshua-choi edited this page Dec 21, 2010 · 182 revisions

Note for new people: FnParse 2 is completely useable and will remain so indefinitely. But in the next month or so, I’ll be splitting FnParse into two different, backwardly incompatible libraries, so be warned. I think, though, it’ll be worth it. Check the news page to see the updates.

FnParse is a library for creating functional parsers in the Clojure programming language. It presents an easy, functional way to create parsers from EBNF rules and was inspired by the paper Using Functional Parsing to Achieve Quality in Software Maintenance.


A parser is a function that receives a state object containing tokens, consumes the first few tokens, and creates some new, corresponding structure depending on the parser’s semantic intent. The concept of FnParse is breaking down a parser into a bunch of rules that nicely correspond to the rules in its grammar.

Rules are functions (one argument) that takes a state object (which is a hash-map by default) that contains a sequential collection of tokens.

  • If a sequence of tokens is invalid for a rule, the rule simply returns nil.
  • If a sequence of tokens is valid for a rule, the rule returns a 2-sized vector containing:
    • An object called the result’s product, the product of the tokens that the rule consumed
    • An object (usually a map) called the result’s state that includes information about the result.
      • Most importantly, the state must include a sequence of the remaining tokens. This sequence is called the result’s remainder.
      • In addition, the state can carry other information too, called the result’s info. Info could be useful for keeping track of the current line and column.

FnParse contains many functions to help facilitate creating useful rules. A function that returns a new rule is a rule maker. For more information, check the reference of FnParse 2.

States and how to call rules

What sort of state objects your parser uses is your choice. By default, it is a hash map containing a :remainder key containing the state’s remainder. If you stick with the default, you could call a parser like this: (root-rule {:remainder [\A \B \C]}).

Alternatively, you can use the rule-matcher metafunction, which provides an easy way to completely match a root rule to a state. It calls one function if the rule fails, and it calls another function if the rule fails to consume all the tokens. For more information on rule-matcher, check its docs or the reference of FnParse 2.

After you get used to FnParse for a while, for speed you might want to use a struct for your parser’s states instead. In addition, you can in fact use any arbitrary data structure for your state, as long as it contains the remaining tokens. You can change how FnParse’s functions access a state’s remainder by rebinding the *remainder-accessor* and/or *remainder-setter* variables. For more information, check the API wiki page, the sample JSON parser, or the variables’ docs.

Library usage

FnParse requires clojure-contrib, the Clojure standard library.

FnParse’s distribution has src and test folders. To use FnParse, download this distribution and include its src folder in your program’s classpath—for instance, java -cp $CLOJURE_PATH:$CLOJURE_CONTRIB_PATH:path-to-fnparse/src/ …

FnParse’s namespace is name.choi.joshua.fnparse. FnParse’s unit tests are stored in the tests folder.

For a little example of what FnParse can do, check out the sample JSON parser.

Note that FnParse is not necessarily a lexer. It’s mean to parse an already-created collection of tokens. FnParse is meant to be used in conjunction with a lexer, and it can build parsers based on any sort of token you wish. However, it might be easiest if you decide that Characters are your language’s tokens and the clojure.core/seq function be your lexer; if you do this, strings could be fed directly into the parser as a sequence of tokens. The sample JSON parser is an example of using Characters as tokens.