parse common form markup

Parse Common Form markup, returning an object containing a form and an array of path-to-blank mappings.


Common Form markup uses rarely used symbols that you can type on a standard keyboard to structure agreements and to indicate definitions, uses of defined terms, fill-in-the-blanks, and cross-references between provisions. For example:

This Agreement (this ""Agreement"") is made effective as of
[Effective Date] by and between [Seller's Legal Name] (""Seller"") and
[Buyer's Legal Name] (""Buyer"").

    \Definitions\  For purposes of this <Agreement>, the following terms
have the following meanings:

        \\  ""Capital Stock"" means the capital stock of the Company,
    including, without limitation, the <Common Stock> and the <Preferred

        \\  ""Dissolution Event"" means:

            \\  a voluntary termination of operations pursuant to {Voluntary

            \\  a general assignment for the benefit of the <Company>'s
    creditors or

            \\  any other liquidation, dissolution or winding up of the
    <Company> (excluding a <Liquidity Event>), whether voluntary or

Each subdivision of the form begins with \\, indented by four spaces for each level of nesting. If the provision has a heading, it goes between the slashes, like \Definitions\ ...

Within a provision, terms being defined are set in ""double quotation marks"". Defined terms being used are typed <within angle brackets>. A cross-reference to a provision with a {Particular Heading} is set within braces. [Blanks to be filled in] use square brackets.
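
The inline conventions above can be recognized with a single regular expression. The following is a minimal sketch, not the package's actual tokenizer: the function name `tokenizeInline` and the token type names are hypothetical, and it ignores edge cases such as nested or unbalanced delimiters.

```javascript
// Hypothetical helper for illustration; not part of commonform-markup-parse.
function tokenizeInline (text) {
  // One alternative per kind of inline markup:
  // ""definition"", <use>, {reference}, [blank]
  var pattern = /""([^"]+)""|<([^<>]+)>|\{([^{}]+)\}|\[([^\[\]]+)\]/g
  var tokens = []
  var lastIndex = 0
  var match
  while ((match = pattern.exec(text)) !== null) {
    // Plain text between the previous match and this one
    if (match.index > lastIndex) {
      tokens.push({ type: 'text', value: text.slice(lastIndex, match.index) })
    }
    if (match[1] !== undefined) {
      tokens.push({ type: 'definition', value: match[1] })
    } else if (match[2] !== undefined) {
      tokens.push({ type: 'use', value: match[2] })
    } else if (match[3] !== undefined) {
      tokens.push({ type: 'reference', value: match[3] })
    } else {
      tokens.push({ type: 'blank', value: match[4] })
    }
    lastIndex = pattern.lastIndex
  }
  // Trailing plain text after the last match
  if (lastIndex < text.length) {
    tokens.push({ type: 'text', value: text.slice(lastIndex) })
  }
  return tokens
}
```

Calling `tokenizeInline` on a line of markup returns an array of typed tokens in source order, with everything that is not markup preserved as `text` tokens.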

The Parser

var parse = require('commonform-markup-parse')
parse(stringOfMarkup); // => {form: Object, directions: Array}
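
The `directions` array in the return value pairs each blank with the hint that appeared in the markup. As a rough illustration of how hints might be separated from a form, the sketch below walks a nested structure and records a path for every blank. The AST shape (`content` arrays, `{blank: 'hint'}` objects, `{form: ...}` children), the `path` and `hint` property names, and the function name `extractHints` are all assumptions made for this example; check the package's actual output before relying on any of them.

```javascript
// Sketch only: the input and output shapes are assumptions,
// not the documented format of commonform-markup-parse.
function extractHints (form, path, directions) {
  path = path || []
  directions = directions || []
  var content = form.content.map(function (element, index) {
    var elementPath = path.concat('content', index)
    var isObject = element !== null && typeof element === 'object'
    if (isObject && 'blank' in element) {
      // Record where the blank sits and what its hint said,
      // then leave an empty blank behind in the form
      directions.push({ path: elementPath, hint: element.blank })
      return { blank: '' }
    }
    if (isObject && 'form' in element) {
      // Recurse into child provisions, extending the path
      var child = extractHints(
        element.form, elementPath.concat('form'), directions
      )
      return Object.assign({}, element, { form: child.form })
    }
    return element
  })
  return {
    form: Object.assign({}, form, { content: content }),
    directions: directions
  }
}
```

The function returns a new form and leaves its input untouched, so the original hints remain available to the caller.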

The parser is made of several components:

  1. a hand-coded context-tracking tokenizer (or lexer) that emits tokens for indentation and outdentation, in addition to content tokens

  2. an LALR(1) parser generated by Jison from a Bison-like BNF grammar

  3. commonform-fix-strings to convert the parser's AST to a valid common form by fixing various string-related validation issues, such as extra whitespace

  4. a tiny algorithm that removes the hint text from fill-in-the-blanks within the form, and emits path-to-hint mappings instead
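
To make the first component more concrete, here is a minimal sketch of a context-tracking tokenizer that turns changes in indentation into explicit tokens, which is what lets an LALR(1) grammar handle whitespace-sensitive nesting. It is not the package's tokenizer: the function name `tokenizeIndentation` and the token names `INDENT`, `OUTDENT`, `START`, and `TEXT` are hypothetical, and it assumes that only lines beginning with a backslash open a subdivision, with all other lines treated as continuation text.

```javascript
// Hypothetical sketch; token names and rules are illustrative assumptions.
function tokenizeIndentation (markup) {
  var tokens = []
  var depth = 0 // current nesting level, tracked across lines
  markup.split('\n').forEach(function (line) {
    var text = line.trim()
    if (text === '') return // blank lines carry no structure
    if (text.charAt(0) === '\\') {
      // Four leading spaces per level of nesting
      var level = Math.floor(line.match(/^ */)[0].length / 4)
      for (; depth < level; depth++) tokens.push({ type: 'INDENT' })
      for (; depth > level; depth--) tokens.push({ type: 'OUTDENT' })
      // \Heading\ or \\ followed by the provision's text
      var start = /^\\([^\\]*)\\\s*(.*)$/.exec(text)
      if (start) {
        tokens.push({ type: 'START', heading: start[1] })
        text = start[2]
      }
    }
    if (text !== '') tokens.push({ type: 'TEXT', value: text })
  })
  // Close any provisions still open at the end of input
  for (; depth > 0; depth--) tokens.push({ type: 'OUTDENT' })
  return tokens
}
```

Because the tokenizer balances every `INDENT` with an `OUTDENT`, a grammar consuming these tokens can treat them like opening and closing brackets.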

The parser passes the commonform-markup-tests test suite.

If you'd like to write a parser in a different language, the test suite and this package are the best places to start. Your language probably already has a Bison clone or a BNF-compatible parser-combinator library.
