Bracketup is a simple generic markup language which provides a framework for defining application-specific markup languages.
I have designed it primarily to allow creation of structured content that supports specific interaction models in a web browser.
It has evolved from the application-specific markup language correspondence-markup (https://github.com/pdorrell/correspondence-markup), which is implemented in Ruby.
Although Bracketup is aimed at the browser, you can still run it server-side if you want to, for example using Node.
Perhaps unsurprisingly, given this history, the one current working example of an application-specific bracketup-based markup language is correspondence-bracketup, which provides markup to support Correspondence. An actual page example is at http://pdorrell.github.io/bracketup/correspondence/rhoScriptExample.html, and the markup can be seen inside the source for that web page.
The syntax rules for Bracketup are fairly simple:
- A Bracketup document consists of one or more elements, where each element consists of the following:
  - Opening square bracket, i.e. "[".
  - Zero or more comma-separated identifiers (with no whitespace before or after the commas).
  - Optional whitespace following the identifiers. (If the identifiers are followed by alphabetic plain text, there has to be at least one whitespace character between the last identifier and the beginning of the plain text.)
  - One or more children.
  - Closing square bracket, i.e. "]".
Children can be either of the following:
- Elements
- Plain text
Plain text consists of any characters, but the characters "]", "[" and "\" must be quoted by the backslash character "\".
Identifiers may contain any alphabetic characters, digits, or the characters "_" and "-". The character "_" at the beginning of an identifier is given a special meaning. Identifiers are suitable for specifying things such as HTML element ids and CSS class names. Application data requiring a larger character set, for example URLs, can be nested within appropriate child elements.
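The quoting and identifier rules above can be illustrated with two small helpers. This is only a sketch: quotePlainText and isIdentifier are hypothetical names that are not part of bracketup.js, and the identifier check assumes that "alphabetic" means ASCII letters, which the real scanner may not.

```javascript
// Hypothetical helpers illustrating the syntax rules; not part of bracketup.js.

// Backslash-quote the three special characters so that arbitrary text
// can be embedded as plain text inside an element.
function quotePlainText(text) {
  return text.replace(/[\[\]\\]/g, (ch) => "\\" + ch);
}

// Assumption: identifiers are ASCII letters, digits, "_" and "-".
function isIdentifier(s) {
  return /^[A-Za-z0-9_-]+$/.test(s);
}

console.log(quotePlainText("[x]"));
console.log(isIdentifier("with-index"));
console.log(isIdentifier("http://example.com"));
```

A URL fails the identifier check, which is why such values have to be nested inside a child element as plain text.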
Every element has a function, which is specified by the first identifier in the comma-separated list. Any other identifiers in the list are passed as arguments to the function.
However, a function may specify a default function for child elements, in which case this function does not need to be specified, and any identifiers given in child elements are treated as arguments to the default function.
Where there is a default child element function, a function can be specified explicitly by prepending "_" to the function name. This initial "_" will be removed, and the remainder of the name specifies the actual function name. (It follows that "_" should not be used as a prefix for argument values, as this will lead to confusion in certain cases.)
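The rules for choosing a function can be sketched as follows. resolveFunction is a hypothetical helper, not the bracketup.js implementation, and "text" is used purely as an illustrative default function name. The sketch also assumes that the "_" prefix is only interpreted when a default child function exists.

```javascript
// Hypothetical sketch: split an element's identifier list into a function
// name and its arguments, given the parent's default child function (if any).
function resolveFunction(identifiers, defaultChildFunction) {
  const [first, ...rest] = identifiers;
  if (defaultChildFunction === undefined) {
    // No default: the first identifier names the function.
    return { functionName: first, args: rest };
  }
  if (first !== undefined && first.startsWith("_")) {
    // Explicit function: strip the leading "_" to get the real name.
    return { functionName: first.slice(1), args: rest };
  }
  // Default function: every identifier is an argument.
  return { functionName: defaultChildFunction, args: identifiers };
}

console.log(resolveFunction(["_title"], "text"));
console.log(resolveFunction(["rhoscript"], "text"));
console.log(resolveFunction(["correspondence"], undefined));
```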
The syntax and markup structure given above do not specify any particular interpretation of the data. However, the Bracketup code provided in this project does support a specific type of interpretation.
(I call the initial map of function names the "top-level" map, because it is provided to the compiler which uses it to interpret the function name on the top-level element. Inner elements can provide their own maps, which will map functions relating to their own child elements, but which will typically fall back to the top-level map to map functions that might occur anywhere within the source.)
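The fall-back from an inner map to the top-level map might look something like the sketch below. The function lookupFunction, the two maps, and the empty classes are all hypothetical stand-ins used to show the lookup order; the actual mechanism in bracketup.js may differ.

```javascript
// Hypothetical stand-ins for application classes.
class Bold {}
class Word {}

// Assumed shape: plain objects mapping function names to constructors.
const exampleTopLevelMap = { b: Bold };
const exampleChildMap = { word: Word };

// Look in the element's own map first, then fall back to the top-level map.
function lookupFunction(name, localMap, fallbackMap) {
  for (const map of [localMap, fallbackMap]) {
    if (map && Object.prototype.hasOwnProperty.call(map, name)) {
      return map[name];
    }
  }
  throw new Error(`No function named "${name}"`);
}

console.log(lookupFunction("word", exampleChildMap, exampleTopLevelMap) === Word);
console.log(lookupFunction("b", exampleChildMap, exampleTopLevelMap) === Bold);
```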
Parsing and Compiling
Compilation of bracketup markup occurs in four main stages:
- Scanning of source into tokens. There are four types of token:
  - "[" + comma-separated list of identifiers + optional whitespace
  - Backslash-quoted plain text
  - All other plain text
  - Closing "]"
- Compilation of tokens into elements.
- Compilation of elements into application-specific objects.
- Generation of HTML DOM elements from application-specific objects.
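As an illustration of the first stage, here is a minimal scanner sketch that recognises the four token types. It is not the BracketupScanner from bracketup.js: the function name scan, the token object shapes, and the ASCII-only identifier pattern are assumptions, and unlike the real code it does not work line-by-line or record source locations.

```javascript
// Hypothetical scanner sketch; token shapes are illustrative only.
function scan(source) {
  // Alternatives, in order: element start, backslash-quoted character,
  // closing bracket, run of ordinary plain text. The "y" (sticky) flag
  // forces each match to begin exactly where the previous one ended.
  const pattern =
    /\[((?:[A-Za-z0-9_-]+(?:,[A-Za-z0-9_-]+)*)?)\s*|\\([\[\]\\])|(\])|([^\[\]\\]+)/y;
  const tokens = [];
  while (pattern.lastIndex < source.length) {
    const offset = pattern.lastIndex;
    const m = pattern.exec(source);
    if (m === null) {
      // For example, a backslash that does not quote "[", "]" or "\".
      throw new Error(`Unexpected character at offset ${offset}`);
    }
    if (m[1] !== undefined) {
      tokens.push({ type: "start", identifiers: m[1] === "" ? [] : m[1].split(",") });
    } else if (m[2] !== undefined) {
      tokens.push({ type: "quoted", text: m[2] });
    } else if (m[3] !== undefined) {
      tokens.push({ type: "end" });
    } else {
      tokens.push({ type: "text", text: m[4] });
    }
  }
  return tokens;
}

console.log(scan("[b x]"));
```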
An essential element of any parsing or compiling application is the handling of errors.
If something goes wrong, we want to know what went wrong, and where it went wrong.
To support this, bracketup.js retains precise information about the location of source code when initially parsed into tokens, and preserves this information as the tokens are parsed into elements and then into application-specific objects, so that errors can be properly reported, at whichever stage of compilation they occur.
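The general idea of attaching a source location to an error can be sketched as below. LocatedError and positionAt are hypothetical names chosen for this illustration; they are not the error classes of bracketup.js, which tracks lines and positions with its own objects.

```javascript
// Hypothetical sketch of an error that carries its source location.
class LocatedError extends Error {
  constructor(message, position) {
    super(`${position.fileName}:${position.line}:${position.column}: ${message}`);
    this.name = "LocatedError";
    this.position = position;
  }
}

// Convert a character offset into a 1-based line and column.
function positionAt(fileName, source, offset) {
  const linesBefore = source.slice(0, offset).split("\n");
  return {
    fileName,
    line: linesBefore.length,
    column: linesBefore[linesBefore.length - 1].length + 1,
  };
}

const exampleSource = "[a\n[b x";
const exampleError = new LocatedError(
  "missing closing bracket",
  positionAt("example.txt", exampleSource, exampleSource.length)
);
console.log(exampleError.message);
```

Because the position travels with the error object, any later stage that detects a problem can report it in terms of the original source text.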
A Worked Example
The following is a cut-down version of the example at http://pdorrell.github.io/bracketup/correspondence/rhoScriptExample.html:
[correspondence
  [_title Queens Puzzle]
  [rhoscript
    [_languageTitle rhoScript]
    [A [1 8] [2 range]]
    [B [1 permutations]]
    [C ([1 with-index]]
    [D ...]
    [K [1 keep-maxes-by]]
  ]
  [english
    [_languageTitle English]
    [A For [2 the sequence of numbers from 0 to 1 less than] [1 8],]
    [B consider [1 all possible permutations].]
    [C For a given permutation, [1 specify X and Y co-ordinates for 8 queens] as an array of arrays of the form \[[b x], [b y]\]]
    [D ...]
    [K [1 Having done that for each [a [href http://en.wikipedia.org/wiki/Permutation]permutation], keep those permutations that have the maximum number of diagonals occupied.]]
  ]
]
Here is some explanation of what is going on in this example:
- The Correspondence object defines a default child function. rhoscript and english are the first arguments passed to this function in each case.
- _title starts with the "_" character, so it specifies a non-default function.
- The default child function for a Text object is sentence, which maps to Sentence.
- languageTitle is a non-default child function that maps to a LanguageTitleAttribute, which sets a languageTitle property on the parent Text object, which uses that value (if provided) to output a child <div> element displaying the language title.
- The Sentence constructor takes one id argument.
- A Sentence object has a default child function word, which maps to the Word class, which accepts a single id argument in its constructor.
- b maps to the Bold class. (There is also i, for the Italic class, but that doesn't appear in this example.)
- a maps to the Link class, which has a child function href, which maps to HrefAttribute, which sets the "href" attribute on the HTML link DOM element.
- The text \[[b x], [b y]\]] includes a backslash-quoted "[" and a backslash-quoted "]", so that these characters appear in the final output. (The inner "x" and "y" characters are displayed as bold in the final output.)
Bracketup CoffeeScript Classes
Three base classes are provided to support the most common use cases:
- BaseNode - an object that outputs a DOM from the createDom method.
- TextElement - an object representing the standard implementation of plain text added to a BaseNode.
- BaseAttribute - an object representing a child element which acts on the parent element by setting an attribute value on the DOM element created by the parent.
Other classes defined in bracketup.js which are relevant to implementing application-specific markup languages are:
- BracketupCompiler - constructed with a map of function names to constructor functions, this object compiles elements into the application-specific objects.
- Document - a wrapper for the browser document object which provides two convenience methods for creating DOM nodes:
  - addTextNode(dom, text) - adds a text node to a DOM element.
  - createNode(tag, options) - creates a DOM element with the given tag, and the following options:
    - parent - the parent DOM element to append the new DOM element onto
    - className - the CSS class name
    - attributes - a map of attributes to set as attribute values on the DOM element
    - text - text for a text node to be added to the DOM element
- Bold, Italic, Link - classes representing HTML nodes of type <b>, <i> and <a> respectively.
- HrefAttribute - an object which, as a child element of a Link object, sets the href attribute of the <a> element.
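To make the createNode options described above more concrete, here is a rough sketch of how such a wrapper could be written using standard DOM calls. DocumentWrapper is a hypothetical name and this is not the source of the Document class in bracketup.js; only the method names and option names are taken from the description above.

```javascript
// Hypothetical sketch of a Document-style wrapper; not the bracketup.js source.
class DocumentWrapper {
  // "document" is the browser document object (or anything with the same methods).
  constructor(document) {
    this.document = document;
  }

  // Append a text node to an existing DOM element.
  addTextNode(dom, text) {
    dom.appendChild(this.document.createTextNode(text));
  }

  // Create an element, applying whichever options were supplied.
  createNode(tag, options = {}) {
    const dom = this.document.createElement(tag);
    if (options.parent) {
      options.parent.appendChild(dom);
    }
    if (options.className) {
      dom.className = options.className;
    }
    for (const [name, value] of Object.entries(options.attributes || {})) {
      dom.setAttribute(name, value);
    }
    if (options.text) {
      this.addTextNode(dom, options.text);
    }
    return dom;
  }
}

// Browser-only usage; skipped when no global document exists (e.g. in Node).
if (typeof document !== "undefined") {
  const wrapper = new DocumentWrapper(document);
  wrapper.createNode("div", { parent: document.body, className: "title", text: "Queens Puzzle" });
}
```

Passing the document object into the constructor, rather than using the global directly, is a choice made for this sketch so that it can be exercised outside a browser with a stand-in object.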
Other classes defined in bracketup.js are, by category:
- SourceFileName - representation of the name of a source file (which may be a URL, or an id of a DOM element within HTML source, or whatever)
- EndOfSourceFilePosition - representation of the end of a source file, for errors where source code is missing closing brackets
- SourceLine - representation of a line of source code and its location
- SourceLinePosition - representation of a specific character position within a line of source code
- TextNode - plain text, as parsed, with associated source location information.
- EndOfLineNode - end of a line, as parsed, with associated source location information (bracketup.js parses source line-by-line, so line endings are parsed separately from other plain text, and implementation objects can easily give special treatment to line endings).
- ElementNode - an element, as parsed, with associated source location information.
- CustomError - a base class that supports defining specific custom error classes that can have source location information added to them.
- NodeParseException - an exception when parsing the bracketup markup.
- CompileError - base class for errors that occur when compiling (i.e. when interpreting parsed elements).
Scanning and Compiling
- NodeParser - the object which receives tokens from BracketupScanner and compiles them into elements.
- NodeCompiler - the object which, given a function-to-constructor map, compiles a parsed element into an application-specific object.
- TestTokenReceiver - a test object which receives tokens from BracketupScanner and displays them in a readable fashion.