Karl R. Wilcox edited this page Sep 29, 2018 · 5 revisions

The DrawShield Parser

The parser is responsible for reading the blazon and converting it to an intermediate form: a parse tree conforming to a custom-designed XML schema called BlazonML.

The parser is coded as a PHP object which exists only for the duration of parsing and can be deleted and garbage collected at the end of the parsing phase.

It is a top-down parser implemented by a finite state machine. Top-down parsers are straightforward to write and understand but not very good at error recovery - DrawShield is a classic example of this!

As suggested by the name, a top-down parser starts by looking for the top-level element (a shield description) and then looks for the lower-level elements that it is composed of. In this case a shield description is composed of a field (mandatory), zero or more ordinaries or charges and, optionally, something to go “overall”. Hence the parser will try to find a shield description by looking for a field. The field itself can take the form of a simple tincture, a divided field, a treatment or a quartered shield, so the parser will look for each of these in turn. (A top-down parser written this way is also known as a “recursive descent parser”.) Note that at this point the parser does not look for a charge, so if the user has forgotten to include a field in their blazon they simply get a message about words not being recognised, rather than anything more useful.

Methods in the parser are named for the element that they are searching for, so the field() method will call the tincture(), division() and other methods in turn to look for the lower-level elements. In general, the methods return null if they don’t find what they are looking for (which is not necessarily an error), or they return an XML node from the BlazonML schema.
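The pattern described above can be sketched in a few lines of PHP. This is only an illustration of the approach, not the DrawShield code: the class name SketchParser, the tincture word list, the "per X tincture and tincture" rule and the element and attribute names are all assumptions made for the example, and the real BlazonML schema will differ.

```php
<?php
// Hypothetical sketch of the "method per element" pattern.
// Names and XML layout are illustrative, not taken from DrawShield or BlazonML.
class SketchParser
{
    private array $tokens;
    private int $pos = 0;
    private DOMDocument $doc;

    // Assumed word list; a real parser recognises far more.
    private const TINCTURES = ['azure', 'gules', 'or', 'argent', 'sable', 'vert'];

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
        $this->doc = new DOMDocument('1.0', 'UTF-8');
    }

    // Top-level element: a shield must start with a field.
    public function shield(): ?DOMElement
    {
        $field = $this->field();
        if ($field === null) {
            return null; // no field, so no shield description
        }
        $shield = $this->doc->createElement('shield');
        $shield->appendChild($field);
        return $shield;
    }

    // Try each form a field can take, in turn.
    public function field(): ?DOMElement
    {
        $inner = $this->tincture() ?? $this->division();
        if ($inner === null) {
            return null;
        }
        $field = $this->doc->createElement('field');
        $field->appendChild($inner);
        return $field;
    }

    private function tincture(): ?DOMElement
    {
        $word = $this->tokens[$this->pos] ?? null;
        if ($word === null || !in_array($word, self::TINCTURES, true)) {
            return null; // not an error, just not a tincture
        }
        $this->pos++;
        $node = $this->doc->createElement('tincture');
        $node->setAttribute('name', $word);
        return $node;
    }

    // Assumed rule: "per <type> <tincture> and <tincture>".
    private function division(): ?DOMElement
    {
        $start = $this->pos;
        $type = $this->tokens[$start + 1] ?? null;
        if (($this->tokens[$start] ?? null) !== 'per' || $type === null) {
            return null;
        }
        $this->pos = $start + 2;
        $first = $this->tincture();
        $hasAnd = ($this->tokens[$this->pos] ?? null) === 'and';
        if ($hasAnd) {
            $this->pos++;
        }
        $second = $hasAnd ? $this->tincture() : null;
        if ($first === null || $second === null) {
            $this->pos = $start; // backtrack so another method can try
            return null;
        }
        $node = $this->doc->createElement('division');
        $node->setAttribute('type', $type);
        $node->appendChild($first);
        $node->appendChild($second);
        return $node;
    }
}
```

With this sketch, `(new SketchParser(['a', 'lion', 'rampant']))->shield()` returns null because no field is found first, which mirrors the weak error reporting mentioned earlier: the parser only knows that it did not find what it was looking for.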

The Tokeniser

The parser does not operate on the raw input; it operates on a “normalised” version of the input that is created by a tokeniser, which runs as the first stage of parsing. The whole input is converted to “tokens” in one operation.
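As a rough illustration of what such a one-pass tokeniser might look like, here is a minimal PHP sketch. The function name tokenise() and the normalisation rules (lower-casing, treating punctuation as whitespace, ASCII only) are assumptions for the example; the actual DrawShield normalisation may do more or work differently.

```php
<?php
// Hypothetical tokeniser sketch. The normalisation rules are assumptions,
// not DrawShield's actual rules, and only ASCII input is handled.
function tokenise(string $blazon): array
{
    // Normalise case so that "Azure" and "azure" become the same token.
    $normalised = strtolower($blazon);
    // Replace every run of characters that are not letters, digits or
    // apostrophes with a single space.
    $normalised = preg_replace("/[^a-z0-9']+/", ' ', $normalised);
    // Split the whole input into tokens in one operation,
    // discarding any empty strings.
    return preg_split('/\s+/', trim($normalised), -1, PREG_SPLIT_NO_EMPTY);
}
```

For example, `tokenise("Azure, a bend Or.")` yields the four tokens azure, a, bend and or, which the parsing methods can then step through without worrying about case or punctuation.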