To use the TreeMapper class:
-
Define a mapping function (a regular function)
-
Use function treeFromBlock() to get the tree
-
Create a subclass of TreeWalker, overriding method visit(superNode, level) and optionally endVisit(superNode, level)
-
Pass the tree to the TreeWalker constructor, call the walk() method, then fetch the mapped data. Often, you'll want to create a method in your class to fetch the mapped data but you may also simply get it directly from a property of your class.
As an example, let's create a limited mathematical language. The primitive items in this language includes:
- numbers - a string of one or more digits
- identifiers - a string of one or more letters or digits
beginning with a letter
- operators - arbitrary non-whitespace strings
From these, we can build the following expressions:
<program> ::= <expr>+
<atom> ::= <number> | <ident> | <op>
<expr> ::=
'expr' <atom>* => <expr>*
| 'group' <op>{0,2} => <expr>+
| 'sub' => <expr>{2}
| 'SIGMA' => <expr>{3,}
| <atom>
Everything before a => is on one line, and everything
after the => is on a separate line, at one higher level,
i.e. everything after the => is a child of the target element.
expr X + 3
should produce MathML that displays:
X + 3group
expr X + 3
should produce MathML that displays:
( X + 3 )sub
X
2
should produce MathML that displays:
X 2 -----------------------------------------------SIGMA
0
10
sub
X
2
should produce MathML that displays:
Σ 0 10 X 2group
SIGMA
0
10
sub
X
2
should produce MathML that displays:
( Σ 0 10 X 2 )- Define a mapping function
To implement this, we first create a mapping function. The critical thing to understand is that your mapping function does not need to deal with nested items, i.e. child nodes. It should only map simple strings, i.e. with no embedded newline or carriage return characters. Later, when you write your TreeWalker subclass, you will get a chance to deal with a node's children. Here is a mapping function for this language:
export mathMapper = (line) ->
if isEmpty(line) then return undef
lWords = line.split(/\s+/) # split on whitespace
return getNode(lWords[0], lWords.slice(1))
isEmpty()
returns true if str is undefined or consists of only
whitespace characters. Returning undef indicates that this line
should be ignored.
getNode()
receives a command name and an array of strings as
input and returns a hash with key 'cmd' and 'lAtoms' if there
is something besides the command name on the line.
Each input string is mapped to one of the following:
{
cmd: 'expr'
lAtoms: [<atom>, ... ]
}
{
cmd: 'group'
lAtoms: [<op>, <op>]
}
{
cmd: 'sub'
}
{
cmd: 'SIGMA'
}
The lAtoms key will not be present if there are no atoms
(currently only applies to cmd 'expr'). A node's children
(e.g. subtree) does not appear in the above, i.e. not handled
by your mathMapper() function. Note that you can supply 0..3
arguments to the group
command, but if less than 2, default
values will be supplied.
Each atom in lAtoms, where it exists, is one of the following:
{
cmd: 'ident'
value: str
}
{
cmd: 'number'
value: str
}
{
cmd: 'op'
value: str
}
A number consists of a string of one or more digits. Any string that starts with a digit, but is not just a string of digits, is an error.
An identifier is anything that starts with a letter or underscore and is followed by zero or more letters, underscores or digits.
Anything else is considered an operator.
- Use function treeFromBlock() to get the tree
Execute the following code:
code = """
SIGMA
0
10
sub
X
2
"""
result = treeFromBlock(code, mathMapper)
LOG 'result', result
The output will be:
------------------------------------------
result:
---
-
lineNum: 1
node:
cmd: SIGMA
subtree:
-
lineNum: 2
node:
cmd: expr
lAtoms:
-
type: number
value: '0'
-
lineNum: 3
node:
cmd: expr
lAtoms:
-
type: number
value: '10'
-
lineNum: 4
node:
cmd: sub
subtree:
-
lineNum: 5
node:
cmd: expr
lAtoms:
-
type: ident
value: X
-
lineNum: 6
node:
cmd: expr
lAtoms:
-
type: number
value: '2'
------------------------------------------
You can check that the structure is correct, but remember that none of your code will rely on the key names 'node', 'children' or 'lineNum' - we're just ensuring that the correct tree will be used in the next step.
- Create a subclass of TreeWalker
The critial methods to override are visit() and endVisit().
The visit() method receives parameters node and level. The behavior of our visit() method won't depend on the level, so we'll concern ourselves only with the node parameter. This parameter will be set to whatever our mathMapper() function returned - in this case, a hash with key cmd, and also key lAtoms if anything beside the command name appeard on the line.
While our tree is being "walked", we want to build up a string of MathML code. For that purpose, our TreeWalker subclass will define a property named @mathml that will initially be set to the empty string, which will be appended to as nodes are visited.
Our subclass of TreeWalker is:
export class MathTreeWalker extends TreeWalker
constructor: (tree) ->
super tree
@mathml = ''
visit: (superNode) ->
debug "enter visit()"
node = superNode.node
switch node.cmd
when 'expr'
debug "cmd: #{node.cmd}"
@mathml += "<mrow>"
for atom in node.lAtoms
switch atom.type
when 'number'
@mathml += "<mn>#{atom.value}</mn>"
when 'ident'
@mathml += "<mi>#{atom.value}</mi>"
when 'op'
@mathml += "<mo>#{atom.value}</mo>"
when 'group'
debug "cmd: #{node.cmd}"
@mathml += "<mrow>"
@mathml += node.lAtoms[0].value
when 'sub'
debug "cmd: #{node.cmd}"
@mathml += "<msub>"
when 'SIGMA'
debug "cmd: #{node.cmd}"
@mathml += "<munderover>"
@mathml += "<mo class='large'> Σ </mo>"
else
croak "visit(): Not a command: '#{node.cmd}'"
debug "return from visit()"
return
endVisit: (superNode) ->
debug "enter endVisit()"
node = superNode.node
switch node.cmd
when 'expr'
debug "cmd: #{node.cmd}"
@mathml += "</mrow>"
when 'group'
debug "cmd: #{node.cmd}"
@mathml += node.lAtoms[1].value
@mathml += "</mrow>"
when 'sub'
debug "cmd: #{node.cmd}"
@mathml += "</msub>"
when 'SIGMA'
debug "cmd: #{node.cmd}"
@mathml += "</munderover>"
else
croak "endVisit(): Not a command: '#{node.cmd}'"
debug "return from endVisit()"
return
getMathML: () ->
debug "CALL getMathML()"
return "<math displaystyle='true'> #{@mathml} </math>"
- Pass the tree to the TreeWalker constructor, call the walk() method, then fetch the mapped data.
code = """
SIGMA
0
10
sub
X
2
"""
tree = treeFromBlock(code, mathMapper)
walker = new MathTreeWalker(tree)
walker.walk()
mathml = walker.getMathML()
LOG 'mathml', mathml
This will output:
------------------------------------------
mathml:
---
<math displaystyle='true'> <munderover><mo class='large'> Σ </mo
><mrow><mn>0</mn></mrow><mrow><mn>10</mn></mrow></munderover><msub><mro
w><mi>X</mi></mrow><mrow><mn>2</mn></mrow></msub> </math>
------------------------------------------