Mathmode to SSML

Taichen Rose edited this page May 24, 2021 · 44 revisions

When the main LaTeX parser flags an argument as a LaTeX mathmode object, it is fed into this parser, which converts the LaTeX mathmode to a SymPy object, then the SymPy object to SSML.
Please see the SymPy documentation for an explanation of the overall structure of SymPy objects, and what each SymPy function does. Our pronunciation of these functions was based on the SymPy documentation for SymPy version 1.8.

Files

Converting LaTeX to SymPy:
tex_to_sympy.py
Converting SymPy to finalized text:
sympytossml.py
Defines how math operations should be pronounced:
static/sympy_funcs.xml

Implementation

Building off of augustt198's latex2sympy parser on GitHub, we use ANTLR, a parser generator, to build a parse tree that converts LaTeX commands into SymPy objects. The SymPy to SSML portion of this mathmode parser uses XML and the SymPy library to parse each object. The XML structure is documented below. When parentheses are encountered within a SymPy math object, we add begin and end parentheses indicators to the output so that our users can follow the math equation.

Issues and Refactoring

Many math commands are still missing from our XML document. We hope to expand to higher-level math so that, for example, calculus expressions such as derivatives are read more fluently. The ANTLR parser also recognizes only a limited set of LaTeX mathmode equations. Even with a revamped XML structure that covers a larger range of SymPy math objects, the ANTLR grammar may need to be expanded to support math features it does not currently include.

Sympytossml XML Structure

Each SymPy object supported by sympytossml is represented by an entry in static/sympy_funcs.xml. The tag of this entry must be the name of the SymPy class, which can be found on docs.sympy.org. The program uses this entry to parse through the function in a linear fashion. It walks the XML tags and the args array of the SymPy class in parallel, so the first instance of <arg /> in the XML element always corresponds to the first argument in the SymPy args array, unless it comes after a <repeat /> tag.

XML Tags:

<text>
Anything inside this tag will be appended to the final string in its entirety. There should be no spaces before or after the text.
<arg />
An argument from the SymPy function.
<subarg />
An argument from a subarray within the SymPy function's main args array. See Sum for an example.
<repeat />
Sets the repeat point. If this tag is present, when the parser reaches the last XML element, it will loop back here until the end of the SymPy args array is reached.

As a convention, text at the beginning of a function does NOT start with "the". For example: <text>integral of</text>, NOT <text>the integral of</text>

Example:

<!--Name of sympy class that represents addition-->  
<Add>  
    <!--Allows an arbitrary number of arguments-->    
    <arg />  
    <!--Everything below this tag will be repeated until there are no more args-->  
    <repeat />  
    <!--No spaces before or after 'plus'-->  
    <text>plus</text>  
    <arg />  
</Add>  
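
The parallel walk described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in sympytossml.py: the function name `render`, the error handling, and the use of plain strings in place of SymPy args are all assumptions, and `<subarg />` is not handled.

```python
import xml.etree.ElementTree as ET

# The Add entry from the example above (comments removed).
ADD_ENTRY = """
<Add>
    <arg />
    <repeat />
    <text>plus</text>
    <arg />
</Add>
"""


def render(entry_xml, args):
    """Walk an XML entry and a list of argument strings in parallel.

    Hypothetical helper: `args` stands in for the already-pronounced
    members of a SymPy object's args array.
    """
    elements = list(ET.fromstring(entry_xml))
    words = []
    arg_index = 0        # position in the args array
    repeat_at = None     # index of the first element after <repeat />
    last_loop_index = -1  # arg_index at the previous loop-back
    i = 0
    while i < len(elements):
        tag = elements[i].tag
        if tag == "text":
            words.append(elements[i].text)
        elif tag == "arg":
            if arg_index >= len(args):
                raise ValueError("entry expects more arguments than supplied")
            words.append(args[arg_index])
            arg_index += 1
        elif tag == "repeat":
            repeat_at = i + 1
        i += 1
        # At the last element, loop back to the repeat point while args remain.
        if i == len(elements) and repeat_at is not None and arg_index < len(args):
            if arg_index == last_loop_index:
                raise ValueError("repeated section consumes no arguments")
            last_loop_index = arg_index
            i = repeat_at
    return " ".join(words)


print(render(ADD_ENTRY, ["x", "y", "z"]))
```

In this sketch, running out of arguments raises an error instead of looping, which is a design choice of the example and not necessarily how the project behaves.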

Usage of SympyToSSML

The user should call convert_sympy_ssml(<sympy object>, <quantity mode>).

Quantity Modes

The quantity modes determine how parentheses will be stated out loud.
QuantityModes.NO_INDICATOR
Parentheses will not affect the final audio.
QuantityModes.QUANTITY
Each set of begin and end parentheses will start with the words "begin quantity" and end with "end quantity".
QuantityModes.QUANTITY_NUMBERED
Each set of begin and end parentheses will start with the words "begin nth quantity" and end with "end nth quantity", where n is the depth of the current quantity. Depth refers to the number of quantities surrounding the current quantity.
QuantityModes.PARENTHESES
Same as QuantityModes.QUANTITY, but with "begin parentheses" and "end parentheses".
QuantityModes.PARENTHESES_NUMBERED
Same as QuantityModes.QUANTITY_NUMBERED, but with "begin nth parentheses" and "end nth parentheses".
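
As a rough illustration of the five modes, the sketch below maps a mode and a number n to the spoken begin and end phrases. The enum here is a stand-in that only mirrors the names above. Its member values, the helper names `ordinal` and `indicators`, and the way n is passed in are assumptions, not the project's actual code.

```python
from enum import Enum


class QuantityModes(Enum):
    # Stand-in enum; the member values are arbitrary assumptions.
    NO_INDICATOR = 0
    QUANTITY = 1
    QUANTITY_NUMBERED = 2
    PARENTHESES = 3
    PARENTHESES_NUMBERED = 4


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 2 -> '2nd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def indicators(mode, n):
    """Return the (begin, end) phrases for one pair of parentheses.

    `n` is the number spoken in the numbered modes; how the caller
    derives it from the nesting depth is left open in this sketch.
    """
    if mode is QuantityModes.NO_INDICATOR:
        return ("", "")
    uses_quantity = mode in (QuantityModes.QUANTITY, QuantityModes.QUANTITY_NUMBERED)
    noun = "quantity" if uses_quantity else "parentheses"
    numbered = mode in (
        QuantityModes.QUANTITY_NUMBERED,
        QuantityModes.PARENTHESES_NUMBERED,
    )
    label = f"{ordinal(n)} {noun}" if numbered else noun
    return (f"begin {label}", f"end {label}")


print(indicators(QuantityModes.QUANTITY_NUMBERED, 2))
```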

Future Work

SymPy is not an ideal solution. For example, it does not support the "plus or minus" operator! If there is a way to use a different solution without taking on an enormous workload, do it.
The meaning of a math problem is never changed during processing, but the current system tends to shuffle terms around from their original form. This is due to SymPy trying to simplify the equation. It would be better if the math equation was changed only minimally.

There are some sympy functions we did not implement due to a lack of mathematical knowledge. These are noted in static/sympy_funcs.xml. Search the document for "TODO" to find and implement them.

sympytossml.py parses through a SymPy object based on the pronunciation supplied in sympy_funcs.xml. We support a variable number of arguments through the <repeat /> tag. If a SymPy object has fewer args than tags in its XML entry, the program goes into an infinite loop.
EXPANSION: Related to the above bug. Some SymPy functions may need drastically different pronunciations depending on the number of arguments.
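
One possible guard against the infinite loop is to check an entry against the argument count before walking it. The sketch below is an assumption about how such a check could look, not existing project code: the function name `entry_fits` and the `Pow` entry are hypothetical, only top-level `<arg />` tags are counted, and `<subarg />` is ignored.

```python
import xml.etree.ElementTree as ET

ADD_ENTRY = "<Add><arg /><repeat /><text>plus</text><arg /></Add>"
# Hypothetical entry without a repeat point, for illustration only.
POW_ENTRY = "<Pow><arg /><text>to the power of</text><arg /></Pow>"


def entry_fits(entry_xml, n_args):
    """Return True if an object with n_args args can be walked to completion."""
    tags = [element.tag for element in ET.fromstring(entry_xml)]
    total = tags.count("arg")
    if n_args < total:
        # Fewer args than <arg /> tags: the case described above.
        return False
    if "repeat" not in tags:
        # No repeat point: any surplus args are simply not pronounced.
        return True
    extra = n_args - total
    if extra == 0:
        return True
    # Each loop back must consume a whole number of the remaining args.
    per_loop = tags[tags.index("repeat") + 1:].count("arg")
    return per_loop > 0 and extra % per_loop == 0


print(entry_fits(ADD_ENTRY, 1), entry_fits(ADD_ENTRY, 3))
```

A caller could skip the conversion or report an error whenever the check returns False, instead of entering the walk.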

Many LaTeX documents contain math mode portions which are not actually proper math expressions and thus are not able to be converted by our process. We were unable to find a good solution for this issue; we simply added an if statement at the end of tex_to_sympy.py which prevents these subscript strings from going through the mathmode to ssml process. Future developers may want to adopt a different approach instead of assuming all math mode strings are proper math equations.

It might be better for the user if math mode which doesn't work with our system was rendered as raw text, rather than an error message.
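
A raw-text fallback could be as small as the wrapper below. It is only a sketch: `speak_math` and `fake_convert` are hypothetical names, and it assumes that a failed conversion surfaces as a Python exception, which may not match how the current pipeline reports errors.

```python
def speak_math(latex, convert):
    """Return the converted text, or the raw LaTeX if conversion fails.

    `convert` is any callable that turns a mathmode string into
    spoken text and raises an exception when it cannot.
    """
    try:
        return convert(latex)
    except Exception:
        # Fall back to the raw source instead of an error message.
        return latex


def fake_convert(latex):
    # Stand-in converter that rejects one-sided subscripts.
    if latex.startswith("_"):
        raise ValueError("not a proper math expression")
    return "spoken: " + latex


print(speak_math("x + 1", fake_convert))
print(speak_math("_2", fake_convert))
```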

Known Mathmode Bugs:

Improper Math Mode Input Bug

Our math mode implementation assumed that users would enter 'proper' math mode expressions. However, from listening to documents we realized that users also use other characters and commands in math mode for formatting. Sometimes this confuses our parser, which results in incorrect audio for that math expression or a "math did not render" output. For example, some users write $s_2$, which is read as s underscore two. However, it is quite common in current LaTeX files for users to write s$_2$ instead.

The same problem occurs with other math mode commands: the current math mode parser does not recognize one-sided expressions. These expressions include, but are not limited to: $_2$, $^2$, $> 2$, $< 2$, $= 2$.
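
One-sided expressions like these can be detected before parsing. The sketch below is an assumption, not the check used in tex_to_sympy.py: the name `is_one_sided` is hypothetical, and it only covers expressions that start with one of the five characters listed above.

```python
import re

# Matches a mathmode body that begins with _, ^, <, > or =,
# i.e. an operator with nothing on its left-hand side.
ONE_SIDED = re.compile(r"^\s*[_^<>=]")


def is_one_sided(mathmode):
    """Return True if the mathmode string starts with a dangling operator."""
    body = mathmode.strip().strip("$")  # drop surrounding $ delimiters
    return bool(ONE_SIDED.match(body))


for sample in ["$_2$", "$^2$", "$> 2$", "$s_2$"]:
    print(sample, is_one_sided(sample))
```

Strings flagged this way could be skipped or passed through as raw text instead of being sent to the parser.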

Since our current implementation only supports basic math mode commands, more advanced math mode equations sometimes cause an error in the math mode parser. Support for these could be expanded.

Infinite Loop Error - Issue

This happens very rarely: out of 13,000 files, only 2 were found to cause an infinite loop. Both involved mathmode \gcd{} components. Commands currently known to cause an infinite loop:

$\gcd(n, r) r$ and $\gcd(C) := \gcd(T_1, \dotsc, T_k)$.

Limit Function

When testing limit functions, extra characters are added to the end of the mathmode rendering. This is due to the SymPy library.