# Working With Equations

One of the asset types represented within OU-XML is the `<Equation>` element, which can be used to represent mathematical and chemical equations.

The element uses MathML to describe equation items. These are rendered in the VLE using MathJax (I think? I'm not sure what the conversion process is) and via LaTeX for PDF print publications. Browsers such as Firefox are also capable of rendering MathML directly.

One of the problems with MathML as a structure is that it is not the sort of thing you would write by hand, and as such, it may be difficult to discover via simple search. (A simpler way of writing equations is to use LaTeX, for example.)
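One way to make MathML easier to find with a simple text search is to flatten each expression to the text of its token elements. The following is a minimal sketch using only the Python standard library; the function name `mathml_to_search_text` and the `sample` markup are made up for illustration and do not come from the OU-XML corpus.

```python
import xml.etree.ElementTree as ET


def mathml_to_search_text(mathml):
    """Concatenate the text of every MathML token element into one string."""
    math_root = ET.fromstring(mathml)
    # itertext() walks the tree in document order, yielding each text fragment
    return "".join(chunk.strip() for chunk in math_root.itertext())


# Hypothetical sample: a water formula written as MathML
sample = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msub><mrow><mi>H</mi></mrow><mrow><mn>2</mn></mrow></msub><mi>O</mi>"
    "</math>"
)

print(mathml_to_search_text(sample))  # H2O
```

Flattening throws away structure such as subscripts and fractions, so the result is only suitable as a rough search key, not as a replacement for the equation itself.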

## Preparing the Ground

As ever, we need to set up a database connection:

In [1]:
from sqlite_utils import Database

# Open database connection
dbname = "all_openlean_xml.db"
db = Database(dbname)

And get a sample XML file, selecting one that we know contains structurally marked up equation items:

In [2]:
import pandas as pd

pd.read_sql("SELECT * FROM xml WHERE xml LIKE '%<Equation>%'",
            con=db.conn)

Unnamed: 0,code,name,xml,id
0,T212,An introduction to electronics,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",e70841f12a908401ab9e6a69923bdb684928c888
1,,An introduction to geology,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",4c8058285a4de53528f646ee2742dc8394fd4e38
2,S276,An introduction to minerals and rocks under th...,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",6bff78840be5165329dda278418bbbd54c909047
3,T193,"Assessing risk in engineering, work and life","b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",11e5486d113eebd6c01126c9c65b91591c211b9b
4,SK299,Blood and the respiratory system,"b'<?xml version=""1.0"" encoding=""UTF-8""?>\n<?dc...",904a100e4d41cf1a696b547eec1b2f625fc5bd78
5,,Discovering chemistry,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",884164a46f4066c6b26894c812484c74ab2e8531
6,,Mathematics for science and technology,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",84fea7b4cf86cdd4e31e3272572372972fb81fe2
7,s315,Metals in medicine,"b'<?xml version=""1.0"" encoding=""UTF-8""?>\n<Ite...",c2c90459369d82e28e768dfd9072047eab95be4d
8,SM123,Particle physics,"b'<?xml version=""1.0"" encoding=""utf-8""?>\n<?sc...",4095122554b7cc3cff824f31c3cf531087e63b2c
9,S112,Scales in space and time,"b'<?xml version=""1.0"" encoding=""UTF-8""?>\n<?sc...",75a013ae7e703481e8f0e05bde38d6d71fa732b6


In [3]:
from lxml import etree
import pandas as pd

# Grab an OU-XML file that is known to contain equation items
# Maybe also: Teaching mathematics
equation_xml_raw = pd.read_sql("SELECT xml FROM xml WHERE name='Discovering chemistry'",
                               con=db.conn).loc[0, "xml"]

# Parse the XML into an xml object
root = etree.fromstring(equation_xml_raw)

## Extracting Equation Items

We can trivially extract equation items from a single OU-XML document object:

In [4]:
from xml_utils import unpack

def get_equation_items(root):
    """Extract equations from an OU-XML XML object."""
    return [unpack(eq) for eq in root.xpath('//Equation')]

What do we get?

In [5]:
get_equation_items(root)[:3]

[b'<Equation xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><MathML><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mmultiscripts><mrow><mi>X</mi></mrow><mprescripts/><mrow><mi>Z</mi></mrow><mrow><mi>A</mi></mrow></mmultiscripts></mrow></math></MathML></Equation>',
 b'<Equation xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Image>K<sup>+</sup>, Ca<sup>2+</sup>, Al<sup>3+</sup>, S<sup>2-</sup>, F<sup>-</sup> and Br<sup>-</sup></Image></Equation>',
 b'<Equation xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><MathML><math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle mathvariant="normal"><mrow><mstyle mathvariant="normal"><mrow><mi>C</mi><mi>u</mi><mo>(</mo><mi>s</mi><mo>)</mo><mo>+</mo><msub><mrow><mn>4</mn><mi>H</mi><mi>N</mi><mi>O</mi></mrow><mrow><mn>3</mn></mrow></msub><mo>(</mo><mi>a</mi><mi>q</mi><mo>)</mo></mrow></mstyle><mo>=</mo><msub><mrow><msub><mrow><mstyle mathvariant="normal"><mrow><mi>C</mi><mi>u</mi></mrow></mstyle><mo>(</mo><mstyle math

Two of these three equations are represented using MathML; the second item wraps its content in an `<Image>` element instead.
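Since the sample output mixes `<MathML>` and `<Image>` children, it may be useful to count how each `<Equation>` is represented before processing a whole document. This is a rough sketch that assumes `lxml` is available; the helper name `count_equation_child_types` and the `demo_root` document are invented for illustration.

```python
from collections import Counter

from lxml import etree


def count_equation_child_types(root):
    """Count the tag names of the direct children of every <Equation> element."""
    return Counter(
        child.tag
        for eq in root.xpath("//Equation")
        for child in eq
        if isinstance(child.tag, str)  # skip comments and processing instructions
    )


# Made-up two-equation document standing in for a parsed OU-XML file
demo_root = etree.fromstring(
    b"<Item>"
    b"<Equation><MathML><math xmlns='http://www.w3.org/1998/Math/MathML'>"
    b"<mi>X</mi></math></MathML></Equation>"
    b"<Equation><Image>K<sup>+</sup></Image></Equation>"
    b"</Item>"
)

print(count_equation_child_types(demo_root))
```

Calling `count_equation_child_types(root)` on the parsed document from earlier should give a quick indication of how many equation items would need handling other than as MathML.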

Let's just get the `<math>` part from one of the equations:

In [6]:
import re

# Get an example equation element (the third item, which uses MathML)
eq = get_equation_items(root)[2].decode()

# Extract the <math>...</math> component
eq = re.findall(r'.*<MathML>(.*)</MathML>.*', eq)[0]

eq

'<math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle mathvariant="normal"><mrow><mstyle mathvariant="normal"><mrow><mi>C</mi><mi>u</mi><mo>(</mo><mi>s</mi><mo>)</mo><mo>+</mo><msub><mrow><mn>4</mn><mi>H</mi><mi>N</mi><mi>O</mi></mrow><mrow><mn>3</mn></mrow></msub><mo>(</mo><mi>a</mi><mi>q</mi><mo>)</mo></mrow></mstyle><mo>=</mo><msub><mrow><msub><mrow><mstyle mathvariant="normal"><mrow><mi>C</mi><mi>u</mi></mrow></mstyle><mo>(</mo><mstyle mathvariant="normal"><mrow><mi>N</mi><mi>O</mi></mrow></mstyle></mrow><mrow><mn>3</mn></mrow></msub><mo>)</mo></mrow><mrow><mn>2</mn></mrow></msub><mstyle mathvariant="normal"><mrow><mo>(</mo><mi>a</mi><mi>q</mi><mo>)</mo></mrow></mstyle><mo>+</mo><msub><mrow><mstyle mathvariant="normal"><mrow><mn>2</mn><mi>N</mi><mi>O</mi></mrow></mstyle></mrow><mrow><mn>2</mn></mrow></msub><mstyle mathvariant="normal"><mrow><mo>(</mo><mi>g</mi><mo>)</mo></mrow></mstyle><mo>+</mo><msub><mrow><mn>2</mn><mi>H</mi></mrow><mrow><mn>2</mn></mrow></msub><mi>O</mi><mo>
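The regular expression works for this example, but indexing its result with `[0]` raises an `IndexError` for an item with no `<MathML>` child (such as the `<Image>` item above), and `.` does not match newlines by default. As a possible alternative, here is a sketch that parses the serialized equation and selects `<math>` elements by namespace. It assumes the input is the kind of serialized `<Equation>` bytes shown earlier; `extract_math_markup` and `demo_equation` are illustrative names, not part of `xml_utils`.

```python
from lxml import etree

# Prefix "m" is arbitrary; the URI is the standard MathML namespace
MATHML_NS = {"m": "http://www.w3.org/1998/Math/MathML"}


def extract_math_markup(equation_xml):
    """Return each <math> element inside a serialized <Equation> as a string."""
    eq_root = etree.fromstring(equation_xml)
    math_nodes = eq_root.xpath(".//m:math", namespaces=MATHML_NS)
    # with_tail=False drops any text that follows the closing </math> tag
    return [
        etree.tostring(node, encoding="unicode", with_tail=False)
        for node in math_nodes
    ]


# Made-up minimal equation item for demonstration
demo_equation = (
    b'<Equation><MathML><math xmlns="http://www.w3.org/1998/Math/MathML">'
    b"<mi>X</mi></math></MathML></Equation>"
)

print(extract_math_markup(demo_equation))
```

An equation item with no MathML simply yields an empty list, so the caller can decide how to handle image-based equations.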

In Firefox at least, we can render the `<math>` *MathML* markup text directly:

In [7]:
from IPython.display import HTML

# This works in Firefox at least
HTML(eq)

To explore:

- https://github.com/bowang/mathml2latex ?
- https://www.geeksforgeeks.org/html5-mathml-display-attribute/ ?
- other MathML parsers?