MathML -> Tex conversion, mfenced element: ( | #177

Closed
felix-smashdocs opened this issue Aug 13, 2021 · 5 comments

@felix-smashdocs

Hello texmath-team,

This issue is about converting the <mfenced> element from MathML to TeX.
I have the following example:

[Image: initial_formula]

which is represented in MathML as:

<math>
    <mfenced separators="|">
        <mrow>
            <mi>A</mi>
        </mrow>
        <mrow>
            <mi>B</mi>
        </mrow>
    </mfenced>
    <mi></mi>
    <mfenced close="|" open="|">
        <mrow>
            <mi>C</mi>
            <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>
        </mrow>
    </mfenced>
</math>

When I convert this MathML to TeX at https://johnmacfarlane.net/texmath.html, I get the following result:

\left( A \middle| B \right)\left| {C\left| D \middle| E \right|} \right|

which looks like

[Image: result_formula]

The parentheses () around D and E have been turned into vertical lines ||. I expected the formula to look like the initial version.
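
For comparison, the output one would expect is presumably the same expression with the parentheses kept around D and E, assuming the inner mfenced falls back to the default delimiters when it specifies no open or close:

```latex
\left( A \middle| B \right)\left| {C\left( D \middle| E \right)} \right|
```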

Is this a bug in the texmath lib?

Best regards,

Felix

@jgm
Owner

jgm commented Aug 13, 2021

It took me a while to figure out why the parser was doing this!

% texmath -f mathml -t native
<math>
     <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>

</math>
[EDelimited "(" ")" [Right (EIdentifier "D"),Left "|",Right (EIdentifier "E")]]

(here we get the correct parentheses for open and close), but when we embed this in the outer mfenced:

% texmath -f mathml -t native
<math>
    <mfenced separators="|">
        <mrow>
            <mi>A</mi>
        </mrow>
        <mrow>
            <mi>B</mi>
        </mrow>
    </mfenced>
    <mi></mi>
    <mfenced close="|" open="|">
        <mrow>
            <mi>C</mi>
            <mfenced separators="|">
                <mrow>
                    <mi>D</mi>
                </mrow>
                <mrow>
                    <mi>E</mi>
                </mrow>
            </mfenced>
        </mrow>
    </mfenced>
</math>
[EDelimited "(" ")" [Right (EIdentifier "A"),Left "|",Right (EIdentifier "B")],EIdentifier "",EDelimited "|" "|" [Right (EGrouped [EIdentifier "C",EDelimited "|" "|" [Right (EIdentifier "D"),Left "|",Right (EIdentifier "E")]])]]

Now we get | for open and close!

The reason, it seems, is that the MathML reader stores the attributes of outer elements in state and uses them when parsing children in some cases. The relevant code is here (Text.TeXMath.Readers.MathML, line 599):

findAttrQ :: String -> Element -> MML (Maybe T.Text)
findAttrQ s e = do
  inherit <- asks (lookupAttrQ s . attrs)
  return $ fmap T.pack $
    findAttr (QName s Nothing Nothing) e
      <|> inherit

So what's happening here is that the inner mfenced is inheriting the outer one's open and close attributes, since it doesn't explicitly specify them. Clearly, that's not what should be happening: rather, open and close should receive default values.
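
To make the effect concrete, here is a minimal sketch of the lookup pattern described above. It is not the texmath code: `Attrs`, `lookupInheriting`, and `openDelim` are hypothetical names, and attributes are modeled as plain association lists rather than the XML library's types. It only illustrates how an "own attribute, else inherited" lookup lets the outer `open="|"` leak into an inner mfenced that specifies neither open nor close.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Attributes as simple name/value pairs (a stand-in for the real XML types).
type Attrs = [(String, String)]

-- Hypothetical model of the inheriting lookup: the element's own attribute
-- wins, otherwise fall back to whatever the enclosing elements carried.
lookupInheriting :: Attrs -> Attrs -> String -> Maybe String
lookupInheriting inherited own name = lookup name own <|> lookup name inherited

-- Opening delimiter of an mfenced, with "(" as the default when nothing is found.
openDelim :: Attrs -> Attrs -> String
openDelim inherited own = fromMaybe "(" (lookupInheriting inherited own "open")

-- The outer mfenced from the example and the inner one nested inside it.
outerAttrs, innerAttrs :: Attrs
outerAttrs = [("open", "|"), ("close", "|")]
innerAttrs = [("separators", "|")]

-- Parsed on its own, the inner element gets the default "(".
standalone :: String
standalone = openDelim [] innerAttrs

-- Parsed inside the outer element, it picks up the outer "|" instead.
nested :: String
nested = openDelim outerAttrs innerAttrs
```

In GHCi, `standalone` and `nested` correspond to the two native outputs shown earlier: the first keeps the parenthesis, the second does not.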

This code was added long ago by @mpickering - I wonder if he can remember why we have this inherit?

@jgm
Owner

jgm commented Aug 13, 2021

When I change this inheritance so it doesn't accumulate attributes from parents, I see a number of test failures, e.g. in munder5.mml

    <munder accentunder="false">
      <mi>x</mi> 
      <mo> &#x02DC;</mo> 
    </munder> 

we get ESymbol Accent "\732" instead of ESymbol Ord "\732", apparently because the attribute accentunder="false" is not getting seen. On a quick glance, all the test failures are like this, so maybe this was the reason for the inheritance? Something more limited should then work?
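
One way such a more limited scheme might look is sketched below. This is an illustration under stated assumptions, not the change that was eventually committed: `lookupLimited` and `nonInherited` are hypothetical names, attributes are modeled as association lists, and the list of non-inheriting attributes is deliberately incomplete rather than derived from the spec.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Attributes as simple name/value pairs (a stand-in for the real XML types).
type Attrs = [(String, String)]

-- Assumed, incomplete list of attributes that must not be taken from
-- enclosing elements; not an exhaustive reading of the MathML spec.
nonInherited :: [String]
nonInherited = ["open", "close"]

-- Hypothetical limited lookup: own attributes always win; the inherited
-- environment is consulted only for names outside nonInherited.
lookupLimited :: Attrs -> Attrs -> String -> Maybe String
lookupLimited inherited own name
  | name `elem` nonInherited = lookup name own
  | otherwise                = lookup name own <|> lookup name inherited

-- Inner mfenced nested in an outer one with open="|": falls back to "(".
innerOpen :: String
innerOpen =
  fromMaybe "(" (lookupLimited [("open", "|"), ("close", "|")] [("separators", "|")] "open")

-- The mo child of <munder accentunder="false"> still sees the parent's value.
accentUnder :: Maybe String
accentUnder = lookupLimited [("accentunder", "false")] [] "accentunder"
```

The design choice here is a deny-list: everything keeps inheriting as before, so cases like munder5.mml are unaffected, and only the named attributes are cut off from their ancestors.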

@jgm
Owner

jgm commented Aug 13, 2021

Looking at the spec, it seems that many attributes inherit, but these (open and close on mfenced) don't.

@jgm closed this as completed in 901c10c on Aug 13, 2021
@felix-smashdocs
Author

Cool, thanks a lot @jgm for the answer and the solution! Works for me.
Interesting that the inheritance behaviour is defined differently for different attributes.

@jgm
Owner

jgm commented Aug 18, 2021

Yeah, there may be other problems of this kind, but I couldn't find a handy table of which attributes inherit and which don't -- and I didn't have time to comb through the whole spec exhaustively.
