LaTeXML generates invalid content MML #928

Closed
AndreG-P opened this Issue Jan 13, 2018 · 5 comments

AndreG-P commented Jan 13, 2018

Just as an example: \mbox{x} + y will generate the following MML tree:

<math xmlns="http://www.w3.org/1998/Math/MathML" id="p1.1.m1.1" class="ltx_Math" alttext="\mbox{x}+y" display="inline">
  <semantics id="p1.1.m1.1a">
    <mrow id="p1.1.m1.1.4" xref="p1.1.m1.1.4.cmml">
      <mtext id="p1.1.m1.1.1" xref="p1.1.m1.1.1.cmml">x</mtext>
      <mo id="p1.1.m1.1.2" xref="p1.1.m1.1.2.cmml">+</mo>
      <mi id="p1.1.m1.1.3" xref="p1.1.m1.1.3.cmml">y</mi>
    </mrow>
    <annotation-xml encoding="MathML-Content" id="p1.1.m1.1b">
      <apply id="p1.1.m1.1.4.cmml" xref="p1.1.m1.1.4">
        <plus id="p1.1.m1.1.2.cmml" xref="p1.1.m1.1.2"/>
        <mtext id="p1.1.m1.1.1.cmml" xref="p1.1.m1.1.1">x</mtext>
        <ci id="p1.1.m1.1.3.cmml" xref="p1.1.m1.1.3">𝑦</ci>
      </apply>
    </annotation-xml>
    <annotation encoding="application/x-tex" id="p1.1.m1.1c">\mbox{x}+y</annotation>
  </semantics>
</math>

However, according to the MathML 3 DTD, mtext is not a valid child of the apply element in content MathML:
https://www.w3.org/Math/DTD/mathml3/mathml3.dtd

AndreG-P commented Jan 13, 2018

For the DRMF we use the following workaround:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet exclude-result-prefixes="ltx" version="1.0" xmlns:ltx="http://dlmf.nist.gov/LaTeXML"
                xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- xpath-default-namespace is an XSLT 2.0 attribute and was dropped here: this is a 1.0 stylesheet and every match pattern already uses the m: prefix -->
    <!-- Include all LaTeXML to xhtml modules -->
    <xsl:import href="LaTeXML-common.xsl"/>
    <xsl:import href="LaTeXML-math-xhtml.xsl"/>
    <xsl:template match="m:csymbol[@cd='ambiguous' and text() = 'superscript']">
        <xsl:element name="power" namespace="{$mml_ns}">
            <xsl:attribute name="id">
                <xsl:value-of select="@xml:id"/>
            </xsl:attribute>
            <xsl:copy-of select="@xref"/>
        </xsl:element>
    </xsl:template>
    <xsl:template match="m:apply//m:mtext">
        <xsl:element name="ci" namespace="{$mml_ns}">
            <xsl:attribute name="id">
                <xsl:value-of select="@xml:id"/>
            </xsl:attribute>
            <xsl:copy-of select="@xref"/>
            <!-- merges subsequent text elements -->
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>
    <xsl:template match="m:csymbol[@cd='ambiguous' and text() = 'subscript']">
        <!-- interpret subscripts as parameters -->
    </xsl:template>
    <xsl:template match="m:csymbol[@cd='dlmf' and text() = 'apply-upper-index']"/>
    <xsl:template match="m:csymbol[@cd='dlmf' and text() = 'apply-infix-operator']"/>
</xsl:stylesheet>

dginev added this to the LaTeXML-0.8.4 milestone Jan 13, 2018

brucemiller (Owner) commented Jan 31, 2018

Are you replacing the mtext by ci or wrapping it? It seems the former, but that makes too many assumptions about the intended meaning. Where mtext (or other random pmml) arises, you'd want to wrap it in a ci to be (more) valid, but that still defers the whole question of meaning. In these cases you've got pretty much no handle on the meaning, though.

AndreG-P commented Feb 7, 2018

I agree, our workaround isn't right. However, we needed a quick fix for the invalid content MathML and came up with this idea. It looks like a more complicated problem than we thought.

brucemiller (Owner) commented Feb 20, 2018

Well, there's no real "right" without knowing the semantic intent; wrapping in an m:ci is pretty much the only solution (but I think better to wrap than to replace). The patch was actually quite easy. Figuring out the semantics is a bit harder :> Thanks for the report!
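To illustrate the wrap-versus-replace distinction, here is a minimal standalone XSLT 1.0 sketch. It is not the actual LaTeXML patch. It assumes it is run over the final MathML output and only handles an mtext that is a direct child of apply. MathML 3 allows presentation markup inside ci, so the mtext is kept intact and nothing is claimed about its meaning. Whether the new ci should carry its own id and xref is left open here.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="http://www.w3.org/1998/Math/MathML"
                exclude-result-prefixes="m">

    <!-- Identity transform: copy everything unchanged by default -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Wrap (do not replace) an mtext that sits directly inside apply.
         The mtext keeps its attributes and content; only a ci is added around it. -->
    <xsl:template match="m:apply/m:mtext">
        <xsl:element name="ci" namespace="http://www.w3.org/1998/Math/MathML">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:element>
    </xsl:template>
</xsl:stylesheet>
```

Run with any XSLT 1.0 processor, for example `xsltproc wrap-mtext.xsl input.xhtml` (both file names are hypothetical). On the example from the first comment, the mtext inside apply would come out as `<ci><mtext id="p1.1.m1.1.1.cmml" xref="p1.1.m1.1.1">x</mtext></ci>`, while the presentation-side mtext under mrow is left alone.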

brucemiller (Owner) commented Feb 20, 2018

Incidentally, while I can see that, from a quick-n-dirty programming point of view, it's very convenient to build your (rather extreme) assumptions about superscripts being powers into the XSL stylesheets, it seems to me rather dangerous (easily forgotten) and not really the design approach I'd take. I'd rather examine LaTeXML's XML tree, looking for candidate assumptions (with the possibility to then actually look at each case to determine the quality of the guess), and then upgrade the meaning as appropriate. Considerably more programming, for sure, but it seems more powerful in the long run.
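A rough sketch of the "find candidates first, decide later" idea: instead of rewriting every ambiguous superscript into a power, a stylesheet can simply list them for review. This is only an illustration under stated assumptions. It inspects the generated content MathML rather than LaTeXML's own XML tree, it relies on the `csymbol cd="ambiguous"` / `superscript` marking used in the workaround above, and it assumes the output carries `id` and `alttext` attributes as in the example from the first comment.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="http://www.w3.org/1998/Math/MathML">
    <xsl:output method="text" encoding="utf-8"/>

    <!-- Suppress the built-in rule that would copy all text nodes to the output -->
    <xsl:template match="text()"/>

    <!-- Report each apply whose operator is an ambiguous superscript; change nothing -->
    <xsl:template match="m:apply[*[1][self::m:csymbol and @cd='ambiguous' and text()='superscript']]">
        <xsl:text>candidate power: id=</xsl:text>
        <xsl:value-of select="@id"/>
        <xsl:text> in formula: </xsl:text>
        <xsl:value-of select="ancestor::m:math/@alttext"/>
        <xsl:text>&#10;</xsl:text>
        <!-- Keep descending so nested superscripts are reported too -->
        <xsl:apply-templates/>
    </xsl:template>
</xsl:stylesheet>
```

The resulting list can be checked case by case, and only the superscripts that really are powers would then be upgraded, which keeps the assumption visible instead of burying it in the rendering stylesheet.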

dginev modified the milestones: LaTeXML-0.8.4, LaTeXML-0.8.3 Feb 25, 2018
