# MathML alt-text with TeX issue on IE #378

Closed
opened this issue Jan 21, 2013 · 20 comments

Member

### pkra commented Jan 21, 2013

Our friends at Project Euclid have found a weird bug. On this page, equation 0.1 is not rendered in IE8 (they reported it for IE9, but I didn't check). Instead of the rendered equation, the TeX code from the alt-text is displayed, as if it had failed to render. The MathJax menu will actually try to give you TeX source, which indicates some kind of mix-up. Strangely enough, that equation on its own has no issues.
Member

### dpvc commented Jan 23, 2013

The page has TeX code as its input format, so there is no MathML or alt-text initially (at least for me, but I haven't tried in IE yet, and perhaps they are sending different content depending on the browser). I don't see any way that MathJax would turn the alt-text of a MathML element into a TeX element jax, so I'm wondering if you are misinterpreting the situation?

On the other hand, the equation isn't rendering, so something is going on. I just checked the equation, and the problem is the line breaks, which they have as `\\ [-9pt]`. First, the space shouldn't be there (it should be `\\[-9pt]`), but since they are using MathJax v2.0, which didn't handle the space properly, it is being treated as though the space isn't there. On the other hand, v2.0 didn't allow negative dimensions in the brackets, so it is the `-9pt` that is causing the equation to not be displayed.

Version 2.1 resolves the negative dimension problem (and the space problem), so updating to v2.1 would allow the equation to be displayed, though they would still need to remove the spaces after `\\` to get the effect they seem to want.
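As a minimal illustration of the two spellings discussed above (the array contents are made up for the example and are not taken from the Project Euclid page), the first form passes `-9pt` as the optional argument of `\\`, while the second has the stray space:

```latex
% Intended form: the optional argument follows \\ immediately,
% so the row spacing is reduced by 9pt.
\begin{array}{l}
  a = b + c \\[-9pt]
  d = e + f
\end{array}

% Form reported on the page: a space separates \\ from the bracket.
% Per the comment above, v2.0 ignored the space but rejected the
% negative dimension, so the equation was not displayed.
\begin{array}{l}
  a = b + c \\ [-9pt]
  d = e + f
\end{array}
```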
Member Author

### pkra commented Jan 23, 2013

> The page has TeX code as its input format, so there is no MathML or alt-text initially (at least for me, but I haven't tried in IE yet, and perhaps they are sending different content depending on the browser).

This is not what I see. The HTML source of the link above contains MathML code, and the MathJax menu (on Chrome) offers only MathML code.

The equation I mentioned reads

Ω| u|p|x| αpdxΛn ,p,αΩ|u (x)|p|x |nA1(|x |)pdx +CΩ| u(x)|p|x |nA1(|x |)pA2 (|x|)2 dx

MathJax formatting is

Ω | u | p | x | α p d x Λ n , p , α Ω | u ( x ) | p | x | n A 1 ( | x | ) p d x + C Ω | u ( x ) | p | x | n A 1 ( | x | ) p A 2 ( | x | ) 2 d x
Member

### dpvc commented Jan 23, 2013

OK, it looks like they are sending different content to different browsers. Not sure why. I see TeX source in Safari, but MathML in Chrome.

Note that MathJax uses the ALTTEXT value as the preview for MathML input, and so that is why you are seeing TeX code in the page. But you should not have a TeX menu item. The only thing I can think of that might be happening is the following: they use a custom configuration file that loads both TeX and MathML input (regardless of which they are using), and so it may be that if the MathML input jax runs first, it will insert the TeX code as the preview, and the TeX input jax is picking that up and treating that as additional math.

I'll check into that possibility to see if that could be it. If that is it, it would be easy to add `MathJax_Preview` to the `ignoreClass` list, so that math inside previews would be ignored.
Member

### dpvc commented Jan 23, 2013

I've checked the code, and it does look like the issue could be due to the mml2jax preprocessor running before the tex2jax one. In that case, the tex2jax preprocessor will try to process `\begin{}...\end{}` blocks found inside the previews generated by mml2jax. There are two ways around this:

- Use `ignoreClass: "tex2jax_ignore|MathJax_Preview"` in the tex2jax section of the configuration file, or
- Use `preview: "none"` or a fixed preview string in the mml2jax section of the configuration.
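The two workarounds could look roughly like this in a MathJax v2 configuration. This is a sketch only: the variable names `configIgnorePreviews` and `configNoPreview` are made up for the example, and the rest of the site's custom configuration file is not known, so only the two relevant options are shown.

```javascript
// Option 1: keep the alttext preview, but tell tex2jax to skip anything
// inside an element carrying the preview class.
var configIgnorePreviews = {
  tex2jax: {
    ignoreClass: "tex2jax_ignore|MathJax_Preview"
  }
};

// Option 2: do not create a preview from the MathML at all, so there is
// no TeX-looking text for tex2jax to pick up.
var configNoPreview = {
  mml2jax: {
    preview: "none"
  }
};

// Apply one of the two, and only when MathJax v2 is actually loaded.
if (typeof MathJax !== "undefined" && MathJax.Hub) {
  MathJax.Hub.Config(configIgnorePreviews);
}
```

Either option alone should be enough; the first keeps the preview visible while the page loads, the second trades the preview away for simplicity.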
Member Author

### pkra commented Jan 23, 2013

 Thanks, Davide. That explains things (and how weird is that Safari vs Chrome setup...) I'll let them know.
Member

### dpvc commented Jan 24, 2013

I tried it in IE9 and IE8.

In IE9, I get sent the TeX version, so the issue there is the incorrect use of `\\ [-9pt]`. Removing the space and updating to MathJax v2.1 would resolve that. This accounts for the boxed TeX code in place of the displayed equation.

In IE8, I get sent the MathML (as you did), but I guess the tex2jax and mml2jax preprocessors ran in the other order (it depends on which arrives over the network first), and I did see the displayed equation. However, it wasn't displayed as expected. This is because they used self-closing tags, and non-HTML5 browsers like IE8 don't handle these properly. The strange thing is that your version above doesn't use self-closing tags, so I'm not sure why I got them and you didn't.

I really am not sure what the deal is with different browsers getting different versions of the code (unless they have multiple servers and you are assigned one at random and they don't all have the same versions of the pages). Perhaps you can find out what is going on with that.

The MathML versions are not as good as the TeX versions, since they don't properly isolate their stretchy delimiters, and so they get oversized parentheses and absolute values.
Member Author

### pkra commented Jan 25, 2013

 Doesn't MathJax sanitize MathML? I remember grabbing a MathML-heavy page for my epub experiments a while ago, and some MathML didn't validate -- yet MathJax's "show mathml" did. [Or in other words: I used the MathJax menu to get the above, not the page source code]

### PaulTopping commented Jan 25, 2013

This sanitization is a natural result of MathJax parsing the MathML and throwing away anything that it doesn't understand. When MathML is generated from the tree, the bad MathML in the input will be missing. When MathML is the input to MathJax, there are really two MathML instances: the original MathML and the as-rendered MathML. It could be argued that what the user should see in Show MathML is the original MathML when MathJax's input is MathML. After all, as-rendered MathML will only contain the subset of MathML that MathJax understands. Even if MathJax rendered all MathML constructs, there would still be differences between these two MathML chunks.
Member

### dpvc commented Jan 25, 2013

Yes, but only when it gets valid MathML to start with. The problem is that pre-HTML5 browsers don't handle self-closing tags properly: they treat them as open tags, and how they handle the "missing" close tag varies from browser to browser. So, for example, a self-closing tag will be treated as an open tag with no closing tag, and that means that rather than getting an empty element, the browser will get an element whose content is the tags that follow. If another tag follows, it can end up improperly nested (or the browser may recognize the missing close tag at this point and supply it). The end result is that self-closing tags can lead to improper MathML tree structure within the DOM.

That occurs long before MathJax views the page, and so it receives a damaged DOM to work with. It can't tell which tags were originally self-closing and where the damage is, so it really can't fix this itself. It does apply some heuristics to try to undo the damage, but it is not always possible to tell, and tables are one of the places where things go wrong in ways that MathJax can't undo.

Your page does not use self-closing tags in the original page source, so it doesn't have the self-closing problem, and MathJax was able to get the correct MathML structure. The page I received had self-closing tags and so got completely messed up. I don't know why we each got a different version, nor why they are shipping different versions at all, but that seems to be what is happening.
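The effect described above can be imitated with a toy tag parser. This is only an illustrative model: the tag names `row` and `cell` are invented, the `legacy` flag is a stand-in for "ignores the trailing slash", and real pre-HTML5 browsers differ in how they recover from the missing close tag.

```javascript
// Toy model only. Parses a flat string of tags into a tree. When `legacy`
// is true, the "/" of a self-closing tag is ignored, so <cell/> behaves
// like an unclosed <cell>.
function parseTags(source, legacy) {
  var root = { name: "#root", children: [] };
  var stack = [root];
  var tagPattern = /<(\/?)([a-z]+)\s*(\/?)>/g;
  var match;
  while ((match = tagPattern.exec(source)) !== null) {
    var isClose = match[1] === "/";
    var name = match[2];
    var isSelfClosing = match[3] === "/";
    if (isClose) {
      // Pop back to the nearest open element with this name, if any.
      for (var i = stack.length - 1; i > 0; i--) {
        if (stack[i].name === name) { stack.length = i; break; }
      }
    } else {
      var node = { name: name, children: [] };
      stack[stack.length - 1].children.push(node);
      // The element stays open unless it is self-closing AND the parser
      // honors self-closing syntax.
      if (!isSelfClosing || legacy) stack.push(node);
    }
  }
  return root;
}

// Render the tree as a compact outline, e.g. "row(cell cell)".
function outline(node) {
  var inner = node.children.map(outline).join(" ");
  return inner ? node.name + "(" + inner + ")" : node.name;
}

var source = "<row><cell/><cell></cell></row>";
var modern = outline(parseTags(source, false));       // two sibling cells
var legacyResult = outline(parseTags(source, true));  // second cell nested in the first
```

In the legacy case the second `cell` becomes a child of the first, which is the kind of damaged tree a script would then find in the DOM with no record of what the source originally said.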
Member

### dpvc commented Jan 25, 2013

@PaulTopping: MathJax provides both the original MathML (as it sees it in the DOM) and its internal version in the Show Math menu. The main difference between the two is the line breaking and indenting in the latter, which makes it easier to read (most source MathML that I've seen is one long string). The latter will also have the namespace prefix (if any) removed, and will add the `xmlns` attribute to the `<math>` tag, but other than that, they should be identical. The real issue underlying Peter's comment is not a difference between the original and the internal, but a difference between the source HTML file's text and the parsed DOM when an HTML page contains self-closing tags and is viewed in a non-HTML5 browser.
Member Author

### pkra commented Jan 25, 2013

 @dpvc ah, thanks. I didn't understand that that's the problem. What would I do without IE to "learn" from ;)

### PaulTopping commented Jan 25, 2013

Davide, thanks for the explanation. I was focusing on Peter's comment in isolation rather than the problem he was trying to solve. My mistake. MathPlayer has to deal with the same problem, since its input is the already-parsed DOM rather than the original MathML text from the HTML source. Does MathJax retain unknown/bad attributes and tags within the MathML and pass them through in its internal form? How about MathML it doesn't yet implement, or elements that, presumably, it doesn't care about? Just wondering. Paul
Member

### dpvc commented Jan 25, 2013

MathJax passes unknown and bad attributes through to the internal format. It also passes bad tree structure (in the sense of wrong number of children, missing children, wrong type of children) when the nodes are types that it understands. So, for example, if the original had an element with a child of the wrong type, that would be preserved. MathJax does know about the elements you mention and handles them properly. If the MathML includes nodes that MathJax doesn't understand, however, it generates an error. That could be handled differently, and I've been thinking about changing that, in particular since HTML5 allows HTML nodes inside some MathML elements, which MathJax currently doesn't support.

Contributor

### fred-wang commented Apr 4, 2013

Hi all. So, if we exclude the errors in their TeX source and the IE self-closing tag bug, what remains here? Would it be possible to have a minimal test case?
Member

### dpvc commented Apr 4, 2013

I believe the only issue left is the tex2jax preprocessor possibly processing TeX code that has been placed in the `MathJax_Preview` for the MathML from the `<math>` tag's `alttext` attribute (when the mml2jax preprocessor runs first).

The easy solution to this is to set `ignoreClass` to `"tex2jax_ignore|MathJax_Preview"` in the tex2jax configuration (and default.js) so that previews will not be processed. (A similar change should be made to all three preprocessors.)

A better solution would be to add the preview class to the ignore pattern when that pattern is created in tex2jax. This would allow you to use the `preRemoveClass` value rather than a hard-coded `MathJax_Preview`, since that is something that can be configured by the page author. Similar changes should be made to all three preprocessors, so that none will try to process previews created by any of the others (or by the page author).
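A rough sketch of the "better solution" follows. It is not MathJax's actual source: `buildIgnorePattern` and the flat `config` object are assumptions made for the example, and only the idea of folding the configured preview class into the ignore test is taken from the comment above.

```javascript
// Sketch only: combine the author's ignoreClass with the configured
// preview class (preRemoveClass) into a single class-name test.
function buildIgnorePattern(config) {
  var classes = config.ignoreClass;
  if (config.preRemoveClass) {
    classes += "|" + config.preRemoveClass;
  }
  // Match any listed class as a whole space-separated class name.
  return new RegExp("(^| )(" + classes + ")( |$)");
}

var pattern = buildIgnorePattern({
  ignoreClass: "tex2jax_ignore",
  preRemoveClass: "MathJax_Preview"
});

var skipsPreview = pattern.test("MathJax_Preview");      // preview element
var skipsIgnored = pattern.test("note tex2jax_ignore");  // author-marked element
var processesBody = pattern.test("article-body");        // ordinary content
```

Because the preview class is read from the configuration rather than hard-coded, a page author who renames it would not have to remember to update `ignoreClass` as well.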
Contributor

### fred-wang commented Apr 5, 2013

 OK, I'm not able to reproduce the problem but I've added a test for alttext MathMLToDisplay/issue378.html ==> In testsuite
Member

### dpvc commented Apr 17, 2013

 The issue378 branch of my fork of MathJax includes the fix to prevent tex2jax from processing the previews generated by mml2jax (or any previews for that matter). A test case would be to include \begin{array}...\end{array} and see if it gets processed or not. (It should not.)
Contributor

### fred-wang commented Apr 19, 2013

My Windows virtual machine is very slow at the moment, so I won't test it locally and will just update the test MathMLToDisplay/issue378.html. It will compare \begin{array}...\end{array}  with \begin{array}...\end{array} 
Contributor

### dpvc pushed a commit to dpvc/MathJax that referenced this issue Apr 19, 2013

 Merge branch 'issue378' into develop 
Resolves issue mathjax#378.
 6e2b806 
Member

### dpvc commented Apr 19, 2013

 Test looks good. => Merged