latex -> html creates invalid {aligned*} from eqnarray, align environments #423

Closed
opened this Issue Feb 19, 2012 · 8 comments


ketch commented Feb 19, 2012

 When converting latex to html, pandoc turns \begin{align} x=y \end{align}  into 

\begin{aligned*} x=y \end{aligned*}

 But aligned* is not a valid environment. It does the same thing to eqnarray. It should use align* or aligned instead. Indeed, there seems to be no reason to change the environment. It should just produce 

\begin{align} x=y \end{align}



 It's worth noting that this fixes the issue (so that, for example, MathJax can render it using its AMSMath library), but I'm not sure what it breaks:

    diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
    index 5e69347..f470910 100644
    --- a/src/Text/Pandoc/Readers/LaTeX.hs
    +++ b/src/Text/Pandoc/Readers/LaTeX.hs
    @@ -682,16 +682,16 @@ environments = M.fromList
       , ("displaymath", mathEnv Nothing "displaymath")
       , ("equation", mathEnv Nothing "equation")
       , ("equation*", mathEnv Nothing "equation*")
    -  , ("gather", mathEnv (Just "gathered") "gather")
    -  , ("gather*", mathEnv (Just "gathered") "gather*")
    -  , ("multiline", mathEnv (Just "gathered") "multiline")
    -  , ("multiline*", mathEnv (Just "gathered") "multiline*")
    -  , ("eqnarray", mathEnv (Just "aligned*") "eqnarray")
    -  , ("eqnarray*", mathEnv (Just "aligned*") "eqnarray*")
    -  , ("align", mathEnv (Just "aligned*") "align")
    -  , ("align*", mathEnv (Just "aligned*") "align*")
    -  , ("alignat", mathEnv (Just "aligned*") "alignat")
    -  , ("alignat*", mathEnv (Just "aligned*") "alignat*")
    +  , ("gather", mathEnv (Just "gather") "gather")
    +  , ("gather*", mathEnv (Just "gather*") "gather*")
    +  , ("multiline", mathEnv (Just "multiline") "multiline")
    +  , ("multiline*", mathEnv (Just "multiline*") "multiline*")
    +  , ("eqnarray", mathEnv (Just "eqnarray") "eqnarray")
    +  , ("eqnarray*", mathEnv (Just "eqnarray*") "eqnarray*")
    +  , ("align", mathEnv (Just "align") "align")

Owner

jgm commented Feb 20, 2012

 Note that $$\begin{align} x=y \end{align}$$ is not valid LaTeX (with ams extensions). It may work in MathJax, but it won't compile with latex. The reason is that the align environment is a top-level math environment, not something that can be included inside $$...$$. The aligned environment, by contrast, can be used inside $$...$$. Since all pandoc display math elements are rendered in LaTeX inside $$...$$, we used aligned instead of align. Hope that makes sense. You are right, though, that aligned* does not exist. That is a mistake I'll fix.
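
 To make the distinction above concrete, here is a minimal LaTeX sketch. It assumes the amsmath package and uses the same x=y example as the thread; the `&` alignment points are added only for illustration. The failing case is left commented out so that the file compiles.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% Valid: aligned is an inner environment, so it may sit inside display math.
$$\begin{aligned} x &= y \end{aligned}$$

% Valid: align is a top-level environment that opens display math itself.
\begin{align} x &= y \end{align}

% Not valid: align nested inside display math delimiters.
% Uncommenting the next line makes the document fail to compile.
% $$\begin{align} x &= y \end{align}$$

\end{document}
```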

ketch commented Feb 20, 2012

 Thanks for getting to this so quickly. If one is converting to HTML for use with MathJax, then it is possible to configure MathJax to render \begin{align} x=y \end{align} (which is of course valid LaTeX) without any need for \[ \]. It would be nice if pandoc gave users the option to go this route. Note that the various environments that are all being converted to aligned behave in different ways, so something is lost in translation.

Owner

jgm commented Feb 20, 2012

 @ketch: Yes, I'm aware that something is lost in translation here. It can't be helped, though, with the current pandoc architecture. We parse a display math environment into Math DisplayMath x, and render it in LaTeX as $$x$$, and in other formats in different ways, so we need to convert align into aligned so it can go inside $$...$$. It may be that more radical changes are needed to the way math is stored in the pandoc AST, but if you want to suggest that, it would be best to open a separate issue.
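
 The round trip described above can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions: the types below are simplified stand-ins rather than pandoc's real definitions, and `readAlign` and `writeLaTeX` are hypothetical names, not functions from the pandoc source.

```haskell
-- Simplified stand-ins for pandoc's math nodes (assumption: the real
-- definitions in pandoc carry more constructors than shown here).
data MathType = DisplayMath | InlineMath
  deriving (Eq, Show)

data Inline = Math MathType String
  deriving (Eq, Show)

-- Hypothetical reader step: the top-level align environment is stored
-- as display math whose body uses the inner aligned environment.
readAlign :: String -> Inline
readAlign body =
  Math DisplayMath ("\\begin{aligned}" ++ body ++ "\\end{aligned}")

-- Hypothetical writer step: every display math node is wrapped in $$...$$,
-- which is why the stored body must be legal inside those delimiters.
writeLaTeX :: Inline -> String
writeLaTeX (Math DisplayMath x) = "$$" ++ x ++ "$$"
writeLaTeX (Math InlineMath x)  = "$" ++ x ++ "$"

main :: IO ()
main = putStrLn (writeLaTeX (readAlign " x=y "))
-- prints: $$\begin{aligned} x=y \end{aligned}$$
```

 Because the node keeps only the math string and a display/inline flag, the name of the original environment is not available to the writers, which is the information loss discussed in this thread.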

ahmadia commented May 19, 2012

 @jgm (cc @ketch) - I noticed that the functionality David and I are really looking for is the ability to pass raw LaTeX through to HTML output formats, particularly when we are working with gitit and Markdown or LaTeX input formats. Would it make sense to allow this as an option in the gitit header (basically, asking pandoc to "pass through" anything that looks like raw TeX instead of trying to parse it)?

Owner

jgm commented May 19, 2012

 You could use a very simple gitit plugin based on the following function:

    -- Relabel every raw TeX element as raw HTML so that HTML writers
    -- emit it verbatim.
    convertRawTeX :: Pandoc -> Pandoc
    convertRawTeX = bottomUp convertRawTeXInline . bottomUp convertRawTeXBlock

    -- Inline case: raw TeX becomes raw HTML; everything else is unchanged.
    convertRawTeXInline :: Inline -> Inline
    convertRawTeXInline (RawInline "tex" x) = RawInline "html" x
    convertRawTeXInline x = x

    -- Block case: same relabeling for block-level raw TeX.
    convertRawTeXBlock :: Block -> Block
    convertRawTeXBlock (RawBlock "tex" x) = RawBlock "html" x
    convertRawTeXBlock x = x

Owner

jgm commented May 22, 2012

 Have you tried --parse-raw?