latex -> html creates invalid {aligned*} from eqnarray, align environments #423

Closed
ketch opened this Issue Feb 19, 2012 · 8 comments


ketch commented Feb 19, 2012

When converting latex to html, pandoc turns

\begin{align}
x=y
\end{align}

into

<p>
\[\begin{aligned*}
x=y
\end{aligned*}\]
</p>

But aligned* is not a valid environment. It does the same thing to eqnarray. It should use align* or aligned instead. Indeed, there seems to be no reason to change the environment. It should just produce

<p>
\[\begin{align}
x=y
\end{align}\]
</p>

ahmadia commented Feb 19, 2012

It's worth noting that this fixes the issue (so that, for example, MathJax can render it using its AMSMath library), but I'm not sure what it breaks:

diff --git a/src/Text/Pandoc/Readers/LaTeX.hs b/src/Text/Pandoc/Readers/LaTeX.hs
index 5e69347..f470910 100644
--- a/src/Text/Pandoc/Readers/LaTeX.hs
+++ b/src/Text/Pandoc/Readers/LaTeX.hs
@@ -682,16 +682,16 @@ environments = M.fromList
   , ("displaymath", mathEnv Nothing "displaymath")
   , ("equation", mathEnv Nothing "equation")
   , ("equation*", mathEnv Nothing "equation*")
-  , ("gather", mathEnv (Just "gathered") "gather")
-  , ("gather*", mathEnv (Just "gathered") "gather*")
-  , ("multiline", mathEnv (Just "gathered") "multiline")
-  , ("multiline*", mathEnv (Just "gathered") "multiline*")
-  , ("eqnarray", mathEnv (Just "aligned*") "eqnarray")
-  , ("eqnarray*", mathEnv (Just "aligned*") "eqnarray*")
-  , ("align", mathEnv (Just "aligned*") "align")
-  , ("align*", mathEnv (Just "aligned*") "align*")
-  , ("alignat", mathEnv (Just "aligned*") "alignat")
-  , ("alignat*", mathEnv (Just "aligned*") "alignat*")
+  , ("gather", mathEnv (Just "gather") "gather")
+  , ("gather*", mathEnv (Just "gather*") "gather*")
+  , ("multiline", mathEnv (Just "multiline") "multiline")
+  , ("multiline*", mathEnv (Just "multiline*") "multiline*")
+  , ("eqnarray", mathEnv (Just "eqnarray") "eqnarray")
+  , ("eqnarray*", mathEnv (Just "eqnarray*") "eqnarray*")
+  , ("align", mathEnv (Just "align") "align")

Owner

jgm commented Feb 20, 2012

Note that

\[\begin{align}
x=y
\end{align}\]

is not valid LaTeX (with the AMS extensions). It may work in MathJax, but it won't compile with latex. The reason is that the align environment is a top-level math environment, not something that can be included inside \[...\].

The aligned environment, by contrast, can be used inside \[...\]. Since all pandoc display math elements are rendered in LaTeX inside \[...\], we used aligned instead of align. Hope that makes sense.

You are right, though, that aligned* does not exist. That is a mistake I'll fix.
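
For readers following along, here is a minimal LaTeX sketch of the distinction described above. It assumes only the standard amsmath package; the `&` alignment markers are added for illustration and are not in the original report.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% aligned is an inner environment, so it may appear inside \[...\]
\[\begin{aligned}
x &= y
\end{aligned}\]

% align is a top-level environment: it starts display math itself
% and must not be wrapped in \[...\]
\begin{align}
x &= y
\end{align}

\end{document}
```

Wrapping the second block in \[...\] is the form that amsmath rejects, which is why the inner aligned environment is used instead.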

ketch commented Feb 20, 2012

Thanks for getting to this so quickly.

If one is converting to html for use with MathJax, then it is possible to
configure MathJax to render

\begin{align}
x=y
\end{align}

(which is of course valid LaTeX) without any need for \[ \]. It would be
nice if pandoc gave users the option to go this route. Note that the
various environments that are all being converted to aligned behave in
different ways, so something is lost in translation.


jgm closed this in 24e3a65 Feb 20, 2012

Owner

jgm commented Feb 20, 2012

@ketch: Yes, I'm aware that something is lost in translation here. It can't be helped, though, with the current pandoc architecture. We parse a display math environment into Math DisplayMath x, and render it in LaTeX as \[ x \], and in other formats in different ways...so we need to convert align into aligned so it can go inside \[...\].
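
A rough, self-contained sketch of that pipeline may help. It is a toy model, not pandoc's actual code: the `Inline` and `MathType` types are simplified stand-ins, and `innerEnv`, `readMathEnv`, and `writeLaTeX` are hypothetical names chosen for illustration. The environment table assumes the corrected mapping (aligned rather than aligned*).

```haskell
-- Toy stand-ins for pandoc's math node (assumption: simplified for illustration).
data MathType = DisplayMath | InlineMath deriving (Eq, Show)
data Inline = Math MathType String deriving (Eq, Show)

-- Hypothetical table: top-level AMS environments and the inner
-- environments that are allowed inside \[...\].
innerEnv :: [(String, String)]
innerEnv =
  [ ("align", "aligned"), ("align*", "aligned")
  , ("eqnarray", "aligned"), ("eqnarray*", "aligned")
  , ("gather", "gathered"), ("gather*", "gathered")
  ]

-- Reader side: store the environment as display math, renaming it to the
-- inner variant when one is listed; unknown names are kept unchanged.
readMathEnv :: String -> String -> Inline
readMathEnv env body = Math DisplayMath $
  "\\begin{" ++ name ++ "}\n" ++ body ++ "\n\\end{" ++ name ++ "}"
  where
    name = maybe env id (lookup env innerEnv)

-- Writer side: every display math node is wrapped in \[...\],
-- which is why the stored environment must be an inner one.
writeLaTeX :: Inline -> String
writeLaTeX (Math DisplayMath x) = "\\[" ++ x ++ "\\]"
writeLaTeX (Math InlineMath x)  = "$" ++ x ++ "$"
```

In GHCi, `putStrLn (writeLaTeX (readMathEnv "align" "x=y"))` shows the aligned environment wrapped in \[...\]. The original environment name is gone by the time the writer runs, which is the information loss discussed in this thread.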

It may be that more radical changes are needed to the way math is stored in the pandoc AST, but if you want to suggest that, it would be best to open a separate issue.

ahmadia commented May 19, 2012

@jgm (cc @ketch) - I noticed that the functionality David and I are really looking for is the ability to pass raw LaTeX through to HTML output formats, particularly when we are working with gitit and Markdown or LaTeX input formats. Would it make sense to allow this as an option in the gitit header (basically, asking pandoc to "pass through" anything that looks like raw TeX instead of trying to parse it)?

Owner

jgm commented May 19, 2012

You could use a very simple gitit plugin based on the following function:

-- Assumes bottomUp from Text.Pandoc.Generic (pandoc-types).
import Text.Pandoc.Definition
import Text.Pandoc.Generic (bottomUp)

-- Relabel every raw TeX element as raw HTML so it reaches the output unparsed.
convertRawTeX :: Pandoc -> Pandoc
convertRawTeX = bottomUp convertRawTeXInline . bottomUp convertRawTeXBlock

convertRawTeXInline :: Inline -> Inline
convertRawTeXInline (RawInline "tex" x) = RawInline "html" x
convertRawTeXInline x = x

convertRawTeXBlock :: Block -> Block
convertRawTeXBlock (RawBlock "tex" x) = RawBlock "html" x
convertRawTeXBlock x = x


Owner

jgm commented May 22, 2012

Have you tried --parse-raw?


ahmadia commented Jun 3, 2012

Sorry, we haven't tried it yet. We're moving to a production server within KAUST, and I don't get nearly enough time to play with the fun stuff :(
