Converting .tex to .docx causes equation numbers to be lost, and more #742

Closed
lanceboyle opened this issue Feb 8, 2013 · 10 comments

@lanceboyle

When converting from LaTeX to docx,
pandoc -o somefile.docx anotherfile.tex
pandoc does not properly convert numbered (display) equations. There are two variations:

(1) The LaTeX equation is numbered but has no label. (I'm not a LaTeX expert—I use LyX—but it seems that a label is required in order to create a cross reference to the equation.) In this case, the equation is rendered correctly in docx but the equation number is lost.

(2) The LaTeX equation is both numbered and has a label (and a cross reference elsewhere in the document). In this case, in the docx file, the equation is not rendered; instead, the raw LaTeX code appears in its place, followed by the label text, all enclosed in $...$. For example, with Einstein's famous equation with the label MassEnergy, this line appears in docx:
$e=mc^{2}\label{eq:MassEnergy}$
Also, the equation number is lost, and the cross reference is not "rendered" meaning that the raw text appears. For example, if the equation had a label MassEnergy, then the cross reference is shown in docx as the text, ([eq:MassEnergy]).

(1 and 2) In both of these cases, there is an extra blank line above and below each equation in docx.

Unnumbered display equations and inline equations get translated without problems, and there is no extra white space. All equations that are correctly rendered (i.e., not raw LaTeX) are editable in the docx document.

I suspect that other labels are also similarly problematic. For instance, if a label is attached to a section title, then in docx, where the section title is displayed, the label is also displayed alongside, as text.

I am using pandoc 1.9.4.2. (I know it's not the latest but I had trouble with the OS X .dmg installer last time and I'm not eager to update just yet.)

I am using:
OS X 10.7.5
docx is interpreted by Word for Mac 2011.

As far as I'm concerned pandoc is some sort of magic. I hope that these important problems in translating to Word format aren't too hard to fix.

I don't see a way to upload files here (except apparently an image file). If there is a way to do so, please let me know. I have a very short example in .tex and .docx, plus a .pdf rendered from each, that shows the problem.

Jerry

@jgm
Owner

jgm commented Feb 8, 2013

No version of pandoc handles equation numbering. (You can use
the pandoc-markdown example list syntax to get numbered equations
that you can refer back to; but pandoc doesn't yet handle numbering
when converting from LaTeX.)

As for the problem with \label, that should be fixed in recent
versions.

@lanceboyle
Author

OK. Thanks.
Jerry

@lanceboyle
Author

Thanks for the new version of pandoc, 1.10.1, which I have now installed. Numbered (labeled, I guess) equations are now being rendered in tex to docx conversions as you know. Thanks!

I'd like to report these minor bugs, as the converted docx files are rendered in Word for Mac 2011:

  • Display equations, with or without labels, are rendered in docx in their own paragraph. That is OK, though it differs from how Word does it, and it indicates that a line terminator is being inserted both before and after the equation code. However, the equations are in the "inline" style, so in one of my test equations, a summation, the limits are displaced to the lower right and upper right of the sigma rather than sitting fully below and above it.
  • An equation that had a label in tex has an extra blank paragraph above and below it in docx, creating extra white space. This indicates that two extra line terminators or paragraph marks are being inserted before and after the equation. The unlabeled tex equation has no extra blank paragraphs.

FWIW, my tex is exported from LyX and looks different for the two cases of the same equation, one with label and one without. I don't understand TeX deeply and so the problems I mentioned above could be with LyX. So here is the TeX that LyX is exporting, first for the...

Labeled equation:

\begin{equation}
X\left(k\right)=\sum_{n=0}^{N-1}x\left(n\right)\exp\left(\frac{-j2\pi nk}{N}\right)\label{eq:DFT}
\end{equation}

and the unlabeled equation:

\[
X\left(k\right)=\sum_{n=0}^{N-1}x\left(n\right)\exp\left(\frac{-j2\pi nk}{N}\right)
\]

Finally, as a suggestion, I wonder if it makes sense to you to display equation labels on the docx side as literal strings. For example, I gave the above equation the label "DFT" so that its full label is "eq:DFT". I also had a cross-reference to it, and in the docx file that cross-reference survived as the literal string "([eq:DFT])" (I added the parentheses myself, following normal equation-referencing style). So if that string were also displayed alongside the labeled equation, one could do a one-time global search on that string in Word and replace it with the desired equation number. Just a thought.
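The search-and-replace idea can be sketched on plain text. This is only an illustration under stated assumptions: `number_labels` and `replace_refs` are hypothetical helpers (not part of pandoc), every numbered equation is assumed to carry an `eq:` label so that label order matches equation order, and the text is assumed to have already been extracted from the docx file (reading or writing docx itself is not shown).

```python
import re

# Assumption: equation labels follow the LyX convention of an "eq:" prefix.
LABEL_RE = re.compile(r"\\label\{(eq:[^}]+)\}")
REF_RE = re.compile(r"\[(eq:[^\]]+)\]")


def number_labels(tex):
    """Map each eq: label to its 1-based order of first appearance in the LaTeX source."""
    numbers = {}
    for label in LABEL_RE.findall(tex):
        numbers.setdefault(label, len(numbers) + 1)
    return numbers


def replace_refs(text, numbers):
    """Replace literal [eq:...] cross-references with their equation numbers."""
    def swap(match):
        label = match.group(1)
        # Unknown labels are left untouched so they remain searchable by hand.
        return str(numbers[label]) if label in numbers else match.group(0)

    return REF_RE.sub(swap, text)
```

The numbering here is derived only from the order of labels, so it would drift from LaTeX's own numbering if some numbered equations have no label.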

Jerry

jgm pushed a commit that referenced this issue Feb 27, 2013
Display math inside a paragraph is now put in a separate
paragraph, so it will render properly (centered and without
extra blank lines around it).

Partially addresses #742.
@jgm
Owner

jgm commented Feb 27, 2013

I'm going to close this bug. I haven't acted on the "suggestion" (@lanceboyle). Given the way pandoc interacts with the texmath library, it's a bit hard to see how to do that. Some time in the future I'd like to support equation numbering, but it will require more extensive changes to both pandoc and texmath.

jgm closed this as completed Feb 27, 2013
@lanceboyle
Author

Thanks for looking into this, John.

Jerry

@pengyu

pengyu commented Jun 20, 2013

I just want to make sure: the current version does not support references to a numbered equation like ([eq:MassEnergy]), right?

pandoc 1.11.1
Compiled with citeproc-hs 0.3.8, texmath 0.6.1.3, highlighting-kate 0.5.3.8.
Syntax highlighting is supported for the following languages:
    actionscript, ada, apache, asn1, asp, awk, bash, bibtex, boo, c, changelog,
    clojure, cmake, coffee, coldfusion, commonlisp, cpp, cs, css, curry, d,
    diff, djangotemplate, doxygen, doxygenlua, dtd, eiffel, email, erlang,
    fortran, fsharp, gnuassembler, go, haskell, haxe, html, ini, java, javadoc,
    javascript, json, jsp, julia, latex, lex, literatecurry, literatehaskell,
    lua, makefile, mandoc, matlab, maxima, metafont, mips, modula2, modula3,
    monobasic, nasm, noweb, objectivec, objectivecpp, ocaml, octave, pascal,
    perl, php, pike, postscript, prolog, python, r, relaxngcompact, rhtml, ruby,
    rust, scala, scheme, sci, sed, sgml, sql, sqlmysql, sqlpostgresql, tcl,
    texinfo, verilog, vhdl, xml, xorg, xslt, xul, yacc, yaml
Default user data directory: /Users/py/.pandoc
Copyright (C) 2006-2013 John MacFarlane
Web:  http://johnmacfarlane.net/pandoc
This is free software; see the source for copying conditions.  There is no
warranty, not even for merchantability or fitness for a particular purpose.

@jgm
Owner

jgm commented Jun 20, 2013

+++ pengyu [Jun 19 13 21:51 ]:

I just want to make the current version does not support the reference
to a number equation like ([eq:MassEnergy]), right?

Right.

@pengyu

pengyu commented Jun 21, 2013


Thanks. I think that this feature is important, especially for people
who write a lot of math. When will this feature be added to pandoc?
Thanks.

Regards,
Peng

@arthur-e

arthur-e commented Mar 8, 2015

Hi, is this feature still not supported?

@arthur-e

arthur-e commented Mar 8, 2015

This feature can be simulated easily enough with the \tag{} command and \eqref{} will still work. Example:

\begin{equation}\label{myeqn}\tag{1}
R_{i} = \sum_{j=1}^m f_{ij} r_{ij} + \epsilon_i
\end{equation}
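
For documents with many equations, the manual \tag{} step could be automated before running pandoc. The following is a minimal sketch, not a pandoc feature: `add_tags` is a hypothetical helper, and it assumes that equations use the plain, non-starred `equation` environment, that the environments are not nested or commented out, and that the document loads amsmath (which provides \tag).

```python
import re

# Matches one non-starred equation environment; DOTALL lets the body span lines.
EQUATION_RE = re.compile(r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)


def add_tags(tex):
    """Prefix every untagged equation environment with an explicit tag, in document order."""
    counter = 0

    def tag_one(match):
        nonlocal counter
        body = match.group(1)
        if r"\tag{" in body:
            # Already tagged by hand: leave it alone and do not consume a number.
            return match.group(0)
        counter += 1
        return r"\begin{equation}\tag{%d}" % counter + body + r"\end{equation}"

    return EQUATION_RE.sub(tag_one, tex)


if __name__ == "__main__":
    import sys

    # Usage: python add_tags.py < input.tex > tagged.tex
    sys.stdout.write(add_tags(sys.stdin.read()))
```

The tagged output could then be passed to pandoc in place of the original .tex file. The sketch numbers equations sequentially across the whole document, so per-chapter or per-section numbering schemes would need a different counter.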
