Bug parsing \@ in LaTeX #9555

Closed
dan-reznik opened this issue Mar 10, 2024 · 5 comments

dan-reznik commented Mar 10, 2024

Running "pandoc main.tex -o main.docx" on the .tex file below produces
two apparent deviations from the PDF that LaTeX generates:

a) A footnote following a right-to-left string (e.g., Hebrew) appears
correctly on its right side in the PDF but incorrectly on its left
side in the .docx.
b) When an abbreviation such as "It." is followed by "\@ ", e.g.,
"It.\@ ", to avoid the long sentence-ending space after the period,
the PDF is correct, but in the .docx the space is omitted entirely.
As a workaround one can currently add a "~", as in "It.\@~next", but
that is not standard TeX practice.

Cheers

\documentclass[letterpaper]{article}
\title{Right-To-Left Footnote problem}
\author{}

\usepackage{fontspec}
\setmainfont{FreeSerif}
\setsansfont{FreeSans}
\setmonofont{FreeMono}
\usepackage{polyglossia}
\setdefaultlanguage[variant=american]{english}
\setotherlanguages{hebrew}
\newfontfamily\hebrewfont[Script=Hebrew]{Hadasim CLM}

\begin{document}
\maketitle

Two Pandoc bugs in going from \texttt{.tex} to \texttt{.docx}:

\begin{itemize}
\item A footnote on a right-to-left string, e.g.,
\texthebrew{טוֹב}\footnote{A footnote.} appears incorrectly on the
left side of the string.
\item After an abbreviation such as It.\@ hello you don't get a space
after the period. It works if I use a tilde after the abbreviated
word, e.g., It.\@~hello.
\end{itemize}

\end{document}

dan-reznik added the bug label Mar 10, 2024

Owner

jgm commented Mar 10, 2024

Could you split this into two separate bug reports, with descriptive titles for each? That would really help us to keep track of these issues.

Owner

jgm commented Mar 10, 2024

I'm not really sure we can do anything about \@. We need to convert the LaTeX into a Pandoc AST, and there is nothing in the AST that could really represent \@. (We could parse it as raw LaTeX, I suppose, but this would only be passed through to LaTeX output, and if you already have a LaTeX file then I don't see why you'd be using pandoc to convert it to a PDF.)

Owner

jgm commented Mar 10, 2024

Clarification: we seem to parse the whole control sequence \@ plus any following spaces as an empty string, and I guess that's what's problematic.

In LaTeX a\@b\@ c will look like ab c. In pandoc it will come out as abc.
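
The difference described above can be illustrated with a toy model. This is only a sketch, not pandoc's reader code and not the change made in the fix commit; the function names `strip_buggy` and `strip_tex_like` are hypothetical. It assumes the usual document-level setting in which `@` is not a letter, so `\@` is a control symbol and TeX does not skip the space that follows it (unlike after a control word such as `\LaTeX`).

```python
import re


def strip_buggy(src: str) -> str:
    """Hypothetical model of the reported behaviour: drop \\@ AND any spaces after it."""
    return re.sub(r"\\@ *", "", src)


def strip_tex_like(src: str) -> str:
    """Hypothetical model of what LaTeX shows: drop \\@ only, keep the following space."""
    return src.replace("\\@", "")


if __name__ == "__main__":
    for sample in (r"a\@b\@ c", r"It.\@ hello"):
        # Print the input next to both renderings for comparison.
        print(repr(sample), repr(strip_buggy(sample)), repr(strip_tex_like(sample)))
```

On the example from this comment, the first function yields `abc` and the second yields `ab c`, matching the pandoc and LaTeX results quoted above.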

jgm changed the title from "two pandoc bugsm pandoc v.3.1.12.2" to "Bug parsing \@ in LaTeX" Mar 10, 2024

Owner

jgm commented Mar 10, 2024

I've rebranded this issue around the second issue (should have a fix soon). Please submit a new issue about the footnote.

jgm closed this as completed in 812f82a Mar 10, 2024

Author

> I've rebranded this issue around the second issue (should have a fix soon). Please submit a new issue about the footnote.

I just resubmitted the other bug as a separate issue, thanks

jgm added a commit that referenced this issue Mar 10, 2024