Can pandoc deal with \ifdefined or skip certain parts with special LaTeX macros? #6096

Closed
physikerwelt opened this issue Jan 30, 2020 · 23 comments

Comments

@physikerwelt

Pandoc has difficulty parsing certain commands, for example \begin{CCSXML} from the ACM template. In general, I would like to be able to exclude certain parts of the document from conversion. To that end I was using \ifdefined\somecommand \standardTex \else \pandocTex \fi. Is there a built-in feature for this that I could not find in the documentation, or is it a bug that pandoc tries to interpret content inside the non-applicable branch of an \if statement?

Might be related to #3689

@jgm
Owner

jgm commented Jan 30, 2020

Pandoc's LaTeX reader currently doesn't support \ifdefined -- and I don't see how it could. Pandoc knows which commands it knows how to parse, but it has no way of knowing which commands are defined (since they may be defined in external packages).

One option is to simply include some code that redefines the environment in question to be a no-op. E.g. create a shim.tex with

\newenvironment{CCSXML}{}{}

and then run

pandoc shim.tex your-real-file.tex

Pandoc will parse the newenvironment and apply it when it comes to CCSXML, skipping the whole thing.

@nathanieltagg

Pandoc also doesn't seem to be able to deal with

\newif\ifsolutions
\solutionsfalse

\ifsolutions
Solutions!
\else
Not solutions!
\fi

What I get as output is

Solutions! Not solutions!

It seems that the only way to cope with this would be to define two environments (one for true, one for false) and shim each?

@nathanieltagg

The other issue is that it's very hard to write a 'no-op' environment. I want to print nothing in one case.

One way of doing that is the comment package, but that is apparently not supported either.

@jgm
Owner

jgm commented Aug 21, 2020

@nathanieltagg We ought to add support for \newif; I don't think that would be too hard, but please submit a new issue requesting this.

\begin{comment}..\end{comment} is supported as a no-op. Is there something else you have in mind?

@nathanieltagg

The comment package includes a command

   \excludecomment{soln}

which treats that environment as a comment. Supporting that switch would also have done the job, although that's rather specific. When I first posted this, I assumed that pandoc was a cut-down LaTeX compiler, not a ground-up parser.
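
For readers who have not used the comment package, here is a minimal sketch of the switch described above as it behaves under a regular LaTeX run. The soln environment name comes from the example above; the rest of the document is illustrative only, and nothing here implies that pandoc handles \excludecomment.

```latex
\documentclass{article}
\usepackage{comment}

% Hide every soln environment. Replacing this line with
% \includecomment{soln} would typeset them instead.
\excludecomment{soln}

\begin{document}
Question 1: What is $2 + 2$?

% The \begin and \end lines of a comment-package environment
% must each stand on a line of their own.
\begin{soln}
The answer is 4.
\end{soln}
\end{document}
```

Switching between the two commands toggles the solutions without touching the body of the document.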

@Fifis

Fifis commented Jun 9, 2021

> Pandoc's LaTeX reader currently doesn't support \ifdefined -- and I don't see how it could. Pandoc knows which commands it knows how to parse, but it has no way of knowing which commands are defined (since they may be defined in external packages).
>
> One option is to simply include some code that redefines the environment in question to be a no-op. E.g. create a shim.tex with
>
> \newenvironment{CCSXML}{}{}
>
> and then run
>
> pandoc shim.tex your-real-file.tex
>
> Pandoc will parse the newenvironment and apply it when it comes to CCSXML, skipping the whole thing.

That does not eliminate the contents of the environment. For example, if I want to ignore all tabulars (before exporting to Grammarly via TeX → ODF), redefining the {tabular} environment in this manner keeps all of its contents. It would be awfully nice if one could specify ‘ignore tabular’, because the surrounding floating table environment still contains text (legends) that editors and proofreaders need to go through.

@jgm
Owner

jgm commented Jun 9, 2021

Note that \newif is now supported. (But not \ifdefined -- we can't support that, really.)
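
A minimal sketch of how the \ifdefined pattern from the original question might be rewritten with \newif. The flag name \ifforpandoc and both branch texts are hypothetical, and the sketch assumes that pandoc's \newif support drops the untaken branch; it has not been checked against any particular pandoc version.

```latex
% Hypothetical flag; \newif initialises it to false.
\newif\ifforpandoc

% Leave this line as it is for a normal LaTeX build and
% change it to \forpandoctrue before converting with pandoc.
\forpandocfalse

\ifforpandoc
Simplified text intended for the pandoc conversion.
\else
Material that only a full LaTeX run can handle.
\fi
```

Unlike \ifdefined, the flag is set explicitly inside the document, so the outcome does not depend on which commands external packages happen to define.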

@annlia

annlia commented Jan 4, 2022

I am experiencing similar problems with \newif in R Markdown: the following fails to compile to PDF with a pandoc error. It used to work before the latest update. Any ideas?

``` {r, include = F}
solution <- TRUE 
```
\newif\ifsol
\sol`r ifelse(solution, 'true', 'false')`

\ifsol
Text to conditionally skip
\fi

@jgm
Owner

jgm commented Jan 4, 2022

@annlia - important to know what pandoc version you're using, what output you get, and what you expect.
\newif works.
But I don't understand the syntax after \sol in your example.

@annlia

annlia commented Jan 4, 2022

Here is the full version of the .Rmd file I am using.

With pandoc 2.13 it compiles fine; with later versions it doesn't. The correct output is attached.

---
title: "R Markdown to pdf"
output:
  pdf_document:
    keep_tex: yes
---

``` {r, include = F}
solution <- TRUE
```

\newif\ifsol
\sol`r ifelse(solution, 'true', 'false')`

\ifsol

Text to conditionally skip

\fi

minTest.pdf

I just added further details, including a workaround that uses a previous version of pandoc, here:

https://community.rstudio.com/t/r-markdown-with-newif-fails-to-compile-to-pdf-after-update/125603/2

@jgm
Owner

jgm commented Jan 4, 2022

This contains things that are meant to be processed by RMarkdown, so it won't work with pandoc by itself.
But pandoc works fine on this:

% pandoc -f latex -t native
\newif\ifsol
\soltrue 

\ifsol

Text to conditionally skip

\fi
^D
[ Para
    [ Str "Text"
    , Space
    , Str "to"
    , Space
    , Str "conditionally"
    , Space
    , Str "skip"
    ]
]

I suspect that your issue is with something in RMarkdown, so maybe you should check with someone who is knowledgeable about that.

@annlia

annlia commented Jan 5, 2022

That is correct, it is an Rmarkdown file, and indeed I am processing it through rmarkdown (in RStudio) and not with pandoc by itself. In fact, I do not know where the communication between the two fails, but by (only) replacing the current version of pandoc with version 2.13 the file compiles correctly. Specifically, within RMarkdown it is sufficient to use

rmarkdown::find_pandoc(version='2.13')

So perhaps something has changed in the way they interact in the latest versions...

@jgm
Owner

jgm commented Jan 5, 2022

The example above shows that the \if works properly with pandoc.

I'm not sure why this would ever have worked, but it's probably due to a bug in a previous version!

The \if macros are resolved at parse time, and at parse time we don't have \iftrue or \iffalse. Substituting the inline code marked r with something else in the AST (which is what RMarkdown is doing, I assume) won't affect parsing and hence won't affect the resolution of the \if. But an RMarkdown developer would be a better person to discuss this with.

@annlia

annlia commented Jan 5, 2022

If that is the case, I am grateful for the bug, since it is the only thing that allows me to compile my files... As you can see from above I tried to post on other forums where hopefully RMarkdown experts could chime in, but unfortunately, I did not receive a response.

Thanks again for taking an interest in this issue.

@tarleb
Collaborator

tarleb commented Jan 5, 2022

I can reproduce the issue when converting the RMarkdown-generated intermediary Markdown:

\newif\ifsol
\soltrue

\ifsol

Text to conditionally skip

\fi

Just for completeness, here's the output from the current version and pandoc 2.13 for pandoc -f markdown -t native < test.md.

Current version
  [ RawBlock
      (Format "tex") "\\newif\\ifsol\n\\def\\iffalse\\iftrue"
  , RawBlock (Format "tex") "\\iftrue"
  , Para
      [ Str "Text"
      , Space
      , Str "to"
      , Space
      , Str "conditionally"
      , Space
      , Str "skip"
      ]
  , RawBlock (Format "tex") "\\fi"
  ]
pandoc 2.13
[ RawBlock (Format "tex") "\\newif\\ifsol\n\\soltrue"
, RawBlock (Format "tex") "\\ifsol"
, Para
    [ Str "Text"
    , Space
    , Str "to"
    , Space
    , Str "conditionally"
    , Space
    , Str "skip"
    ]
, RawBlock (Format "tex") "\\fi"
]

@tarleb
Collaborator

tarleb commented Jan 5, 2022

@annlia you can work around this by using raw blocks:

```{=latex}
\newif\ifsol
\soltrue

\ifsol
```

Text to conditionally skip

```{=latex}
\fi
```

I haven't tried it, but I believe this should also work when including inline `r ...` snippets.

@jgm
Owner

jgm commented Jan 5, 2022

I was testing with -f latex, which gives expected results.
If rmarkdown is generating an intermediary markdown file, then what I said above may not apply. (I thought it was using filters.)

rmarkdown should certainly use the raw attribute syntax when doing this, to avoid funny results.

But, why do we get bad results without it? Do you understand that, @tarleb? The native output is almost the same.

@technocrat

technocrat commented Jan 6, 2022

This is a FWIW comment on the underlying logic of rmarkdown code chunks and inline code in relation to tex macros.

@annlia was trying to preserve an existing workflow when I originally saw this issue on community.rstudio.com. As I could find no way to do that under the current pandoc version, I worked through several ways that it might be done alternatively in an rmarkdown file. So, this is for those more interested in the general question of conditionally sending values of R objects to stdout via some tex macro.

There are two ways of returning objects to stdout: in a chunk and inline. In the motivating example, a logical object "solution" was defined as TRUE in a chunk and was then to be used as an argument:

\sol`r ifelse(solution, 'true', 'false')`

The inline R code tests whether the value of solution is TRUE. In my view, explicitly creating and calling a macro is preferable to \sol:

\def\VAR{`r solution`}
\VAR

An error arises, however, if the R object is not currently in the environment. To get around that,

\def\VAR{`r if(!is.na(mget("solution", ifnotfound = NA)[[1]])) solution`}
\VAR

will fail silently, sending nothing to stdout or stderr. That seems unhelpful, though, considering that one does not generally want to be referring to non-existing objects. To do so should be an error and should fail noisily.

The exercise leads me to wonder if macros are fitted for use as variables, rather than constants. The purpose of rendering a document that contains executed code is so that the text in stdout faithfully reflects the return value of the object in question within the R environment. Using a defined macro, however, raises the problem of the person with two watches never being sure which is correct. Unless the macro is redefined each time the R object is evaluated (and doing so is managed somehow) errors become likely.

@tarleb
Collaborator

tarleb commented Jan 6, 2022

> (I thought it was using filters.)

RMarkdown does an R preprocessing step and then passes the resulting Markdown to pandoc. However, there is a new project, @quarto-dev, which does use filters.

I don't fully understand what's going on, but it appears that something weird happens during macro expansion: somehow \soltrue is expanded to \def\iffalse\iftrue, which throws LaTeX completely off-balance.

Another alternative to the raw attribute syntax mentioned above is hence to disable the latex_macros extension (-latex_macros). I believe this can be done in RMarkdown by setting the md_extensions property for an output format, but I haven't tried.

@jgm
Owner

jgm commented Jan 21, 2022

First note: this bug only occurs with -f markdown, not -f latex.

Second note: from the source of T.P.Readers.LaTeX.Macros, we see

-- \newif\iffoo defines:
-- \iffoo to be \iffalse
-- \footrue to be a command that defines \iffoo to be \iftrue
-- \foofalse to be a command that defines \iffoo to be \iffalse

So this is probably why we are getting "\\newif\\ifsol\n\\def\\iffalse\\iftrue".
First, pandoc is treating \newif\ifsol as

\def\ifsol\iffalse
\def\soltrue{\def\ifsol\iftrue}
\def\solfalse{\def\ifsol\iffalse}

Then when it gets to \soltrue, it is expanding this as \def\ifsol\iftrue, but also expanding the \ifsol in this to \iffalse, so we get \def\iffalse\iftrue (yeah, that's gonna cause problems!).

This extra expansion doesn't occur with -f latex, which may mean that this problem concerns rawLaTeXParser.
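
For comparison, the definitions that \newif creates under a real LaTeX engine can be inspected with \meaning. This is a minimal sketch for a regular LaTeX run, not a description of pandoc's internals; the exact text printed depends on the format, so no expected output is given.

```latex
\documentclass{article}
\begin{document}
\newif\ifsol

% \meaning prints a definition as text without executing it.
\texttt{\meaning\ifsol}\par    % the flag right after \newif
\texttt{\meaning\soltrue}\par  % the setter macro itself

\soltrue
\texttt{\meaning\ifsol}\par    % the flag after the setter has run
\end{document}
```

Comparing the first and last lines of the output shows what a single, correct application of \soltrue changes.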

@jgm
Owner

jgm commented Jan 21, 2022

rawLaTeXParser involves a retokenization phase which applies macros; this is probably where the mischief is happening.

@jgm
Owner

jgm commented Jan 21, 2022

I tried removing the retokenize step altogether in rawLaTeXParser, and all the tests still passed. This case then parses as

[ RawBlock
    (Format "tex") "\\newif\\ifsol\n\\def\\ifsol\\iftrue"
, RawBlock (Format "tex") "\\iftrue"
, Para
    [ Str "Text"
    , Space
    , Str "to"
    , Space
    , Str "conditionally"
    , Space
    , Str "skip"
    ]
, RawBlock (Format "tex") "\\fi"
]

which looks good.
Maybe we can get rid of that? Or is there some important intended behavior we're not testing?

jgm added a commit that referenced this issue Jan 21, 2022
This was causing serious problems with `newif` commands.
See #6096.  And it didn't seem to make any difference for
the tests; I assume that, unless there's some untested
behavior, this is something that has now become unnecessary.

@jgm
Owner

jgm commented Jan 21, 2022

While this is better, it still causes a LaTeX error because of the

\def\ifsol\iftrue

Ideally we'd just parse the raw tex as

\newif\ifsol
\ifsol

without resolving \ifsol to \def\ifsol\iftrue.
Either that, or omit the \newif\ifsol. But both of them together cause problems.

@jgm jgm closed this as completed in 67f2b25 Jan 23, 2022