This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

Can pandoc-citeproc take multiple input .bib bibliography files? #220

Closed
friendly opened this issue Mar 24, 2016 · 5 comments

Comments

@friendly

I recently posted a question to SO, http://stackoverflow.com/questions/36202023/rmarkdown-how-to-use-multiple-bibliographies-for-a-document, regarding how to get pandoc-citeproc to accept a specification for two or more input BibTeX files.

My workflow involves RStudio, rmarkdown and knitr, which eventually call pandoc. I get an error when I try to include more than one .bib file in the YAML header:

  ---
  title: "Notes on Testing Equality of Covariance Matrices"
  author: "Michael Friendly"
  date: '`r format(Sys.time(), "%B %d, %Y")`'
  output:
    pdf_document:
      fig_caption: yes
      keep_tex: yes
      number_sections: yes
  csl: apa.csl
  bibliography: [statistics.bib, graphics.bib]
  ---

It looks like both of these are passed to pandoc, so I wonder if this is a citeproc problem or one that stems from rmarkdown/RStudio. The pandoc call and the resulting error:

"C:/Program Files/RStudio/bin/pandoc/pandoc" +RTS -K512m -RTS EqCov.utf8.md --to latex --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash --output EqCov.pdf --template "C:\R\R-3.2.1\library\rmarkdown\rmd\latex\default-1.15.2.tex" --number-sections --highlight-style tango --latex-engine pdflatex --variable graphics=yes --variable "geometry:margin=1in" --bibliography statistics.bib --bibliography graphics.bib --filter pandoc-citeproc 
output file: EqCov.knit.md

pandoc-citeproc.exe: "stdin" (line 50, column 12):
unexpected ":"
expecting letter, digit, white space or "="
pandoc.exe: Error running filter pandoc-citeproc
Filter returned error status 1
Error: pandoc document conversion failed with error 83
@jgm
Owner

jgm commented Mar 24, 2016

It should be possible to include multiple bib files in the
way you're doing. To see if it's a pandoc problem, try
running pandoc on your input file.

pandoc -s --filter pandoc-citeproc input.md

I'm not sure why knitr (or rmarkdown) is adding the

--bibliography graphics.bib --bibliography statistics.bib

to the pandoc command line. It shouldn't be necessary,
since these are already in the YAML metadata. But this
shouldn't hurt.

My guess is that the error message is from parsing the
bibtex file. Are you sure these are valid? (Do you perhaps
have a colon where an '=' is needed?)


@friendly
Author

It turns out, AFAICS, that this is a pandoc-citeproc parsing problem. I tracked this down to my graphics.bib file, where I have a set of @string{}s that include a : in the key:

@STRING{pub-sas:adr = {Cary, NC}}
@STRING{pub-sv = {Springer-Verlag}}
@STRING{pub-sv:adr = {New York, NY}}
@STRING{pub-wiley = {John Wiley and Sons}}
@STRING{pub-wiley:adr = {New York, NY}}

This is legal in BibTeX, but it causes pandoc-citeproc to choke at the first such key. The problem only became apparent when I limited the input to the one file, graphics.bib, and could then look up the (line XX, column YY) position from the error message, which is cryptic because it refers only to "stdin".

It would be nice to either document this limitation, or provide a bit more context in the error message
to allow this to be tracked down.
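
For anyone hitting the same error, here is a minimal Python sketch for locating such keys in a .bib file. It is not part of pandoc-citeproc. The function name `find_suspect_string_keys` is made up for illustration, and the set of "safe" key characters (ASCII letters, digits, `-` and `_`) is an assumption, not the parser's actual rule.

```python
import re

# Matches the start of an @string definition and captures the macro name,
# i.e. the text between the opening brace/paren and the "=" sign.
STRING_DEF = re.compile(r"@string\s*[{(]\s*([^=\s]+)\s*=", re.IGNORECASE)

# Assumption: keys made only of these characters are unproblematic.
SAFE_KEY = re.compile(r"[A-Za-z0-9_-]+")


def find_suspect_string_keys(bib_text):
    """Return (line_number, key) pairs for @string keys that contain
    characters outside the assumed safe set (for example ':')."""
    suspects = []
    for lineno, line in enumerate(bib_text.splitlines(), start=1):
        for match in STRING_DEF.finditer(line):
            key = match.group(1)
            if not SAFE_KEY.fullmatch(key):
                suspects.append((lineno, key))
    return suspects


sample = """@STRING{pub-sas:adr = {Cary, NC}}
@STRING{pub-sv = {Springer-Verlag}}
@STRING{pub-sv:adr = {New York, NY}}
"""

for lineno, key in find_suspect_string_keys(sample):
    print(f"line {lineno}: {key}")
```

To check a real file, read it as text (for example `open("graphics.bib", encoding="utf-8").read()`) and pass the result to the function. The reported line numbers are relative to that file, not to the "stdin" position in the error message.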

@jgm jgm closed this as completed in a1cc419 Mar 24, 2016
@jgm
Owner

jgm commented Mar 24, 2016


I'd rather just fix it. Thanks!

@friendly
Author

FYI: I also discovered that pandoc-citeproc chokes on all non-ASCII characters in my graphics.bib file, which is an encoding issue. I located them in R as follows, and then fixed them by hand:

> file <- "graphics.bib"
> tools::showNonASCIIfile(file)
5227:   title = {Aux sources de la s<e9>miologie graphique},
5233:   organization = {Comit{\'e} fran<e7>ais de Cartographie},
7009:   copies = {1 brochure in-8<f8>, 4 ex.},
9398:   author = {Jos<e9> C. Pinheiro and Douglas M. Bates},
9494:                As has often been recognized, several characteristics of APL make it especially well suited for programming statistical applications: APL treats arrays<97>vectors, matrices, and higher dimensional arrays<97>as data structures that can be processed without reference to their elements; 

bibtex never had problems with these, and of course, pandoc-citeproc tries to do much more.
Would it be possible to be more forgiving (warnings rather than errors) on such problems?

@jgm
Owner

jgm commented Mar 24, 2016

As is documented, pandoc and pandoc-citeproc expect input to
be UTF-8 encoded. I suspect yours was not. Converting to
UTF-8 using iconv should do the trick.
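
If iconv is not at hand, the same conversion can be sketched in Python. This assumes the file was saved as Windows-1252 (`cp1252`), which is only a guess based on bytes such as `<e9>` and `<97>` in the listing above; the thread does not state the real source encoding, and a file with mixed encodings would still need manual review. The function names are hypothetical.

```python
def reencode_to_utf8(raw_bytes, source_encoding="cp1252"):
    """Decode bytes from the assumed source encoding and return UTF-8 bytes.

    Raises UnicodeDecodeError if a byte is not defined in source_encoding,
    which is a hint that the guess about the encoding is wrong.
    """
    text = raw_bytes.decode(source_encoding)
    return text.encode("utf-8")


def convert_file(src_path, dst_path, source_encoding="cp1252"):
    """Write a UTF-8 copy of src_path to dst_path, leaving the original intact."""
    with open(src_path, "rb") as src:
        converted = reencode_to_utf8(src.read(), source_encoding)
    with open(dst_path, "wb") as dst:
        dst.write(converted)


# One of the lines flagged by tools::showNonASCIIfile, as raw legacy bytes.
legacy = b"author = {Jos\xe9 C. Pinheiro and Douglas M. Bates},"
print(reencode_to_utf8(legacy).decode("utf-8"))
```

Writing to a separate output file (for example `convert_file("graphics.bib", "graphics-utf8.bib")`) keeps the original available in case the assumed encoding turns out to be wrong.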

