Rmd build error when some citation keys are missing from Zotero #15

Closed
florisvdh opened this issue Sep 4, 2020 · 7 comments

@florisvdh
Contributor

florisvdh commented Sep 4, 2020

When providing an existing bibliography file in the YAML header of an Rmd file, e.g. bibliography: references.yaml, while the Rmd file contains citation keys that are missing from that file, pandoc-citeproc does not error but reports the missing keys (see the extract from the build message below). In the rendered (html) result, the missing citation just shows as ???.

pandoc-citeproc: reference incollection-a3e3 not found

Output created: output/index.html

However, when doing the same with rbbt (i.e. using instead the YAML header key shown in the documentation of bbt_write_bib(), see below), the build errors. In this case the missing citation above, incollection-a3e3, can be found in Zotero, but some others are missing from Zotero (they are only in the references.yaml mentioned above):

Quitting from lines 2-39 (mijn-rapport.Rmd) 
Error: franklin_mapping_2009, Kish_1965 not found
Execution halted

Exited with status 1.

A further side effect is that the rbbt-generated file is emptied at this stage, whereas before it still had the incollection-a3e3 record (from an earlier experiment), because that record is present in Zotero.

Note that the following YAML header key was used in this:

bibliography: "`r rbbt::bbt_write_bib('rbbt-bibliography.json', overwrite = TRUE)`"

It would be nice if citations that are missing from Zotero could just be skipped, the ones that are found could be written to the output file, and then let pandoc-citeproc handle the messaging about missing citations.

This was encountered while I was searching for ways to combine multiple bibliography files, which pandoc supports. For this example case however, combining the sources using the below key in the YAML header did not work because of the above error, even though the missing Zotero references are in references.yaml.

bibliography: ["references.yaml", "`r rbbt::bbt_write_bib('rbbt-bibliography.json', overwrite = TRUE)`"]
@paleolimbot
Owner

I find this annoying as well, but this is a limitation of the API endpoint ( https://retorque.re/zotero-better-bibtex/exporting/json-rpc/ ).

I do this on an ad-hoc basis to find these references...it could be modified to

# Try each detected citation key individually; try() keeps the loop going on errors.
purrr::walk(
  rbbt::bbt_detect_citations("~/Desktop/thesis/01-chapter-introduction.Rmd"),
  ~try(rbbt::bbt_bib(.x, .action = rbbt::bbt_return))
)

Arguably, bbt_write_bib() could fail more gracefully in these circumstances, but it's hard to know what it should do (should it write a bib file at all?). More usefully, I could provide bbt_find_missing() that would loop through citations individually. This takes a while but could be called interactively.
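
A minimal sketch of what such a helper might look like, assuming that rbbt::bbt_bib() raises an R error when a key is not found in Zotero (as the build log earlier in this thread suggests). The name find_missing_keys() and its fetch_one argument are hypothetical and not part of rbbt; fetch_one is injectable so the loop can be tried without a running Zotero.

```r
# Hypothetical helper, not part of rbbt: return the keys that cannot be fetched.
# `fetch_one` is assumed to raise an error for a key that is missing from Zotero.
find_missing_keys <- function(
  keys,
  fetch_one = function(key) rbbt::bbt_bib(key, .action = rbbt::bbt_return)
) {
  is_missing <- vapply(
    keys,
    function(key) {
      # One request per key: slow, but it isolates each failure.
      result <- try(fetch_one(key), silent = TRUE)
      inherits(result, "try-error")
    },
    logical(1),
    USE.NAMES = FALSE
  )
  keys[is_missing]
}
```

Called interactively, something like find_missing_keys(rbbt::bbt_detect_citations("mijn-rapport.Rmd")) would then list the keys that could not be resolved.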

@florisvdh
Contributor Author

@paleolimbot thanks for following things up, sorry to have postponed my response.

> it's hard to know what it should do (should it write a bib file at all?)

I think this depends on the philosophy, which of course I leave up to you. Personally, I think it would be good to have a default behaviour that suits most use cases. IMHO, that would be the approach taken by pandoc; I'd regard consistency with pandoc's take on this as a plus. I.e., generate warnings and some replacement like '???', and handle existing citations as it would do otherwise. If bbt_write_bib() ignored missing citations (or gave warnings), then that is what would happen. If you would also like to offer other behaviour, maybe that is doable via an argument (e.g. strict = TRUE, requiring everything to be present)?
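
To illustrate the idea of a strict switch, here is a rough sketch under stated assumptions. The function resolve_keys() and its arguments are hypothetical, not rbbt API, and the set of missing keys is assumed to have been determined beforehand by some other means.

```r
# Hypothetical helper, not part of rbbt: decide what to do with missing keys.
# `missing_keys` is assumed to have been determined beforehand.
resolve_keys <- function(keys, missing_keys, strict = FALSE) {
  not_found <- intersect(keys, missing_keys)
  if (length(not_found) > 0) {
    msg <- paste0(paste(not_found, collapse = ", "), " not found in Zotero")
    if (strict) {
      # Strict mode: everything must be present, so stop the build.
      stop(msg, call. = FALSE)
    }
    # Lenient mode: warn and carry on with the keys that were found.
    warning(msg, call. = FALSE)
  }
  setdiff(keys, not_found)
}
```

In lenient mode the returned keys could be handed on to whatever writes the bibliography file, leaving pandoc to report the unresolved citations.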

Are such things now made easier by the more specific error message of the json-rpc API, discussed in #11 ?

@retorquere any further opinion / ideas?

@retorquere

What we're discussing here is a single issue from the perspective of BBT. I'd prefer it if we could discuss it in a single place.

@paleolimbot
Owner

I don't think this is a bbt issue... It provides an excellent API already, especially with the latest addition.

Why don't you just surround the bbt_write_bib() call with try()?

@florisvdh
Contributor Author

@paleolimbot From your example with purrr::walk() I can see what you mean (that is for bbt_bib()). However, if bbt_write_bib() is used (e.g. in the Rmd YAML header) and should actually write the non-duplicated items to the bibliography file, the approach would need to be implemented in the bbt_write_bib() source code (e.g. with lapply()), i.e. in:

readr::write_file(
  bbt_bib(setdiff(keys, ignore), translator, library_id = library_id, .action = bbt_return),
  path
)

However, this would cause as many calls to the API (by bbt_bib()) as there are citekeys, which may be less efficient.

When one does use the YAML header key below (try() surrounding bbt_write_bib()), essentially the same happens as I described before, i.e. the previous contents of the JSON file are wiped before the error turns up, and the error is printed. The only difference is that the document will compile, with pandoc complaining that it could not find any citekey in the JSON file (as it is wiped).

bibliography: "`r try(rbbt::bbt_write_bib('rbbt-bibliography.json', overwrite = TRUE))`"
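
One way to avoid wiping the existing file might be to write to a temporary file first and only replace the target when writing succeeds. The sketch below assumes this is acceptable behaviour; write_bib_safely() and its writer argument are hypothetical and not part of rbbt.

```r
# Hypothetical wrapper, not part of rbbt: only replace `path` when writing succeeds.
# `writer` is any function that writes a bibliography to the file path it is given.
write_bib_safely <- function(path, writer) {
  # Keep the original extension, in case the writer inspects it.
  ext <- tools::file_ext(path)
  tmp <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)

  ok <- !inherits(try(writer(tmp), silent = TRUE), "try-error")
  if (ok && file.exists(tmp)) {
    file.copy(tmp, path, overwrite = TRUE)
  } else {
    warning("Bibliography not updated; keeping existing ", path, call. = FALSE)
  }
  # Return the path either way so it can be used in the YAML header.
  path
}
```

In the YAML header this could presumably be used as write_bib_safely('rbbt-bibliography.json', function(p) rbbt::bbt_write_bib(p, overwrite = TRUE)). Note that if the target file does not exist yet and writing fails, the returned path still points to a missing file.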

IMO it would be better if the bibliography file could be populated with non-duplicated keys. I agree that your suggestion to provide bbt_find_missing() could help here. It might extract the problematic citekeys from BBT's error message (1st call); these could then be used to filter the result of bbt_detect_citations(), and the filtered result could be fed to the keys argument of rbbt::bbt_write_bib() (which makes a 2nd API call). Am I right?
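
The two-call idea could be sketched roughly as follows, assuming the error message keeps the form seen in the build log ("key1, key2 not found"). parse_missing_keys() is a hypothetical helper, not part of rbbt, and the message format is an assumption that may not hold for every BBT version.

```r
# Hypothetical helper, not part of rbbt: extract citekeys from a message of the
# assumed form "key1, key2 not found".
parse_missing_keys <- function(msg) {
  msg <- trimws(msg)
  if (!grepl(" not found$", msg)) {
    # Unrecognised message: report no keys rather than guess.
    return(character(0))
  }
  keys <- sub(" not found$", "", msg)
  trimws(strsplit(keys, ",", fixed = TRUE)[[1]])
}

# Sketch of the two-call flow (requires Zotero with Better BibTeX running):
# keys  <- rbbt::bbt_detect_citations("mijn-rapport.Rmd")
# first <- try(rbbt::bbt_bib(keys, .action = rbbt::bbt_return), silent = TRUE)
# if (inherits(first, "try-error")) {
#   missing <- parse_missing_keys(conditionMessage(attr(first, "condition")))
#   rbbt::bbt_write_bib("rbbt-bibliography.json",
#                       keys = setdiff(keys, missing), overwrite = TRUE)
# }
```

This keeps the number of API calls at two regardless of how many citekeys there are, at the cost of depending on the wording of the error message.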

Then, the question is whether this step best sits in rbbt or in BBT. If you find this behaviour is something for the user to decide on, then (s)he could do that manually by using a bbt_find_missing() function.

@paleolimbot
Owner

I hear you, but with the new error I think it's best to leave it at that: users will have all the information they need to sort out missing references. Perhaps this can be revisited after RStudio's new citation connections are released, such that rbbt can fit with that!

@florisvdh
Contributor Author

Thanks for the feedback @paleolimbot. I'm aware of the RStudio VME, see also this tentative comparative table.

I made a request at RStudio to port the insert-citation tool to the source mode (issue 7876 at https://github.com/rstudio/rstudio/issues). Currently there are still some open issues with the way the Zotero database is queried. Actually in issue 7876 I've suggested to also have a look at the rbbt approach.


3 participants