Skip to content

commonman/zotero-bibtex-sb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

An Alternative to Zotero’s BibTeX Export

What is this?

An attempt to improve on Zotero’s standard BibTeX export, at least for my own purposes.

This repository includes a few modified BibTeX export translators which you can drop into your zotero/translators folder and use to export BibTeX. This is easy enough to do, and keeps Zotero’s existing BibTeX intact.

Quickstart

Download one of these and throw it into your zotero/translators directory. Restart firefox. When you got to export, you’ll see it offered as a possibility.

  • BibTeX-LowFat.js (omits several fields (URL, ISBN, etc) which you might not want your over-eager BibTeX style to put in your citations/bibliography)
  • BibTeX-NonFat.js (omits all but the essential fields, no notes, no keywords, no abstract, nix)

See below for more detail.

Why do this?

I use Zotero as my bibliographic database and BibTeX (actually BibLaTeX) to process my references. I work on my database as I go, adding new items and editing old ones in the process of writing, and I want get new database items and edits into my LaTeX documents in as few steps as possible. Also, I use a BibLaTeX citation style which helpfully includes optional fields based on whether or not they are in my .bib database. Zotero’s BibTeX-export outputs some fields that I don’t want in my citations or bibliography (URL, ISBN, ISSN, DOI, ‘month’ for academic journals, and —at the moment— the “pages” field for books.) If I don’t keep them out of (or remove them from) my BibTeX database, they clutter my end result. This little project is an attempt to get the fields I want for BibTeX processing in as few extra steps as possible.

Why doesn’t Zotero do all this by default?

Zotero’s aim is to a decent job of exporting all bibliographic metadata which you can reasonably get into a not-too-idiosyncratic dialect of BibTeX. A full, flexible BibTeX export within Zotero would introduce interface complexity which some feel would be orthogonal to Zotero’s design principles. Also such an export interface would be a decent bit bit of work for what amounts to a small (and reportedly hard to please) set of BibTeX+Zotero users (who us?). Furthermore postprocessing your database is an ancient tradition in the BibTeX world. Tools exist for this sort of thing.

This doesn’t mean that you shouldn’t feel free to start discussions about sensible defaults on the Zotero forums. But you’ll want to do so with a basic grasp of Zotero’s one-size-tries-its-best approach.

But

I couldn’t find a BibTeX postprocessor that would meet all my needs. They didn’t handle Unicode BibTeX data well. They had artifacts that I didn’t want, and there were some changes I wanted that they weren’t designed to produce. In addition I was glad for the simplicity of doing my whole export in one step. Needless to say there are other ways to do these things (a few are listed at the end of this page). Follow your bliss. This was mine.

My initial aim is just to make the modifications available to anyone who might finds them useful. I’ll do my best to help you with modifications that you require, within the limits of my ability and time. I also have included code comments and instructions below to help you in making your own modifications.

Feel free to contact me with any comments, questions or patches:

 scot dot becker at gmail dot com

The BibTeX Exporters

At the moment, there are two. You can see sample BibTeX database output, and compare them with each other and Zotero’s stock export by comparing the sample output files in the ‘reference’ directory of this repository. Either download the “raw” files from their page or download the whole repository via the download link in the upper right (or, of course, git).

BibTeX (Low-Fat)

This makes the following changes from Zotero’s standard BibTeX export:

  1. Some fields are not exported. This is to prevent eager BibTeX styles from adding them to your citations when you don’t want them.
    • ISBN
    • ISSN
    • Library of Congress catalog number (‘lccn’)
    • DOI
  2. Some fields are exported only under certain conditions:
    • For any resources suspected of being primarily print-based, the URL is written to an ‘opt_url’ field instead of a ‘url’ field. This is to prevent BibTeX styles from adding the URL to citations of such resources, thus following the conventions of those quaint backwaters where words are still “printed” onto processed wood pulp, and citations are typically made to paper publications with no “U-R-L” thingy. The exporter still writes the URL (to the BibTeX ‘url’ field) for other resource types, and for journal articles with no page reference given. See “Notes” below for more.
    • The “pages” field is not exported for books (A patch for this has been submitted to the zotero-dev list.)
    • The ‘month’ field is not exported for articles which have an issue number. (Cf. for example Chicago Manual of Style 17.161–163). Some styles helpfully print this if it’s in your data file. It’s a way of giving the author flexibility. But with auto-generated data, it does the Wrong Thing.
  3. Other changes
    • The quality of the automatically generated keys has been improved:
      • No more merged title words
      • better exclusion of English particles from start of title field (for “name_TITLEWORD_year” key formation)
      • German, French and Spanish articles added to the list of words excluded from TITLEWORD
      • Names and titles with diacriticals now receive more gentle treatment in keys (no longer: “mnguez_potica_1980”, but now: “minguez_poetica_1980” for a book by “Dionisio Mínguez” called “Poética generativa”)
    • Exported Zotero tags in the LaTeX ‘keyword’ field are no longer LaTeX-escaped. This makes them easier to read and search. If you actually typeset them you may not want this, but that’s not a typical use of them, in my experience.

It’s easy enough to revert these changes individually by uncommenting the appropriate lines in the BibTeX-LowFat.js file, or by copying back the relevant bits from the stock BibTeX.js file.

It’s also not hard to eliminate other fields like “annote”, “extra” or “location”. Just look at the examples in the code. Even you can probably do it. :-)

BibTeX (Non-Fat)

If the aim of “Low-Fat” was to exclude those fields which get unintentionally used in citations, the aim of this style is to export only those core fields which are typically used for citation. Therefore it does not export copyright (“rights”), archive location (“location”), “abstract”, “notes”, or “keywords”. Even the “extra” field is cut. It also does not export the opt_url field (as “Low-Fat” does for resources which look like print resources. The ‘url’ field is kept for non-print items which contain the field.

This of course may go too far for many users, though (except for ‘extra’) I tried not to cut anything that was commonly used in citations for standard print resources and websites. But if you’re having severe problems with surplus data for your bibliographic style, it may be just the thing. The changes from the “Low-Fat” version are all omissions, so they are easy enough to reverse individually by uncommenting the lines or stanzas of code that mention them.

Proposed to Zotero

This is a version with what I think are uncontroversial changes which the stock Zotero BibTeX export could usefully adopt. (BibTeX key improvements and the removal of the “pages” field from @book items, ATM) A patch for these has been submitted to Zotero-dev and is awaiting approval. The file and the patch are found in the ‘proposed-for-stock-zotero’ directory.

If people were amenable, two further changes might be made to Zotero’s stock output: removing the LaTeX escaping from the “keywords” field, and (possibly) the use of the opt_url field where print-based items also have URLs, as many will when their metadata is harvested by Zotero. I didn’t feel like wading into this controversy.

Notes

The rule for exporting the URL is: a Zotero URL goes into ‘opt_url’ instead of ‘url’ for all resources which are of the type “book”, “thesis” or “book section” or for other resources which have a value in the “pages” field. This will mostly do the right thing, but not if you catalog resources which have both a citable URL and a page range. It does what I want. Instructions are included in the file for writing the URL to the BibTeX ‘url’ field (Zotero’s standard behavior).

Note that although I also added the import of ‘opt_url’ into Zotero’s URL field, this won’t happen automatically for you even after you install this file, since Zotero still uses its stock BibTeX translator for import. You may have to replace that one with this. (I haven’t tested it). Unless you do this you won’t be able to re-import those URL’s stored in the ‘opt_url’ field, so no data round-tripping. In practice you can’t expect decent data round-tripping since Zotero export to BibTeX is by nature lossy. (Zotero stores more information than BibTeX).

Use some care: I haven’t yet been able to figure out how Zotero decides which translator to use for BibTeX import. These edited copies reside in the same directory as your stock BibTeX import file, and I can’t that Zotero will use its stock importer by default. Nor do I know how to force it to use an edited one. At the moment, this doesn’t matter much, since (but for my addition of a facility to import the opt_url field) the importers are all the same. But potential conflict this is something to be aware of if and as the Zotero stock BibTeX import/export develops.

To Use

Download one of the BibTeX-xxxxx.js files. At the moment there are two:

Alternately you can clone the repository, which contains proposed patches for Zotero, test data, and sample outputs for each of the exporters.

git clone git@github.com:commonman/zotero-bibtex-sb.git

Then, drop your choice of BibTeX-xxxxxx.js files into your zotero/translators directory, which is usually within your Firefox profile. To find out where your zotero data directory is, look in:

Zotero Preferences –> Advanced –> Show Data Directory

The ‘translators’ directory is inside that. When you restart Firefox, the new translator will be available when you export.

Yet To Do

  • Add a function to ascii-ize the author’s last name and title words of the key rather than just removing the non-ascii letters (which makes for ugly keys when citing names and titles that have diacriticals).
  • Add the ability to specify what fields do/don’t get exported as a variable at the top of the file
  • Remove HTML markup from ‘notes’ field. At the moment, I get ugly LaTeXification of HTML. With junk like this: {\textless}p{\textgreater}

Yuk.

Wish I could do but probably can’t

  • Add a link to the attached files in Zotero’s storage directory. I’ve looked at the code for this in a few other exporters, but I can’t grok it. If anyone has any ideas how to add this, I’d be happy to hear about it.

Roll your own

These files consist only of modified versions of Zotero’s standard BibTeX export file, which you can find in ‘[firefox-profile-dir]/zotero/translators/BibTeX.js’ If you want to change anything, you can edit the Javascript yourself. This is a little daunting if you’ve never done it before, but it’s not rocket science either. Have a look at the versions here—especially in a text editor that does syntax highlighting. These versions are commented to given some guidance to non-Javascripters in further modifications.

Of course if you do this, you should keep good backups of your Zotero data. Nothing should happen during export to corrupt your data normally, but be warned. Neither I nor (especially) the Zotero developers take responsibility for your data in any case. We take even less if you take things into your own hands. There.

If you want to make your own BibTeX exporter which will show up separately in the export list in Zotero (as opposed to modifying this one), do this:

(1) Start with the stock BibTeX.js file, or my BibTeX-LowFat.js (which at the moment is better commented)

(2) In a decent editor, change the “label” in the header to a name you like: (e.g. “BibTeX (My Prefs)”).

(3) Generate a unique GUID for the ‘translatorID’ field. Possibly online at a site like Create GUID. This is just a unique-in-the-history-of-the-world number so that your translator doesn’t get confused with anyone else’s.

(4) Make your edits. The easiest kind are edits of omission, which you will usually make either at the beginning of the file in the ‘fieldMap’ variable, or at the end of the file in the ‘doExport’ function. The file is huge, but almost all is taken up by the large translation tables. See the lines commented out with ‘//’ for examples of, well, commenting lines out.

(5) Save your new translator and put it into zotero/translators with a new file name. It should show up in your list of exporters after you restart Firefox. I keep a small test bibliography (in reference/TestBib of this repository) to test my modifications on a small dataset. This makes it easy to see differences in the generated BibTeX entries.

The future

I’d love to see a flexible BibTeX export for Zotero, either in Zotero itself or—what seems more likely—in a purpose-built BibTeX export plugin for Zotero, perhaps following on the work of LyZ.

In a world where lots of databases and bibliography tools export ‘BibTeX’, BibTeX data files are increasingly likely to have ‘surplus’ data, useful and standard data fields which may be unused in any given publishing project. Because of this is would be smart if BibTeX styles and their descendants allow for extra data in the database files they use.

To my mind this means that such styles need to include mechanisms to specify at the document level (and for flexible styles like Chicago, at the point of citation as well) whether any of the optional fields (ISBN, DOI, URL, etc.) should be used.

It may be that as BibTeX evolves to use better data storage formats and ‘real’ databases as its backend, that it will be able to directly access the databases of Zotero and other modern bibliography managers. This will make it even more necessary for whatever then passes for BibTeX .bst styles to allow for flexibility in the matter of what bibliographic data they make use of in the act of citing.

Alternatives to this approach

If your BibTeX output from Zotero contains fields you don’t want, there are also other options for getting good output.

(1) Use a BibTeX style which just ignores the extra fields and does the right thing–if such a style exists for you. Some BibTeX files automatically ignore the ‘month’ field for journal articles, for example, if it’s not necessary for citation. You may be able to use such a style or to modify your existing BibTeX style so that it doesn’t use the fields you don’t want. This is less possible for citation styles which leave considerable control in the hands of the author and editor (e.g. Chicago notes), because those require that the author be able to make such decisions based his or her sense of what is required for the citation of any individual resource. Of course if you need that level of flexibility then postprocessors and this exporter won’t be of much help either. You’ll need to find a style that lets you specify inclusions and omissions on a per-citation basis (biblatex-chicago-notes-df will add some of this kind of capability), or else you’ll be stuck maintaining a BibTeX database by more manual means.

(2) Postprocess your BibTeX files using a text editor, a BibTeX reference manager like JabRef, a scripting language (Perl, bash, python or sed) or a dedicated BibTeX postprocessor like ‘bibtool’ or the new ‘bibtexformat’. These last two are particularly recommended. Bibtool is a venerable old thing which is highly flexible. It will not handle UTF-8 BibTeX though, whatever you do. BibTeXformat is a newer project written in Perl which does a smaller number of transformations, but is well-documented and currently in active development. It does not handle UTF-8 BibTeX either, but its author assures me (June 2010) that this limitation will disappear with the next release. He also plans to offer an ability to remove fields based on item types.

(3) If you use LyX for writing LaTeX, check out the new Firefox plugin LyZ, which maintains a BibTeX based of the works cited in a particular LyX document.

(4) Mendeley has the ability to automatically update its own database based on your Zotero collection (read-only) and to keep a BibTeX file up-to-date with exports from that collection. Though since Mendeley is a closed-source program you may have limited control over the BibTeX export Of course you can still make a copy of your BibTeX database and postprocess it (2).

I chose to tweak the Zotero export files simply because it lets me keep Zotero as my main database (rather than just as a collection tool) without having to run a postprocessor every time I export modifications to my data. I also wanted my BibTeX database in UTF-8 encoding, which Zotero does well, but which the postprocessors at the time did not.

I have a big-ish database of 1800 items. Since it still exports in under a minute, I just export the whole thing afresh when I want to update my BibTeX file with the latest from my Zotero database. It keeps export to a single step.

About

Changes to Zotero's BibTeX export (NOW BROKEN as of new Zotero version)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published