This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

Extended locators not always detected for combination with URL in note styles #168

Closed
adunning opened this issue Sep 16, 2015 · 11 comments

@adunning
Contributor

While pandoc-citeproc correctly moves the URL or DOI after the page number or other locator in a note, it only does this (as far as I can tell) for locators that (a) are purely numerical and (b) have a known locator type. Thus, in the following example (with 0.7.3), the first two citations are rendered as expected, but the last two are not.

I recognize that fixing this could be difficult. For instance, one of my citations contains the string “pp. 41 (no. 58, art. 6.18), 163 (no. 201, art. 71), 201 (no. 241, art. 9)”, which I would consider to be a series of locators, but pandoc-citeproc would treat most of it as a suffix. Allowing added locators to be bracketed would probably fix most of the problems here.

pandoc  -t commonmark -F pandoc-citeproc << EOT

---
references:
- id: test1
  DOI: 10.3243/424234
- id: test2
  DOI: 10.3243/424234
- id: test3
  DOI: 10.3243/424234
- id: test4
  DOI: 10.3243/424234
csl: chicago-fullnote-bibliography.csl
...

[@test1, p. 1]

[@test2, pp. 1, 3]

[@test3, pp. 234–244 (chap. 1)]

[@test4, fols. 85r–88r]
EOT

Results:

\[1\]

\[2\]

\[3\]

\[4\]

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

1.  N.d., 1, [doi:10\.3243/424234](http://doi.org/10.3243/424234). 

2.  N.d., 1, 3, [doi:10\.3243/424234](http://doi.org/10.3243/424234). 

3.  N.d., 234–44, [doi:10\.3243/424234](http://doi.org/10.3243/424234)
    (chap. 1\). 

4.  N.d., [doi:10\.3243/424234](http://doi.org/10.3243/424234), fols.
    85r–88r.
@njbart
Contributor

njbart commented Sep 16, 2015

Try:

pandoc  -t commonmark -F pandoc-citeproc << EOT
---
references:
- id: test1
  DOI: 10.3243/424234
- id: test2
  DOI: 10.3243/424234
- id: test3
  DOI: 10.3243/424234
- id: test4
  DOI: 10.3243/424234
csl: chicago-fullnote-bibliography.csl
...

[@test1, ch. 1]

[@test2, chap. 1, chap. 3]

[@test3, chapter 1)]

[@test4, folios 85r–88r]
EOT

Output:

\[1\]

\[2\]

\[3\]

\[4\]

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

n.d. [doi:10\.3243/424234](http://doi.org/10.3243/424234).

1.  N.d., [doi:10\.3243/424234](http://doi.org/10.3243/424234), ch. 1\.

2.  N.d., chap. 1, [doi:10\.3243/424234](http://doi.org/10.3243/424234),
    chap. 3\.

3.  N.d., chap. 1,
    [doi:10\.3243/424234](http://doi.org/10.3243/424234)).

4.  N.d., f. 85r–r,
    [doi:10\.3243/424234](http://doi.org/10.3243/424234).

So some strings are detected as locator terms. I also tried the unabbreviated forms of the locator terms listed in the CSL specs (book, chapter, column, figure, folio, issue, line, note, opus, page, paragraph, part, section, sub verbo, verse, volume; see http://docs.citationstyles.org/en/stable/specification.html#locators); all of these except “sub verbo” are rendered with an abbreviated term and placed in front of the DOI, so pandoc-citeproc must have detected them. Plurals (“folios”) and some abbreviations (“pp.”, “vol.”) are detected, too; other abbreviations (“ch.”) aren’t. Strangely enough, I can find the locator terms in the CSL locale files but nowhere in the pandoc or pandoc-citeproc source.

Suggestions:

  • fix “sub verbo”
  • fix handling of “number” ranges like “85r–88r”
  • allow more than one locator: as mentioned in the OP, you might want [@test4, vol. 7, folios 85r–88r] and the like (but check CSL specs on this; Zotero/LO offers one only, but maybe that’s just a GUI limitation)
  • not sure about round brackets in general, but with a known locator inside, sure
  • we really need to document this

@adunning
Contributor Author

Thanks for the further testing. The folio issue, at least, is logical enough, since that isn't the abbreviation used in the CSL locale files; I've opened citation-style-language/locales#115 to take care of this.

@adam3smith

CSL specs currently allow only one locator. It's come up before occasionally, but no aggressive movement to change this at the moment, I believe (not to say it wouldn't be a good idea).

@jgm
Owner

jgm commented Sep 26, 2015

Pandoc detects locator terms based on the locale file. Either abbreviated or unabbreviated forms are accepted. If no locator term is used, "page" is assumed.
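
As a rough illustration of the behaviour described above (not pandoc-citeproc's actual code), the lookup could be sketched in Python as follows. The term table is a small hypothetical excerpt rather than a real locale file, `split_locator` is an invented name, and treating unmatched non-numeric text as a plain suffix is an assumption based on how “ch. 1” was rendered earlier in this thread. The sketch also makes no attempt to separate a locator from any suffix that follows it.

```python
# Hypothetical excerpt of locator terms, loosely modelled on a CSL locale:
# every accepted form (long or short, singular or plural) maps to a type.
# A real locale file defines many more terms than this.
LOCATOR_TERMS = {
    "page": "page", "pages": "page", "p.": "page", "pp.": "page",
    "chapter": "chapter", "chapters": "chapter",
    "chap.": "chapter", "chaps.": "chapter",
    "volume": "volume", "volumes": "volume",
    "vol.": "volume", "vols.": "volume",
    "folio": "folio", "folios": "folio",
    "sub verbo": "sub-verbo", "sub verbis": "sub-verbo",
}


def split_locator(suffix):
    """Split the text after a citation key into (locator_type, value).

    Terms are tried longest first, so multi-word terms such as
    "sub verbo" are matched as a whole. If no term matches, a value
    starting with a digit is assumed to be a page; anything else is
    returned with type None, i.e. treated as an ordinary suffix.
    """
    text = suffix.strip().lstrip(",").strip()
    lowered = text.lower()
    for term in sorted(LOCATOR_TERMS, key=len, reverse=True):
        if lowered.startswith(term + " "):
            return LOCATOR_TERMS[term], text[len(term):].strip()
    if text[:1].isdigit():
        return "page", text
    return None, text
```

With this table, `split_locator(", pp. 1, 3")` yields a page locator, `split_locator("sub verbo pandoc")` matches the two-word term, and `split_locator("ch. 1")` falls through to the suffix case because “ch.” is not among the listed forms.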

@jgm
Owner

jgm commented Sep 26, 2015

So, what changes to pandoc-citeproc, if any, are needed here?

@njbart
Contributor

njbart commented Sep 26, 2015

  • fix “sub verbo”, “sub verbis”, …
  • fix incorrect abbreviation of ranges like 85r–88r => 85r–r
  • we really need to document this

@jgm
Owner

jgm commented Sep 26, 2015

> fix “sub verbo”, “sub verbis”, …

My guess is that the problem relates to these being two-word phrases -- they are in the locale file with "chapter", "page", etc. I'll look into this.

> we really need to document this

Can you say more about what needs documenting?

jgm added a commit that referenced this issue Sep 26, 2015
@jgm
Owner

jgm commented Sep 26, 2015

Fixed the "sub verbo" issue.

@njbart
Contributor

njbart commented Sep 26, 2015

On documentation: jgm/pandoc#2418
(Note that "ch." in the README was incorrect, at least for en-US and en-GB.)

@jgm
Owner

jgm commented Sep 26, 2015

On the 85r-88r issue, the culprit seems to be expandedRange:

*Text.CSL.Eval> expandedRange "88a" "89a"
("88a","a")

This part of the original citeproc-hs code is written in an idiom I find impenetrable, so it may take time to get to the bottom of this, and maybe we should simply rewrite the range collapsing code with an eye to the spec.

http://docs.citationstyles.org/en/stable/specification.html#appendix-v-page-range-formats
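
For reference, here is a minimal Python sketch of how range collapsing could avoid mangling endpoints such as 85r–88r. It is not a port of `expandedRange`; the function names are hypothetical, only the “minimal” format from the appendix linked above is covered, and leaving any non-numeric endpoint untouched is an assumption about the desired behaviour rather than something the spec text in this thread confirms.

```python
def expand_range(first, last):
    """Restore digits omitted from the end of a numeric range.

    ("234", "44") becomes ("234", "244"). Endpoints that are not
    purely numeric are returned unchanged.
    """
    if first.isdigit() and last.isdigit() and len(last) < len(first):
        last = first[: len(first) - len(last)] + last
    return first, last


def minimal_range(first, last):
    """Collapse a range in the style of the "minimal" format.

    Leading digits that the second number shares with the first are
    dropped, always keeping at least one digit. Ranges with a
    non-numeric endpoint (e.g. "85r", "88a") are left as they are.
    """
    first, last = expand_range(first, last)
    if not (first.isdigit() and last.isdigit()) or len(first) != len(last):
        return first, last
    i = 0
    while i < len(first) - 1 and first[i] == last[i]:
        i += 1
    return first, last[i:]
```

Under these assumptions `minimal_range("2787", "2816")` gives `("2787", "816")`, while `minimal_range("85r", "88r")` and `minimal_range("88a", "89a")` return their inputs unchanged instead of a bare letter.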

jgm added a commit that referenced this issue Sep 26, 2015
@jgm
Owner

jgm commented Jul 15, 2016

I'm no longer seeing the bad collapsing: in the example above I get

fols. 85r–88r. 

which is correct. So I'll close this.

@jgm jgm closed this as completed Jul 15, 2016