misunderstanding test case sort_BibliographyCitationNumberDescending #1

Open
seboettg opened this Issue Feb 24, 2017 · 11 comments

Comments

Projects
None yet
4 participants

Hi everyone,

I have a question concerning the following test case:
https://github.com/citation-style-language/test-suite/blob/master/processor-tests/humans/sort_BibliographyCitationNumberDescending.txt

The above test case aims to test sorting behaviour. But I do not understand this test case for different reasons:

  1. The sorting key is citation number (<key variable="citation-number" sort="descending"/>), but the items to test does not contain any field citation-number and i did not found any mapping between "citation-number" and "id" in the specification. How do I know that citation-number means id?
  2. Assuming id is the mentioned sorting key... id contains "item-i" but the expected results are single numbers. According to my understanding the line <text variable="citation-number" prefix="[" suffix="] "/> would result in "[item-1]", "[item-2]" and so on...
  3. IMHO, the sorting key is the key that will be compared in order to bring the associated values in the specified order. In this case it would mean that the order of the items will changed from the unsorted: > item-1, item-3, item-4, item-6, item-2, item-5, item-7, item-8
    to the sorted:

[8] Book Eight
[7] Book Seven
[6] Book Six
...

but the expected result is:

[8] Book One
[7] Book Three
[6] Book Four

This means, the association between the key and the item will took out.

Either I didn't understand the sense of sorting or the test case is completely wrong. It would be great if someone can explain the sense of this test case or clarify my misunderstanding.

Thank you in advance,
Sebastian.

Owner

rmzelle commented Feb 24, 2017

It's my understanding that the "id" field has no relation with "citation-number" whatsoever, and that the latter is purely determined by the order in which the items are stored in the array. See http://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html?highlight=item-1#id-field and http://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html?highlight=item-1#citation-items-container.

That said, the results do look wrong, although I've never seen a style that wants you to count backwards for "citation-number". @fbennett?

Member

fbennett commented Feb 25, 2017

@seboettg, @rmzelle: Nice one. Thanks for questioning this. Playing with the test, it's clear that sorts on citation-number in citeproc-js are buggy, so there's definitely some work to do; but citation-number itself is a little slippery, and we'll need to pin down exactly what we mean by it before fixing things up.

Rintze is right that the citation-number displayed against a bibliography or citation item has no relation to the item ID. The default behavior is pretty clear:

  1. If no sorting is performed, items are listed in document order (or, if only a bibliography is generated, the the order in which they are submitted to the processor).
  2. citation-number values are normally assigned in bibliography order.
  3. It follows from (1) and (2) that an ascending sort on citation-number alone is a no-op, producing the same result as an "unsorted" bibliography:
    <sort>
      <key variable="citation-number"/>
    </sort>

So far, so good, but up to this point, citation-number might have either (or both) of two roles:

  • As a display variable, it is an ascending sequence with 1-origin, assigned to items in a preexisting sort order.
  • As item data, it expresses document position in a 1-origin sequence.

Thinking through the possibilities ...

  • (a) If treated strictly as a display variable, sort="descending" should reverse the order of the citation-number sequence, without affecting the order of items themselves. That appears to be what the test expects at the moment. (While poking around, I've discovered that citeproc-js doesn't behave this way when citation-number is mixed with other sort keys---that's definitely a bug, but not raised by this test.)

If citation-number is treated as item data for sorting purposes, setting sort="descending" on it could be handled in one of four ways:

  • (b) It could reverse the existing sort order, and assign citation-number in reverse sequence (ending in 1);
  • (c) It could reverse the existing sort order, and assign citation-number in ascending sequence with 1-origin;
  • (d) It could treat document sequence as an ordinary sort key, attempt to sort in reverse document order, and assign citation-number in reverse sequence (ending in 1); or
  • (e) It could treat document sequence as an ordinary sort key, attempt to sort in reverse document order, and assign citation-number in ascending sequence with 1-origin.

Looking at how these options play out when mixed with other keys, I'm inclined to think that (e) above is most correct. In that case, the test result would look like this (I've replaced the named numbers with numeric equivalents to facilitate other forms of the test):

<div>
   <div class="csl-bib-body">
  <div class="csl-entry">[1] Book 008</div>
  <div class="csl-entry">[2] Book 007</div>
  <div class="csl-entry">[3] Book 005</div>
  <div class="csl-entry">[4] Book 002</div>
  <div class="csl-entry">[5] Book 006</div>
  <div class="csl-entry">[6] Book 004</div>
  <div class="csl-entry">[7] Book 003</div>
  <div class="csl-entry">[8] Book 001</div>
</div>

Um ... what do you think?

Owner

rmzelle commented Feb 25, 2017

@seboettg, can you explain why you are interesting in this behavior? Are you programming a CSL processor, or do you want a CSL style that sorts items in reverse order?

Thanks for your replies. That clarify my confusions.



@rmzelle citeproc-php by Ron Jerome is not longer maintained by him, so I took this task some time ago. Since a couple of month I'm working for a completely rewritten version of citeproc-php, that will be able to render citations as well as bibliographies. And I would like to implement missing features like sorting. Also it aims in a lot of architecture improvements. I try to build the processor completely test-driven, and so I stumbled upon this test case.

@fbennett your proposal sounds good for me. Option (e) is most correct.

By the way: sort_BibliographyCitationNumberDescendingViaMacro caused the same wrongdoing.

Owner

rmzelle commented Feb 26, 2017

@rmzelle citeproc-php by Ron Jerome is not longer maintained by him, so I took this task some time ago.

Okay, great. So your fork completely superseded his repository? If so, I'll update http://citationstyles.org/developers/#CSL_Processors to point to https://github.com/seboettg/citeproc-php for citeproc-php.

Since a couple of month I'm working for a completely rewritten version of citeproc-php, that will be able to render citations as well as bibliographies.

Please let us know when you reach 1.0 for your rewrite!

Owner

rmzelle commented Feb 26, 2017

I try to build the processor completely test-driven, and so I stumbled upon this test case.

By the way, you should be aware that the current test suite has some issues in terms of its custom format and coverage. It was created by @fbennett specifically for development of citeproc-js, and we haven't yet put in a lot of work to make it an accessible resource for developers of other CSL processors. See http://xbiblio-devel.2463403.n2.nabble.com/CiteProc-for-NET-td7579469.html for some additional discussion.

I'd be happy to help curate the test suite to make sure that the CSL specification and test suite are in full agreement, and that we have good coverage of the various features of CSL, but, as a very amateur programmer, I'm not the best person to convert the current test suite to a more accessible shape. Any help with this would be more than welcome, and we should be able to pay a bounty as well. (cc @inukshuk, @fouke-boss, @fbennett)

By the way, you should be aware that the current test suite has some issues in terms of its custom format and coverage. It was created by @fbennett specifically for development of citeproc-js, and we haven't yet put in a lot of work to make it an accessible resource for developers of other CSL processors. See http://xbiblio-devel.2463403.n2.nabble.com/CiteProc-for-NET-td7579469.html for some additional discussion.

Rintze, Thank you for this hint. Before, I was getting in trouble with a lot of test cases, caused of these test cases and not because of a falsely implementation of the CSL specification. I was running in exactly the same problems as it Fouke has described in your mailing list.

I really like the approach of human readable test cases, that can be converted by a processor into the JSON format (or similar).

I would suggest to keep all test cases within one project (also the tests for the special features of citeproc-js). Within this project one could distinguish between "basic tests" and "special tests". Latter would contain all the special tests for citeproc-js. This course of action would avoid redundant maintaining of different test case projects.

I'd be happy to help curate the test suite to make sure that the CSL specification and test suite are in full agreement, and that we have good coverage of the various features of CSL, but, as a very amateur programmer, I'm not the best person to convert the current test suite to a more accessible shape. Any help with this would be more than welcome, and we should be able to pay a bounty as well.

Yes, I would like to help with this.

Please let us know when you reach 1.0 for your rewrite!

I guess I will reach a early alpha version in the next weeks.

Member

fbennett commented Mar 15, 2017 edited

If there is any way I can help, let me know. The tests in the citation-styles repo are meant to be CSL-conformant. For several years, as I've come across tests of citeproc-js-specific features, I've moved them to the "local" tests subdir of the citeproc-js-specific source repo. If it will help, I can spin that subdir off to a separate repo and give write privileges to anyone working on test-suite reorganization.

The chaotic overlapping coverage of the test fixtures (that is, coverage of multiple CSL features in most tests) is a headache in early development, but I've come to appreciate its virtues. There's a lot of chaos in the wild, and I've lately not been caught out much by unexpected patterns of input, thanks to the unsystematic nature of the test suite. So there's that too.

Owner

rmzelle commented Mar 15, 2017 edited

If there is any way I can help, let me know. The tests in the citation-styles repo are meant to be CSL-conformant. For several years, as I've come across tests of citeproc-js-specific features, I've moved them to the "local" tests subdir of the citeproc-js-specific source repo. If it will help, I can spin that subdir off to a separate repo and give write privileges to anyone working on test-suite reorganization.

@fbennett, ah, thanks. We're probably fine with the current split, then. @fbennett, @inukshuk, if you would be creating a test suite for CSL from scratch, what would be different? Is it mainly that you would pick a format such as Cucumber? Anything else? Is there e.g. anything we should do to make it easier to allow for continuous integration testing (e.g. via Travis CI) for the various CSL processor repositories?

Member

inukshuk commented Mar 15, 2017

The main benefit of something like the current text format (or Cucumber) is that it is relatively easy for everyone to add test files. Having said that, the JSON format is great because it is so easy, nowadays, to use it as is, or to convert it to a more suitable format for your platform.

A few years ago, I converted most of the tests from JSON to cucumber for citeproc-ruby, because that gave me the test runner for free. The resulting Cucumber files are OK to read but not necessarily a big improvement on the original text files. Cucumber would allow to add a lot of syntactic sugar to improve on this, but a simple text format with corresponding JSON is hard to beat in terms of interoperability. The fact that the tests have been used by (all? most?) citeproc implementers speaks for itself.

Having said that, the tests are great for integration tests; during development I wrote most tests for citeproc-ruby as unit tests and also added Cucumber tests which work on isolated CSL nodes: these kind of tests were extremely useful, because they allow you to test parts of a style in isolation (I often just copied examples from xbiblio verbatim into the test suite). Something along those lines would be a great addition to the test-suite (if it does not support it already, I'm afraid I'm not up to date!)

@rmzelle I've just released the new version of citeproc-php. It would be great if you set a link to https://github.com/seboettg/citeproc-php on citationstyles.org.

I guess a main improvement would be to add a description for every single test case. Not infrequently one do not understand what a test case intend to test exactly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment