Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

author-only, suppress-author, composite #117

Merged
merged 60 commits into from
Jul 2, 2021
Merged

author-only, suppress-author, composite #117

merged 60 commits into from
Jul 2, 2021

Conversation

cormacrelf
Copy link
Collaborator

@cormacrelf cormacrelf commented Jun 30, 2021

Fixes #114, available in @citeproc-rs/wasm@0.0.0-canary-0157c4b

See the updated crates/wasm/README.md for JS-oriented documentation (diff). I made it citeproc-js-compatible, which has unfortunately muddied the whole kebab-case vs camelCase situation. At least now Cite and Cluster are compatible with the CSL-JSON schemas.

There are some minor incompatibilities with citeproc-js; the only divergent tests from the relevant citeproc-js fixtures/local tests are:

  • discretionary_ExampleUnauthored.txt, which remains failing.
  • discretionary_AuthorOnly.txt has a snapshot reflecting what I think is an inconsistency with discretionary_SuppressAuthor, where AuthorOnly considers a date (issued) to be an author to be extracted, but SuppressAuthor does not suppress it.

Other implementation notes:

  • None of the citeproc-js tests appear to test what happens when the names block is not at the beginning of the rendered cite. So not sure what should happen there; currently only leading names & titles can be extracted.
  • The <intext> element has been feature flagged. Add <features><feature name="custom-intext" /></features> to use, or force-enable that feature globally via Driver.new as described in the wasm README.
  • a bunch of new tests for adding locators and affixes to grouped and collapsed cites, and the delimiters to be used in those situations.

Internals:

  • New IrTree, IrTreeRef and IrTreeMut that represent NodeId + owned/borrowed/mutable IrArena. All the IR transforms are moved to functions defined on these types.
  • Rewritten and much cleaner/simpler cite grouping and collapsing to accomodate Composite mode
  • Notably construction of a cluster out of its cites is formalised and given names, like LayoutStream
  • Improvements to the csl::Features API including a way to specify enabled-by-default CSL features in the processor's initialisation options.
  • The test-utils crate is moved into crates/citeproc/tests/test_format

means using <intext> without activating the feature is a parse error.
adds a hint for activating the feature.
@adomasven
Copy link
Member

This is great work!

  • None of the citeproc-js tests appear to test what happens when the names block is not at the beginning of the rendered cite. So not sure what should happen there; currently only leading names & titles can be extracted.

Would it be possible for parseStyleMetadata() to return a property for this, that way citeproc-rs clients can select to display or not to display the option for their users? Something like authorSuppressable: boolean

I made it citeproc-js-compatible, which has unfortunately muddied the whole kebab-case vs camelCase situation.

Since citeproc-rs API is incompatible in many ways with citeproc-js, I don't think it makes sense to maintain the compatibility here. Is there another reason for this (like test compatibility)? It would be much neater if the library maintained consistent camelCase across the board.

discretionary_ExampleUnauthored.txt, which remains failing.

I am mostly completely uninformed on the citeproc test suite, but I wonder why this is failing.

@cormacrelf
Copy link
Collaborator Author

when the names block is not at the beginning of the rendered cite

This is not really something I would make user-configurable, it's more of a CSL 1.1 spec problem that just needs a decision. As the citeproc-js docs are written I think my choice might be wrong, it should probably work anywhere because why have the limitation at all? I don't think it needs an urgent answer yet. Moreover, it is not possible to discover at parse time whether a style supports it, since by leading I mean "the first rendered" not the first to appear syntax-wise in the XML. The leading node can be as nested as you want under many groups, and all sorts of empty content can appear beforehand (e.g. choose blocks with no output) and it works just fine.

What would be more helpful I think is having it configurable in the previewCitationCluster API. That way you can tick the box and see the result before saving the cluster in your document. Currently that doesn't work but I can punt it into a follow up PR if you want it.

Is there another reason for this (like test compatibility)?

Yes, the citeproc-js-format test suite code was able to be simplified by a small amount. But that's not super important, I can just add a second Cluster struct to the test suite again.

I think I'll do that. While I'm at it, I'll probably change the Cite API to match the Cluster mode syntax, altogether like so, matching includeUncited("None").

// only these two options for cites
let citeAO = { id: "jones2006", mode: "AuthorOnly" };
let citeSA = { id: "jones2006", mode: "SuppressAuthor" };

// additional settings for clusters
let clusterAO       = { id: "one", cites: [...], mode: "SuppressAuthor" };
let clusterSA       = { id: "one", cites: [...], mode: "SuppressAuthor" };
let clusterSA_First = { id: "one", cites: [...], mode: "SuppressAuthor", suppressFirst: 3 };
let clusterC        = { id: "one", cites: [...], mode: "Composite" };
let clusterC_Infix  = { id: "one", cites: [...], mode: "Composite", infix: ", whose book" };
let clusterC_Full   = { id: "one", cites: [...], mode: "Composite", infix: ", whose books", suppressFirst: 0 };

discretionary_ExampleUnauthored.txt, which remains failing.
I am mostly completely uninformed on the citeproc test suite, but I wonder why this is failing.

I haven't figured out what I want to do about unauthored items. @fbennett chose to force them to emit [NO_PRINTED_FORM], but the only citation style I've ever had to use (AGLC) has a commonly used form where a case is introduced in-text with only its title, and a footnote added that omits the title. This is the diff at the moment, showing that lone titles in cites with no author at all are perfectly good author-only extractions in citeproc-rs:

 (where [1] is cluster author-only, [2] is cluster suppress-author, [3] is cluster composite
 and - lines are citeproc-rs, + lines are citeproc-js
 [0] (Unauthored thing 1964)
-[1] Unauthored thing
-[2] (1964)
-[3] Unauthored thing (1964)
+[1] [NO_PRINTED_FORM]
+[2] (Unauthored thing 1964)
+[3] [NO_PRINTED_FORM] (Unauthored thing 1964)

@cormacrelf
Copy link
Collaborator Author

Small follow up on the first question: another reason why previewing would work better is that not every item works with it, even if the style as a whole has the potential to extract a non-empty author-only. Obviously this is because some things are unauthored. So you basically have to check every time anyway.

@adomasven
Copy link
Member

What would be more helpful I think is having it configurable in the previewCitationCluster API

Yep, at least for Zotero we definitely need to support all cite customizations in the preview call.

I haven't figured out what I want to do about unauthored items. @fbennett chose to force them to emit [NO_PRINTED_FORM], but the only citation style I've ever had to use (AGLC) has a commonly used form where a case is introduced in-text with only its title, and a footnote added that omits the title.

The example seems like abuse of the API, maybe. Maybe it makes sense to substitute the title for the author when no author is available, but that has to be documented and maybe discussed with the CSL people.

@cormacrelf
Copy link
Collaborator Author

all cite customizations

To be clear the cite-level ones, e.g. { id: "jones2006", prefix: "see their other work", mode: "SuppressAuthor" } currently work from the previewCitationCluster API, simply add a mode flag to any cite that you're passing in. It is only the cluster level flags { id: "cluster-one", cites: [...], mode: "Composite" } that aren't yet exposed there. To get that, I'll have to add a parameter to let you supply the preview cluster's cluster mode.

While I'm at it breaking compatibility in favour of usability, I might also kill the reuse of the field name "id" to refer to every different kind of ID. clusterId on a Cluster, refId on a Cite because it points to a Reference. Then you won't have so many accidents like earlier in this issue.

Change is quite succinct overall, see IrTreeRef::leading_names_block_or_title

+discretionary_ExampleUnauthored.txt
this brings -rs mostly in line with -js, I believe.
adds test case whose output matches citeproc-js exactly.
+intext_AuthorOnly_Title.yml
+intext_AuthorOnly_DateYearSuffix.yml (to test a Substitute with a Seq inside it)
@cormacrelf
Copy link
Collaborator Author

Latest changes bring -rs in line with -js on discretionary_ExampleUnauthored and some new tests I added. Now any names block or any output of <substitute> can be extracted. Still limited to a names output at the beginning of a rendered cite, but that appears to be correct as changing it to anywhere in the rendered cite breaks some tests.

@cormacrelf cormacrelf merged commit 0157c4b into master Jul 2, 2021
@cormacrelf cormacrelf deleted the suppression branch July 2, 2021 07:17
@cormacrelf cormacrelf added A-core Area: affects all builds of citeproc-rs enhancement New feature or request A-crates/citeproc Area: citeproc crate A-wasm Area: wasm package on npm A-internals Area: internal implementation details A-crates/csl Area: csl crate breaking Breaking change labels Sep 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-core Area: affects all builds of citeproc-rs A-crates/citeproc Area: citeproc crate A-crates/csl Area: csl crate A-internals Area: internal implementation details A-wasm Area: wasm package on npm breaking Breaking change enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

suppress-author and friends
2 participants