Remove extraneous fields from CSL references #47

Closed
dhimmel opened this issue Aug 6, 2018 · 1 comment

dhimmel commented Aug 6, 2018

Some of our methods for generating CSL items (reference metadata) produce many extraneous fields. This is most acute for Crossref DOIs, whose metadata contains many fields beyond those in the CSL specification. Here are some examples:

    "content-domain": {
      "domain": [],
      "crossmark-restriction": false
    },
    "link": [
      {
        "URL": "https://syndication.highwire.org/content/doi/10.1126/science.aaf5675",
        "content-type": "unspecified",
        "content-version": "vor",
        "intended-application": "similarity-checking"
      }
    ],

This makes our CSL references.json file unnecessarily large. Some of the fields are helpful, but they are unnecessary for creating the bibliography.

See the machine-readable schema for CSL here, which includes what fields are supported.

My proposal is to filter fields using the CSL Data Schema (possibly as an option, but enabled by default). There may even be a way to automatically detect and delete fields that don't conform to the schema, to avoid downstream issues.
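A minimal sketch of the filtering idea, assuming we only need to drop unknown top-level fields. The schema excerpt, the `prune_csl_item` name, and the placeholder `id` and `title` values are illustrative assumptions, not the full CSL Data Schema or manubot's actual implementation:

```python
import json

# Illustrative excerpt only: the real CSL Data Schema defines many more
# properties. The real schema would be loaded from its JSON file instead.
CSL_SCHEMA_EXCERPT = {
    "type": "object",
    "properties": {
        "id": {},
        "type": {},
        "title": {},
        "DOI": {},
        "URL": {},
        "container-title": {},
        "author": {},
        "issued": {},
    },
    "additionalProperties": False,
}


def prune_csl_item(csl_item, schema=CSL_SCHEMA_EXCERPT):
    """Return a copy of csl_item keeping only top-level fields named in the schema."""
    allowed = set(schema["properties"])
    return {key: value for key, value in csl_item.items() if key in allowed}


# Hypothetical CSL item mixing schema fields with the Crossref extras shown above.
crossref_item = {
    "id": "example-id",  # placeholder value
    "title": "Example title",  # placeholder value
    "DOI": "10.1126/science.aaf5675",
    "content-domain": {"domain": [], "crossmark-restriction": False},
    "link": [
        {
            "URL": "https://syndication.highwire.org/content/doi/10.1126/science.aaf5675",
            "content-type": "unspecified",
            "content-version": "vor",
            "intended-application": "similarity-checking",
        }
    ],
}

pruned = prune_csl_item(crossref_item)
print(json.dumps(pruned, indent=2))
```

This only handles extra top-level properties. Detecting nested or type-related violations would need a real JSON Schema validator run against the complete schema.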

dhimmel commented Aug 6, 2018

dhimmel added a commit to dhimmel/manubot that referenced this issue Aug 7, 2018
dhimmel added a commit to dhimmel/manubot that referenced this issue Aug 12, 2018
Refs manubot#47

CSL: replace arxiv_id with archive_location

Travis: install package using pip

Attempt to fix
python-jsonschema/jsonschema#449

--prune-csl option for manubot cite

Only remove a single additional property sub_error

Workaround the effect of
citation-style-language/schema#154

Switch to dhimmel/schema CSL JSON

Move validation to remove_jsonschema_errors

Test CSL pruning

Improve CSL pruning documentation

Default to pruning unless --bad-csl flag supplied

DOI CSL retriever: use shortDOI for URL

Switch CSL pruning logging to DEBUG

Update manubot cite help in README

arxiv citeproc: use int for date-parts
dhimmel added a commit to dhimmel/manubot that referenced this issue Aug 13, 2018
Prune CSL Items to validate JSON schema

dhimmel added a commit to dhimmel/manubot that referenced this issue Aug 14, 2018
Prune CSL Items to validate JSON schema

dhimmel added a commit that referenced this issue Aug 16, 2018
Merges #49
Closes #47

* Prune CSL Items to validate against JSON schema
* Travis: install package using pip
* Update manubot cite help in README
* DOI CSL retriever: use shortDOI for URL
* arxiv citeproc: replace arxiv_id with number
* arxiv citeproc: use int for date-parts
dhimmel added a commit to dhimmel/manubot that referenced this issue Aug 16, 2018
Prune CSL Items to validate JSON schema
