Add script that converts the catalyst_cites.bib to a CSV for reporting #3436

Draft
wants to merge 7 commits into
base: main
from


bendnorman
Member

Overview

This PR:

  1. Creates a script that converts catalyst_cites.bib to a CSV for Sloan reporting. The script is hardcoded to use Sloan's requested format, but we can extend it later to handle multiple formats if needed.
  2. Standardizes the author format to {First name} {middle initial} {last name}, as opposed to {Last name}, {first name} {middle initial}. Is this kosher from a citation standpoint?
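For illustration only, the two steps above might look roughly like the sketch below. It is not the script from this PR: it assumes the BibTeX entries have already been parsed into plain dicts, and the function names (`first_last`, `entries_to_csv`), the column list, and the `"; "` author separator are all hypothetical placeholders rather than Sloan's actual format.

```python
import csv
import io


def first_last(name: str) -> str:
    """Rewrite "Last, First M." as "First M. Last"; leave other names unchanged."""
    if "," not in name:
        return name.strip()
    last, first = (part.strip() for part in name.split(",", 1))
    return f"{first} {last}"


def entries_to_csv(entries: list[dict[str, str]], columns: list[str]) -> str:
    """Write already-parsed BibTeX entries to CSV text.

    `columns` is a placeholder for whatever the report format requires;
    fields not listed there are dropped, and missing fields are left empty.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for entry in entries:
        row = dict(entry)
        # BibTeX separates multiple authors with the word "and".
        authors = [first_last(a) for a in entry.get("author", "").split(" and ")]
        row["author"] = "; ".join(authors)
        writer.writerow(row)
    return buffer.getvalue()


example = [
    {"title": "Grid Study", "author": "Von Meier, Alexandra and Jane Q. Doe", "year": "2020"}
]
print(entries_to_csv(example, ["title", "author", "year"]))
```

Note that flattening "Von Meier, Alexandra" to "Alexandra Von Meier" discards the comma that marks where the family name starts, so this kind of standardization is lossy for multi-word surnames.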

Testing

How did you make sure this worked? How can a reviewer verify this?

To-do list

  1. Ensure docs build, unit & integration tests, and test coverage pass locally with make pytest-coverage (otherwise the merge queue may reject your PR)
  2. For significant ETL changes, ensure the full ETL runs locally
  3. For major data coverage & analysis changes, run data validation tests
  4. If updating analyses or data processing functions: make sure to update or write data validation tests
  5. Update the release notes: reference the PR and related issues.
  6. Review the PR yourself and call out any questions or issues you have

@bendnorman bendnorman marked this pull request as ready for review February 28, 2024 00:13
@bendnorman bendnorman marked this pull request as draft February 28, 2024 00:22
@bendnorman
Member Author

@zaneselvans I misread the Sloan guidelines so we don't actually need this script! It might be worth keeping the author standardization from this PR. Is there a reason why some authors' first and last names are in different orders?

@zaneselvans
Member

zaneselvans commented Feb 28, 2024

Many of these citations are exported directly from the journals or other publications rather than compiled by hand, and some of them use "last, first middle" to identify surnames more accurately. E.g. "Alexandra Von Meier" would identify her last name as "Meier", while "Von Meier, Alexandra" correctly indicates that the family name is composed of 2 words. BibTeX formatting templates parse out these first/last names, using the word "and" as a separator between authors, and reformat them depending on how the particular citation template expects names to be displayed.

It looks like there are some hard-coded parsing rules that understand particles like "von". In special cases, using only the comma-separated elements within an individual name and the word "and" to separate names should work, and in the standard cases the normal BibTeX parser should understand the different possible formats.

https://www.bibtex.com/f/author-field/
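To make the parsing behavior described above concrete, here is a rough sketch of how a BibTeX-style parser might split names. It is a simplification written for illustration, not BibTeX's actual implementation: `split_authors` and `parse_name` are hypothetical helpers, brace-protected text and the "Last, Jr, First" form are not handled, and the only particle rule is "a word starting with a lowercase letter".

```python
import re


def split_authors(author_field: str) -> list[str]:
    """Split an author field into individual names on the word "and"."""
    parts = re.split(r"\s+and\s+", author_field.strip())
    return [part.strip() for part in parts if part.strip()]


def parse_name(name: str) -> dict[str, str]:
    """Simplified split of one name into first / von / last parts."""
    if "," in name:
        # "Last, First": everything before the comma is the family name,
        # apart from leading lowercase particles such as "von".
        family, given = (part.strip() for part in name.split(",", 1))
        words = family.split()
        i = 0
        while i < len(words) - 1 and words[i][0].islower():
            i += 1
        return {"first": given, "von": " ".join(words[:i]), "last": " ".join(words[i:])}

    words = name.split()
    if not words:
        raise ValueError("empty name")
    # "First von Last": lowercase words before the final word are particles;
    # without any, only the final word is taken as the family name.
    lower = [i for i, word in enumerate(words[:-1]) if word[0].islower()]
    if not lower:
        return {"first": " ".join(words[:-1]), "von": "", "last": words[-1]}
    start, end = lower[0], lower[-1]
    return {
        "first": " ".join(words[:start]),
        "von": " ".join(words[start : end + 1]),
        "last": " ".join(words[end + 1 :]),
    }


for author in split_authors("Alexandra Von Meier and Von Meier, Alexandra and Ludwig van Beethoven"):
    print(author, "->", parse_name(author))
```

Under these assumptions, "Alexandra Von Meier" yields the last name "Meier" because the capitalized "Von" is not recognized as a particle, while "Von Meier, Alexandra" keeps "Von Meier" together as the family name.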

Base automatically changed from new-bibtex-refs to main February 28, 2024 19:34