Re-run the script that adds the string sort attribute #177

Closed
wlpotter opened this issue Feb 24, 2023 · 6 comments

@wlpotter
Collaborator

Now that we have new data, we need to re-run the script so that the records have the string sort attribute, which lets records within the same work appear properly sorted with respect to the section ordering.

Add a test to this script so that it only runs on work groups that have at least one record lacking the n attribute. This will cut down on the time spent re-processing unchanged work groups and, more importantly, the time spent manually checking the ones that couldn't be sorted by the matching algorithm.
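
A minimal sketch of that test, not the project's actual script. It assumes each record has already been parsed into a dict with hypothetical `author`, `work`, and `n` keys, and that a missing or empty `n` means the record still needs a sort value:

```python
from collections import defaultdict


def groups_needing_sort(records):
    """Return only the author-work groups with at least one record lacking `n`."""
    groups = defaultdict(list)
    for record in records:
        # Group by the (author, work) strings, as the current string matching does.
        groups[(record["author"], record["work"])].append(record)
    return {
        key: members
        for key, members in groups.items()
        # A missing, None, or empty `n` all count as "lacks the attribute".
        if any(not member.get("n") for member in members)
    }


# Illustrative records only; the ids and n values are made up.
records = [
    {"id": "r1", "author": "Photius", "work": "Library", "n": "001"},
    {"id": "r2", "author": "Photius", "work": "Library", "n": None},
    {"id": "r3", "author": "Sozomen", "work": "Church History", "n": "001"},
]
pending = groups_needing_sort(records)
```

Here only the Photius group would be re-processed; the Sozomen group is skipped because every record in it already has an `n` value.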

@wlpotter wlpotter self-assigned this Feb 24, 2023
@wlpotter
Collaborator Author

Hmm I might have uncovered a second problem:

We have several cases of the same work being referred to separately. And since we don't have stable URIs for all authors and works, I have had to rely on string matching to create the author-work groups for sorting. Here are the offenders as of 2023-02-24:

Eusebius of Caesarea. Dictionary of Place Names
Eusebius of Caesarea. Ecclesiastical History
Eusebius of Caesarea. Historia ecclesiastica
Eusebius of Caesarea. Life of Constantine
Eusebius of Caesarea. Martyrs of Palestine
Eusebius. Dictionary of Place Names
Eusebius. Life of Constantine
Photius. Library
Photius. Library Library
Pliny the Elder. Natural History
Pliny. Natural History
Procopius of Gaza. Letter
Procopius of Gaza. Letters
Socrates of Constantinople. Church History
Socrates Scholasticus. Ecclesiastical History
Sozomen. Church History
Sozomenus. Church History

We will need some data normalization before we can reliably run an updated string sort. @davidamichelson we should discuss.

The string sort is not essential, and ideally we'd just run it once more before official publication, so this is not a priority.

@davidamichelson
Collaborator

davidamichelson commented Mar 1, 2023

@wlpotter here are the revisions for making these uniform.

Eusebius of Caesarea. Historia ecclesiastica -> Eusebius of Caesarea. Ecclesiastical History
Eusebius. Dictionary of Place Names -> Eusebius of Caesarea. Dictionary of Place Names
Eusebius. Life of Constantine -> Eusebius of Caesarea. Life of Constantine
Photius. Library Library -> Photius. Library
Pliny. Natural History -> Pliny the Elder. Natural History
Procopius of Gaza. Letters -> Procopius of Gaza. Letter
Socrates Scholasticus. Ecclesiastical History -> Socrates of Constantinople. Church History
Sozomenus. Church History -> Sozomen. Church History
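
One way to apply these revisions is a plain lookup table. This is a sketch only: it assumes each record carries its author and title as a single "Author. Title" label string, and `REVISIONS` and `normalize_label` are hypothetical names, not part of the existing script:

```python
# Old label -> uniform label, copied from the list of revisions above.
REVISIONS = {
    "Eusebius of Caesarea. Historia ecclesiastica": "Eusebius of Caesarea. Ecclesiastical History",
    "Eusebius. Dictionary of Place Names": "Eusebius of Caesarea. Dictionary of Place Names",
    "Eusebius. Life of Constantine": "Eusebius of Caesarea. Life of Constantine",
    "Photius. Library Library": "Photius. Library",
    "Pliny. Natural History": "Pliny the Elder. Natural History",
    "Procopius of Gaza. Letters": "Procopius of Gaza. Letter",
    "Socrates Scholasticus. Ecclesiastical History": "Socrates of Constantinople. Church History",
    "Sozomenus. Church History": "Sozomen. Church History",
}


def normalize_label(label):
    """Return the uniform label, or the trimmed input if no revision applies."""
    cleaned = label.strip()
    return REVISIONS.get(cleaned, cleaned)
```

Labels that are already uniform pass through unchanged, so the function is safe to run over every record.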

After you run these please prepare a two column report of all names and titles.
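
A rough sketch of how such a report could be produced as CSV, assuming the labels are available as "Author. Title" strings. The `name_title_report` helper is hypothetical, and splitting on the first ". " is an assumption that would misplace the split for any name that itself contains a period followed by a space:

```python
import csv
import io


def name_title_report(labels):
    """Return CSV text with one row per distinct (name, title) pair."""
    rows = set()
    for label in labels:
        # Split on the first ". " only, so periods inside the title are kept.
        name, _, title = label.partition(". ")
        rows.add((name.strip(), title.strip()))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "title"])
    writer.writerows(sorted(rows))
    return buffer.getvalue()


report = name_title_report([
    "Photius. Library",
    "Pliny the Elder. Natural History",
    "Photius. Library",
])
```

Duplicate labels collapse into a single row, so any variants that survive normalization show up as adjacent rows in the sorted output.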

@davidamichelson
Collaborator

@wlpotter please make an issue to discuss "Letters vs. Letter" with @josephrife on Friday.

@wlpotter Also, one more to discuss: should we add geographic names for other authors when there are multiple authors with the same name (e.g., Procopius. Secret History -> Procopius of Caesarea. Secret History)?

@josephrife
Contributor

josephrife commented Mar 1, 2023 via email

@wlpotter
Collaborator Author

wlpotter commented Mar 1, 2023

I split this into #179 and #180. Once these are resolved, I will run the script to add the string-sort attribute.

@wlpotter wlpotter added this to the 1.0 Release milestone Mar 3, 2023
wlpotter added a commit that referenced this issue Jul 7, 2023
Automatically applying sort values where possible (8 works with errors to process manually)
wlpotter added a commit that referenced this issue Jul 7, 2023
Fixing remaining errors by hand
@wlpotter
Collaborator Author

wlpotter commented Jul 7, 2023

This should be fixed

@wlpotter wlpotter closed this as completed Jul 7, 2023