(new_OGL) new edition submission for tlg1386 and phi1672 #2694

Open
wants to merge 10 commits into
base: master


lcerrato
Collaborator

@lcerrato lcerrato commented Mar 1, 2023

first checkin #2693

@KodieBastian1
This is a better home for new work submission. We have to do a number of revisions here, especially to file names and folder structure, before we can get to tagging issues.
I am also interested in permissions/copyright verification for the files that are post-1922, particularly the 2011 Latin.

If you decide not to continue the submission, please let me know and close out the issue.

It may be best to simply focus on one core file and then use this as a basis for other work, rather than trying to fix various files at once, but we can see how to proceed. All files must be compliant before ingest.

As I noted in the Perseus repo, I am happy to walk you through the various steps required for compliance.

Keep in mind, the Scaife Viewer can be set up to point to other repositories elsewhere, if long term file management is a concern or if you anticipate a larger project. We encourage collaborators to set up their own repos in cases where numerous sets of files will be continually refined or added.

@lcerrato lcerrato changed the title (new_OGL) new edition submittion for tlg1386 and phi1672 (new_OGL) new edition submission for tlg1386 and phi1672 Mar 1, 2023
@lcerrato lcerrato assigned lcerrato and unassigned lcerrato Mar 1, 2023
@lcerrato lcerrato added the "help wanted", "new contribution" (use for adding new files), and "Special case" (requires special handling) labels Mar 1, 2023
@lcerrato
Collaborator Author

lcerrato commented Mar 1, 2023

First note is that we cannot have two different workgroups in one nested folder.

So, phi1672.phi002.perseus-lat2 has to have its own folder structure including associated cts metadata files.
There is an OGL/Latin repository, but not knowing the relationship here, I do not know if that is applicable.
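
As a rough illustration of the separate folder structures, here is a minimal sketch that derives the expected paths from a dotted identifier. It assumes the layout visible in the First1KGreek example file linked in this thread (`data/<textgroup>/<work>/<identifier>.xml`) and the CapiTainS-style metadata file name `__cts__.xml`; the function name `expected_paths` and the `data` root are hypothetical.

```python
from pathlib import PurePosixPath


def expected_paths(version_id: str, root: str = "data") -> dict:
    """Return the folder layout implied by a dotted version identifier.

    Assumption: one folder per textgroup, one sub-folder per work, and a
    __cts__.xml metadata file at each of those two levels.
    """
    textgroup, work, _version = version_id.split(".")
    textgroup_dir = PurePosixPath(root) / textgroup
    work_dir = textgroup_dir / work
    return {
        "textgroup_metadata": str(textgroup_dir / "__cts__.xml"),
        "work_metadata": str(work_dir / "__cts__.xml"),
        "text_file": str(work_dir / f"{version_id}.xml"),
    }


# The two identifiers as currently named in the submission.
for version_id in ("tlg1386.tlg001.perseus-lat2", "phi1672.phi002.perseus-lat2"):
    for role, path in expected_paths(version_id).items():
        print(f"{role:20} {path}")
```

Under these assumptions the two identifiers resolve to two distinct top-level folders (`data/tlg1386` and `data/phi1672`), each with its own metadata files, rather than one nested inside the other.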

Given that there is a different URN for this work, is it a separate edition or is it a translation of tlg1386.tlg001.perseus-lat2? I am guessing that the two separate URNs mean these are separate works.

If this is an exact translation, then we would generally use the main URN for a translation. If this is another edition, then it should go with the Latin.

In terms of associating the files with one another, we don't have current examples of this in the existing workflow but I believe there have been discussions of how to do this going forward. (@AlisonBabeu do you have an example of how we want to represent a relationship/association between two separate workgroups in the software table of contents? I don't know where we are with those discussions.)

The second question is which URNs we should assign/use. In general, "perseus-grcX" or "perseus-latX" indicates an existing Perseus file.

So we need new URNs minted as well. @AlisonBabeu

Thanks.

@lcerrato
Collaborator Author

lcerrato commented Mar 1, 2023

  • URN minting
  • CTS metadata completion
  • folder/file organization (including best repo selection)
  • header updates including credits
  • file association for SV TOC (?)
  • file compliance including current EpiDoc/CTS
  • update picklist
  • update catalog data

@lcerrato
Collaborator Author

lcerrato commented Mar 1, 2023

Some links:

CTS metadata guidelines:
https://github.com/PerseusDL/tei-conversion-tools/wiki/cts-textgroup-and-work-metadata-files

Note we need canonical author and title information. The former is in the Perseus Catalog, but not the latter.

Header guidelines: https://github.com/OpenGreekAndLatin/First1KGreek/wiki/Header-atttributions
Note these are always being refined, so the last checkin/PR of a new file is always a good place to start.

An example of an external contributor header:
https://github.com/OpenGreekAndLatin/First1KGreek/blob/master/data/tlg0094/tlg001/tlg0094.tlg001.1st1K-eng1.xml

Latest checkins:
#2687

@KodieBastian1

On copyright, the Kroll 1926 Greek should be out of copyright in the U.S. since it was published before 1928 and was not later published in the U.S. The 2011 Latin I'm less sure about. It's from DigilibLT, and the editions there are distributed under a Creative Commons license; however, their edition is based on the 2004 Teubner, and given the different copyright laws in Italy with regard to editions of classical texts, I do not know how that plays out in the U.S.

Both Latin editions are translations of the Greek text, and at least Valerius (PHI1672) should be fairly close (with some omissions). The medieval Latin translation has more variations and omissions. The English translation is based on the Syriac translation, and that too has a number of variations. But it's the only English translation that is out of copyright.

@lcerrato
Collaborator Author

lcerrato commented Mar 1, 2023

@KodieBastian1
In terms of license, the entire OGL repo is CC BY-SA, whereas the DigilibLT Project is CC BY-NC-SA, so we would probably be in violation of their license by offering this here.
We don't have different permissions for individual works at this stage.

@AlisonBabeu
Collaborator

Hi @lcerrato, our discussion of how to represent two related workgroups was left hanging when we talked about Seneca back in November over at the Scaife Viewer (scaife-viewer/scaife-viewer#571).
There Jake suggested the use of structured metadata, but that of course was for two textgroups for the same author, where we wanted both to list under the same canonical Latin name. In this case, although the work by Julius Valerius (phi1672) is a translation of Pseudo-Callisthenes (tlg1386), there are two separate identifiers and two separate canonical names, and as far as I know we have not yet tried to represent that in any meaningful way in the TOC. This is probably worth a new issue.

In terms of URNs, as we currently have no edition of either work in the catalog metadata and they would both be going directly into Scaife, I would tentatively suggest:

For Julius Valerius: phi1672.phi002.ogl-lat1 (the URN structure we are using for new Latin texts in Scaife)
For the Greek edition of Pseudo-Callisthenes: tlg1386.tlg001.1st1K-grc1 (since we have no edition in Perseus at all)
For the English translation: tlg1386.tlg001.1st1K-eng1
For the Latin "direct" translation: tlg1386.tlg001.1st1K-lat1

From the conversation above, however, it does not appear we will be including the Latin text at this point.
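
For anyone renaming files, here is a small sketch that splits the identifiers suggested above into their parts and flags any that still reuse the "perseus" prefix. The regular expression is only an assumption inferred from the identifiers quoted in this thread, not an official URN grammar, and the function names are hypothetical.

```python
import re

# Assumed shape: <textgroup>.<work>.<source>-<language><number>,
# e.g. tlg1386.tlg001.1st1K-grc1 or phi1672.phi002.ogl-lat1.
VERSION_ID = re.compile(
    r"^(?P<textgroup>[a-z]+\d+)\.(?P<work>[a-z]+\d+)\."
    r"(?P<source>[A-Za-z0-9]+)-(?P<lang>[a-z]{3})(?P<n>\d+)$"
)


def parse_version_id(identifier: str) -> dict:
    """Split a dotted version identifier into its named parts."""
    match = VERSION_ID.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier}")
    return match.groupdict()


def is_new_submission_id(identifier: str) -> bool:
    """True when the identifier does not reuse the 'perseus' source prefix."""
    return parse_version_id(identifier)["source"] != "perseus"


for identifier in (
    "phi1672.phi002.ogl-lat1",
    "tlg1386.tlg001.1st1K-grc1",
    "tlg1386.tlg001.1st1K-eng1",
    "tlg1386.tlg001.1st1K-lat1",
    "phi1672.phi002.perseus-lat2",  # name used in the original submission
):
    print(identifier, parse_version_id(identifier), is_new_submission_id(identifier))
```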

@lcerrato
Collaborator Author

lcerrato commented Mar 3, 2023

So the phi1672 should probably be removed for now due to the different license and the other files should adopt the recommended naming conventions above.

@lcerrato
Collaborator Author

@KodieBastian1

I have started on this file with several sets of comments. It should serve as an example of what needs to happen with the other files.

Here are my initial comments/recommendations.

  1. Header/Credits: Lots of missing or incomplete information on credits and availability. No license policy was found on Attalus.org. We would prefer there was explicit permission on reuse there. Please note comments/questions in header pertaining to other credits as well as suggested credits. As all credits appear in the Scaife Viewer, this is a publication prerequisite.
  2. Header/Other: I have corrected other header inconsistencies such as EpiDoc version, ref targets, refsDecl, change log, etc.
  3. Div tags: I have corrected the div tags for consistency. NB "xml:base" is missing and should be applied.
  4. EpiDoc errors: every <gap> tag requires a reason attribute, typically "lost". I have started to add these, but note that the rendering should also be captured. Any gap should sit inside the structure, not outside it (e.g. within the <p> and <div> tags, not apart from them).
  5. Quotation marks: There are quotation marks which are not standard. These should be tagged as <q rend="double"> at a minimum.
  6. P containers: There are dummy or empty <p> tags (see comment on 1.2.3 for example).
  7. Indentation: Distinguishing between an indented paragraph on the page versus a container paragraph is helpful for editing purposes but not required. It can be done in conjunction with cleanup of dummy paragraphs in point 6 above.
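
To make points 4 and 6 easier to check across the other files, here is a minimal sketch of an audit using only the Python standard library. The sample XML is invented for illustration (it is not taken from the submitted files), and the function names are hypothetical; it is a quick count, not a substitute for validating against the EpiDoc schema.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample: one good gap, one gap without @reason, one empty <p>,
# and one gap sitting outside any <p>.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="textpart" subtype="chapter" n="1">
  <p>Some text <gap reason="lost"/> more text.</p>
  <p>Other text <gap/> continues.</p>
  <p>   </p>
  <gap reason="lost"/>
</div>
</body></text></TEI>"""


def inside_p(node, parents) -> bool:
    """Walk up the tree and report whether any ancestor is a <p>."""
    while node in parents:
        node = parents[node]
        if node.tag == f"{TEI}p":
            return True
    return False


def audit(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    gaps = list(root.iter(f"{TEI}gap"))
    return {
        "gaps_without_reason": sum(1 for g in gaps if not g.get("reason")),
        "gaps_outside_p": sum(1 for g in gaps if not inside_p(g, parents)),
        "empty_p": sum(
            1 for p in root.iter(f"{TEI}p")
            if len(p) == 0 and not (p.text or "").strip()
        ),
    }


print(audit(SAMPLE))
# {'gaps_without_reason': 1, 'gaps_outside_p': 1, 'empty_p': 1}
```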

@lcerrato
Collaborator Author

Quotation marks are too inconsistent and irregular to automatically convert to <q> tags.
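
Since automatic conversion is off the table, one possible aid for manual tagging is a per-paragraph report of which quotation characters appear and whether they pair up. This is only a sketch; the character list and the function name `quote_report` are assumptions, and it does not attempt any conversion.

```python
from collections import Counter

# Characters treated as quotation marks in this sketch (an assumption;
# extend as needed for the files at hand).
QUOTE_CHARS = {
    '"': "straight double",
    "'": "straight single / apostrophe",
    "\u201c": "left double",
    "\u201d": "right double",
    "\u2018": "left single",
    "\u2019": "right single / apostrophe",
}


def quote_report(paragraph: str) -> dict:
    """Count quotation characters and flag obvious pairing problems."""
    counts = Counter(ch for ch in paragraph if ch in QUOTE_CHARS)
    return {
        "counts": {QUOTE_CHARS[ch]: n for ch, n in counts.items()},
        "balanced_curly_double": counts["\u201c"] == counts["\u201d"],
        "even_straight_double": counts['"'] % 2 == 0,
    }


# A curly opening quote closed by a straight one: flagged for review.
print(quote_report('He said, \u201cgo" and left.'))
```

Paragraphs where either flag is false would be candidates for review by hand before adding <q> tags.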

3 participants