(phi0474.phi027.perseus-lat1.xml) CTS and EpiDoc conversion #378

Merged
merged 8 commits into from Jan 4, 2021


PonteIneptique
Member

Hi there :)
I needed the In Pisonem for my PhD, so here it is, with a step-by-step account of the conversion.

Regex: `Revision (\d\.\d+)  ([\d+\-\/ \:]+)\s+(\w+)\n([^\n]+)\n+`
Replacement: `<change n="\1" when="\2" who="\3">\4</change>\n`
Regex: `when="(\d+)[-/](\d+)[-/](\d+) ([\d:]+) "`
Replacement: `when="\1-\2-\3T\4"`
Regex: `<milestone n="(\d+)" unit="chapter"/>`
Replacement: `</div><div n="\1" type="textpart" subtype="chapter">`

Manual correction of missing `</p>`s and `<p>`s
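
The three substitutions above can be replayed in Python roughly as follows. This is a minimal sketch, not the script actually used for this PR: the sample revision entry and the function names `convert_revisions` and `chapters_to_divs` are hypothetical, and only the patterns and replacements are taken from the steps listed above.

```python
import re

# Hypothetical legacy revision entry; the real header text is not shown in this thread.
SAMPLE_LOG = (
    "Revision 1.2  2009/10/21 17:05:12  student\n"
    "fixed bad tags\n"
    "\n"
)

# Patterns copied from the steps above.
REVISION = re.compile(r"Revision (\d\.\d+)  ([\d+\-\/ \:]+)\s+(\w+)\n([^\n]+)\n+")
WHEN = re.compile(r'when="(\d+)[-/](\d+)[-/](\d+) ([\d:]+) "')
MILESTONE = re.compile(r'<milestone n="(\d+)" unit="chapter"/>')


def convert_revisions(text):
    """Turn legacy revision entries into <change> elements with ISO-like dates."""
    # Step 1: wrap each entry in <change>; the date keeps a trailing space here.
    changes = REVISION.sub(r'<change n="\1" when="\2" who="\3">\4</change>\n', text)
    # Step 2: normalise the date separators and drop the trailing space.
    return WHEN.sub(r'when="\1-\2-\3T\4"', changes)


def chapters_to_divs(text):
    """Replace chapter milestones with a closing and an opening <div>."""
    # The result is not well-formed on its own: the first replacement closes a
    # <div> that may not exist and the last <div> is never closed, so the
    # surrounding markup still has to be checked by hand.
    return MILESTONE.sub(r'</div><div n="\1" type="textpart" subtype="chapter">', text)


if __name__ == "__main__":
    print(convert_revisions(SAMPLE_LOG))
    print(chapters_to_divs('<milestone n="2" unit="chapter"/>'))
```

The first pattern is greedy enough to capture the space after the time, which is presumably why the second pattern expects a space before the closing quote.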
@lcerrato changed the title from "Convert/phi474.027" to "(phi0474.phi027.perseus-lat1.xml) CTS and EpiDoc conversion" on Jan 4, 2021
bumped URN and regularized format of biblio info
@ghost

ghost commented Jan 4, 2021

Global

| Changed | Status |
| --- | --- |
| coverage | 0.12 |
| metadata_passing | 1 |
| metadata_total | 1 |
| nodes_count | 42 |
| texts_passing | 1 |

Words

| Changed | Status |
| --- | --- |
| lat | +10935 |

Units

| Changed | Status |
| --- | --- |
| ./data/phi0474/phi027/__cts__.xml | New |
| ./data/phi0474/phi027/__cts__.xml | Passing |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat1.xml | Passing |


@lcerrato
Contributor

lcerrato commented Jan 4, 2021

@PonteIneptique
Still looking at this. I have to bump it and bring the header in line with the most up-to-date conventions.
The containers we use for Cicero are the sections, not the chapters, so that needs a major revision.

@PonteIneptique
Member Author

> The containers we use for Cicero are the sections, not the chapters, so that needs a major revision.

Then I shall restart the work from the change part of things. Do you care about anything `rend`-related, such as `<p rend="cont">`, or should I just move from milestone section to div? I'll most probably do that tomorrow.

@lcerrato
Contributor

lcerrato commented Jan 4, 2021

@PonteIneptique
Hi, I've already made header changes and other formatting changes. We do some basic things like capture indents and so forth. So it would be better for me to finish this review for now. Otherwise we are just going to introduce conflicts I think.

@PonteIneptique
Member Author

Hmm, I would rather start from scratch than the other way around, if you don't mind. Putting the chapters back into milestones and the sections into divs will be horrible to do. Worst case, I can copy your header to the file when I am done on the new branch... But I would not merge this if you want to have sections :)

@lcerrato
Contributor

lcerrato commented Jan 4, 2021

@PonteIneptique
Actually, I've done it for other Cicero texts: my student did the same thing on an earlier one, so the changes were made then. This one is fairly short. My changes go deeper than the header, such as indentation for ease of reading. Let me finish (since I've done a lot already), and then if you spot anything you can revise.
The problem with any file bumping is that we cannot easily diff.

@ghost

ghost commented Jan 4, 2021

Global

| Changed | Status |
| --- | --- |
| coverage | 0.11 |
| metadata_passing | 1 |
| metadata_total | 1 |
| nodes_count | 42 |

Words

| Changed | Status |
| --- | --- |
| lat | +10935 |

Units

| Changed | Status |
| --- | --- |
| ./data/phi0474/phi027/__cts__.xml | New |
| ./data/phi0474/phi027/__cts__.xml | Passing |


@lcerrato
Contributor

lcerrato commented Jan 4, 2021

@PonteIneptique
There were numerous incorrect breaks here — that's something we look out for during conversion. If there is no hard break, sometimes that means the sections or chapters were inserted incorrectly, and that's quite common in these Latin texts. Cicero was not always professionally entered, so there are many errors to look for.
There was also a little beta code and some quotation marks. I'll double check the word count to be sure that I have it right.
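
A quick way to cross-check a word count locally is sketched below. It is only an approximation under stated assumptions: the function name `count_body_words` and the sample document are hypothetical, the file is assumed to use the TEI namespace, and the counting rule (whitespace-separated tokens inside `<body>`) is not necessarily the rule the build hook applies.

```python
import xml.etree.ElementTree as ET

# TEI namespace in ElementTree's "{uri}" notation (assumed for the files in this repo).
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical miniature document, used only to exercise the function.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader><title>In Pisonem</title></teiHeader>"
    "<text><body>"
    '<div type="textpart" subtype="section" n="1">'
    "<p>Iamne vides, belua,</p><p>iamne sentis</p>"
    "</div>"
    "</body></text></TEI>"
)


def count_body_words(xml_text):
    """Count whitespace-separated tokens in the TEI <body>, ignoring the header."""
    root = ET.fromstring(xml_text)
    body = root.find(f".//{TEI_NS}body")
    if body is None:
        return 0
    # Joining text nodes with a space keeps words in adjacent elements apart,
    # but a word split by inline markup would be counted twice.
    return len(" ".join(body.itertext()).split())


if __name__ == "__main__":
    print(count_body_words(SAMPLE))
```

Running the same function over the old and the new file gives two numbers to compare, which may help where a URN bump makes a line-by-line diff impractical.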

@ghost

ghost commented Jan 4, 2021

Global

| Changed | Status |
| --- | --- |
| coverage | 0.14 |
| metadata_passing | 1 |
| metadata_total | 1 |
| nodes_count | 160 |
| texts_passing | 1 |
| texts_total | 1 |

Words

| Changed | Status |
| --- | --- |
| lat | +21857 |

Units

| Changed | Status |
| --- | --- |
| ./data/phi0474/phi027/__cts__.xml | New |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat2.xml | New |
| ./data/phi0474/phi027/__cts__.xml | Passing |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat2.xml | Passing |


@lcerrato
Contributor

lcerrato commented Jan 4, 2021

Bumped URN and removed old files.

@ghost

ghost commented Jan 4, 2021

Global

| Changed | Status |
| --- | --- |
| coverage | 0.12 |
| metadata_passing | 1 |
| metadata_total | 1 |
| nodes_count | 118 |
| texts_passing | 1 |

Words

| Changed | Status |
| --- | --- |
| lat | +10922 |

Units

| Changed | Status |
| --- | --- |
| ./data/phi0474/phi027/__cts__.xml | New |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat1.xml | Deleted |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat2.xml | New |
| ./data/phi0474/phi027/__cts__.xml | Passing |
| ./data/phi0474/phi027/phi0474.phi027.perseus-lat2.xml | Passing |


@lcerrato merged commit 8423f3e into master on Jan 4, 2021
lcerrato added a commit that referenced this pull request Jan 14, 2021
 into typo-fixes

* 'master' of https://github.com/PerseusDL/canonical-latinLit: (56 commits)
  (phi0474_cicero) adding metadata files and updating existing data #380
  (phi0474_cicero) adding metadata files and updating existing data #380
  (phi0474_cicero) adding metadata files and fixing header #380
  (convert_phi474.027) removed old file #378
  (convert_phi474.027) revisions to #378
  Update __cts__.xml
  (phi0474.phi027) `__cts__.xml` and Epidoc compliancy
  (phi0474.phi027) Added CapiTainS refsDecl and fixed missing milestones for chapter
  (phi0474.phi027) Add chapter support
  (phi0474.phi027) revisionDesc update
  (phi0474.phi027) revisionDesc update
  (typos) adding new license to repo
  (phi0690.phi003.perseus-lat2) small typo at 9.730
  (typos_errors) errors at 62.49 and 62.50
  Update phi1017.phi005.perseus-lat2.xml
  corrections7
  Update phi0690.phi003.perseus-lat2.xml
  corrections6
  corrections5
  corrections3
  ...

2 participants