Numbering under <ls>...</ls> portions of Koeln MW99 data #95

Andhrabharati · 2021-01-05T11:19:42Z

The main division of a book is always marked in lowercase Roman numerals [ivxc] in the MW99 print, followed by a comma and then the further numbers in Indo-Arabic digits [0-9].

(a) correct "([ix]). ([0-9])" with "\1, \2" : 55 occurrences (proofing errors)

(b) correct "([^r])iv. ([0-9])" with "\1iv, \2" : 7 occurrences (proofing errors)
;; While checking the "v." cases in different combinations, I found that puṇyamaheśākhya has its ls marked wrongly: "<s1 slp1="divya">Divya</s1>'v." instead of "<ls>Divyâv.</ls>".

(c) Pāṇ. is the largest deviant: in the Koeln data these have been changed (either unwittingly or deliberately) to Indo-Arabic numerals [0-9], followed by a dash or erroneously marked/tagged! All of these can be found with "Pāṇ. ([0-9])" & "(.?)[0-9]-(.*)</l" and corrected appropriately : over 8000 occurrences.

(d) I have also seen that in many places i and 1 were mistaken for one another. In some fonts (0,1,2,6,8) look smaller, staying within the 'x-height' (Marcis should know this term, as he has worked on fonts as well!!), while (3,4,5,7,9) look bigger, extending below the baseline. (All of these could be found by checking for isolated i, i.e. " i ", places.)
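For illustration, the replacements in (a) and (b) and the check in (d) could be sketched in Python roughly as below. This is only a sketch under stated assumptions: the function names `fix_ls` and `suspicious_ls` are hypothetical, restricting the edits to text inside <ls>...</ls> is an assumption based on the issue title, and the dot is escaped here (the patterns quoted above leave it unescaped, so they match any character in that position). The `[^r]` guard in (b) presumably skips abbreviations such as "Hariv."; every hit would still need checking against the print.

```python
import re

# (a): Roman numeral ending in i or x, then ". " and a digit.
PAT_A = re.compile(r"([ix])\. ([0-9])")
# (b): "iv. " before a digit, unless the preceding letter is "r".
PAT_B = re.compile(r"([^r])iv\. ([0-9])")
# Assumption: only the text inside <ls>...</ls> is touched.
LS_SPAN = re.compile(r"<ls>(.*?)</ls>")
# (d): an "i" standing alone between whitespace (or at a span edge).
ISOLATED_I = re.compile(r"(?<!\S)i(?!\S)")


def fix_ls(text):
    """Apply replacements (a) and (b) inside every <ls> span."""
    def repl(match):
        inner = PAT_A.sub(r"\1, \2", match.group(1))
        inner = PAT_B.sub(r"\1iv, \2", inner)
        return "<ls>" + inner + "</ls>"
    return LS_SPAN.sub(repl, text)


def suspicious_ls(text):
    """Return the <ls> spans containing an isolated 'i', for manual review."""
    return [m.group(0) for m in LS_SPAN.finditer(text)
            if ISOLATED_I.search(m.group(1))]


if __name__ == "__main__":
    # Made-up sample strings, not taken from the Koeln data.
    print(fix_ls("<ls>MBh. xii. 5</ls> <ls>RV. iv. 3</ls> <ls>Hariv. 123</ls>"))
    print(suspicious_ls("<ls>RV. i 5</ls> <ls>RV. i, 5</ls>"))
```

The second function only lists candidates; whether an isolated "i" should really be a "1" (or the other way round) has to be decided from the print.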

I see that a part of this topic (limited to my point (c), Pāṇ.) was raised earlier by Marcis (@gasyoun) as issue #63, which was also closed, but I am not sure what the conclusion there was. The file I got a few days ago from Jim has all of these uncorrected.

@------------------
There certainly are different styles adopted in different books, and in my opinion we should not strive to make them uniform (or normalised) in the data portion.

Of course, we can (and should) have such normalization (done internally) for search purposes, as was proposed by Dhaval (@drdhaval2785) elsewhere.
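As a rough idea of what such an internal normalization might look like, here is a minimal Python sketch that leaves the data as printed and derives a separate search key from it. It is not the scheme Dhaval proposed, which is not described in this thread: the names `roman_to_int` and `search_key`, the key format, and the sample references are all hypothetical, and "l" is accepted as a Roman digit in addition to the [ivxc] mentioned above.

```python
import re

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}


def roman_to_int(s):
    """Convert a lowercase Roman numeral to an integer."""
    total = 0
    for ch, nxt in zip(s, s[1:] + " "):
        val = ROMAN[ch]
        # A smaller digit before a larger one is subtracted (iv = 4).
        total += -val if nxt in ROMAN and ROMAN[nxt] > val else val
    return total


def search_key(ls_text):
    """Build a hypothetical search key; the stored text is not changed."""
    parts = []
    for tok in re.split(r"[\s,.;-]+", ls_text.strip()):
        if not tok:
            continue
        if re.fullmatch(r"[ivxlc]+", tok):
            # Lowercase Roman token: treat it as a book/division number.
            parts.append(str(roman_to_int(tok)))
        else:
            parts.append(tok.lower())
    return ".".join(parts)


if __name__ == "__main__":
    # Illustrative strings only: a print-style and a dash-style reference.
    print(search_key("Pāṇ. iv, 1, 3"))
    print(search_key("Pāṇ. 4-1, 3"))
```

With a key of this kind, a print-style reference and a dash-style reference would be found by the same search, while each book keeps its own numbering style in the data.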

Andhrabharati · 2022-06-20T14:46:55Z

@funderburkjim

I have just seen that, except for point (c), which you resolved some time later, the other points [(a), (b) and (d)] still need attention.
Hope you will look at this and do the needful soon, so that the issue can be closed.

Andhrabharati · 2022-06-22T15:07:02Z

I've corrected the above points (a, b & d) in my file now.

As apparently no one else has the time (or interest?) to look at these observations, I am closing this issue now.

Andhrabharati mentioned this issue Aug 7, 2021

Links for Panini in MW sanskrit-lexicon/csl-websanlexicon#22

Closed

Andhrabharati mentioned this issue Sep 19, 2021

MW: Panini references sanskrit-lexicon/csl-orig#519

Closed

Andhrabharati closed this as completed Jun 22, 2022
