o_vs_O
Corrections in PWG, Part 1
#130
Comments
As mentioned by PW, the Tandya Brahmana 21,2,5 has आच्यादोह. According to MW, आच्या in आच्यादोह (with ā) comes from the Vedic ind. p. (aka gerund) of आच् (< आ-√अच्), instead of the regular ind. p. with ă, आच्य. Here the MW screenshot: Regarding the Vedic ind. p., v. Macdonell, A Vedic Grammar for Students: |
Factual error. |
Factual error, corrected by PWG itself in the section „Verbesserungen und Nachträge“ (vol. 5):
|
@funderburkjim is Devanagari OK with you? |
Good to see that the raw data is now put to analysis and corrections are
pouring in. Good work.
|
@drdhaval2785 I should have mentioned in the opening of this issue that its object is an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html, |
False positive. केशरिन् and केसरिन् are alternative forms of the same word. Cf. MW:
|
False positive MW: SCH: |
Factual error. Śaṅkara’s work is called उपदेशसाहस्री MW: |
Acceptable alternative forms. PWG: MW: |
Factual error. |
Factual error. The change should include ऐन्द्रावरुण and ऐन्द्रावारुण (both forms incorrectly mentioned in PWG with first ă instead of ā) PWG: MW: The Tandya Brahmana 8.8.6 has ऐन्द्रावारुण : |
Factual error. MW: |
@Shalu411 ever heard such Marathi word as in 17.? |
Regarding case 19 (क्रोलायन → क्रौलायन), now I think it is better to preserve the reading क्रोलायन. It is an attested form, mentioned as such by MW and PW. I think it is important to preserve as much as possible the correspondence between the digital and the printed version, which should be treated as a historical document, with its imperfections and all. |
@gasyoun Re |
Re |
As could be observed at #131 (Re 247. niHzAmam -> niHzamam), there is an OCR error under PWG निःषम (due to the poor quality of the printed text): दुःपमम् → दुःषमम् PWG:
|
On |
Re This was concluded to be a NO-CHANGE. While not disagreeing with the choice, the thought occurs that we should consider the two spellings to be variants. Currently there is no provision in the dictionaries to handle variant spellings. If there were a system for identifying 'equivalent' spellings, this would be such a case. |
Re: The form of the record (having the parenthetical (विचित्वारा) following the headword) may be a pattern used in PWG to identify alternate spellings. Everyone should realize that we are now applying to other dictionaries (PWG in this case) the kind of scrutiny that was applied to MW several years ago. One upshot of this scrutiny is that we see things where additional markup would help to expose (and therefore make usable) features of the dictionary. In particular, adding markup to identify alternate spellings, as here, would probably add to the utility of the dictionary. To give an idea of what I mean by 'additional markup', here's a seat-of-the-pants possibility for additional markup in this case (I'm adding markup to a record of pwg.txt):
Note that only markup (XML-tags) has been added - the text has not been changed. With such markup, programs could make use of the markup, for instance, to generate a list Just a thought. |
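To make the idea of "programs could make use of the markup" concrete, here is a minimal sketch in Python. The tag names `<hw>` and `<alt>` and the record layout are assumptions chosen for illustration; they are not the actual pwg.txt markup, and `HEADWORD` in the usage note is a placeholder.

```python
import re

# Hypothetical tags: <hw>...</hw> around the headword and <alt>...</alt>
# around an alternate spelling. These names are assumptions for illustration.
HW_TAG = re.compile(r"<hw>(.*?)</hw>")
ALT_TAG = re.compile(r"<alt>(.*?)</alt>")


def alternate_spellings(records):
    """Return (headword, alternate) pairs found in the tagged records."""
    pairs = []
    for record in records:
        hw = HW_TAG.search(record)
        if hw is None:
            continue  # record carries no headword tag, nothing to report
        for alt in ALT_TAG.findall(record):
            pairs.append((hw.group(1), alt))
    return pairs


def strip_markup(record):
    """Remove only the hypothetical tags, leaving the text itself untouched."""
    return re.sub(r"</?(?:hw|alt)>", "", record)
```

Calling `alternate_spellings` on a record such as `<hw>HEADWORD</hw> (<alt>विचित्वारा</alt>) ...` would list the parenthetical form as an alternate of the headword, while `strip_markup` recovers the unmarked text, which is the sense in which only markup is added and the text is not changed.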
Re. 9. उत्पलवती ― उत्पलावती Acc. to the Smith digitization of Mahabharata, utpalAvatim occurs at 06010033. |
A "just a thought" will remain just that if Jim is not around. But anyway, that's not top priority. |
re '® is a markup for plants in PW.' @gasyoun is right. This was markup that Thomas put in the original digitization. This feature is documented in the 'pw-meta.txt' file, which is part of the pwtxt.zip , one of the pw download items. Incidentally, in MW this would be marked as |
Regarding Am trying to think how to add markup to the digitization. Current idea is that the markup should be simple such as :
Such changes should also be documented in a file for each dictionary, the file being called something like pwg_printchange.txt . This is a more neutral-sounding name than 'corrections_factual', The displays can use the markup to provide a brief indication that the digitization intentionally differs from the print edition, and link to the printchange.txt file. The printchange file can have the free form of current corrections_factual, and in particular For cases where the change is to a headword, we could also take this into account via the hw2 file, The above sounds like it might have the virtues of
|
@zaaf2 Would you elaborate on your 'crowdsourcing' idea? |
How about |
Suggestion for crowdsourcing the work on @drdhaval2785's lists. In the next screen we would have something like this: |
@zaaf2 Such a well-presented suggestion! Would you transfer it to another issue, so that it may |
re From 81, you've also identified 'vah/vAh' as a similar phenomenon. There you use the term 'strong form', which may be a better way to think of it than 'nominative singular'. This is similar to the 'vat/vant' spelling variation. So, maybe these can be tailored as additional alternate form spelling rules for hwnorm1. |
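As a rough illustration of what such alternate-form spelling rules might look like, here is a small Python sketch over SLP1 spellings. It does not reproduce how hwnorm1 actually works; the rule table, the function names, and the choice of which form counts as canonical are all assumptions.

```python
# Hypothetical ending rules: (variant ending, canonical ending), in SLP1.
# Treating the weak form as canonical is an assumption made for this sketch.
ENDING_RULES = [
    ("vant", "vat"),  # strong form -vant vs. stem -vat
    ("vAh", "vah"),   # strong form -vAh vs. stem -vah
]


def normalize(slp1):
    """Map a headword spelling to its canonical form under the rules above."""
    for variant, canonical in ENDING_RULES:
        if slp1.endswith(variant):
            return slp1[: -len(variant)] + canonical
    return slp1


def same_headword(a, b):
    """True when two spellings normalize to the same form."""
    return normalize(a) == normalize(b)
```

With these rules, `same_headword("Bagavant", "Bagavat")` holds, so a pair like this would be treated as one headword with two spellings rather than flagged as a possible error.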
@zaaf2 re |
@gasyoun |
re @zaaf2 Agree? |
@zaaf2 Maybe |
re देविका f. is the name of the river. दाविक is the adjective, “(water) coming from the river देविका”. दाविकाकूल itself is also an adjective, “(rice etc.) coming from the banks (कूल) of the देविका”. I was not sure about the change because I thought the first member of the compound was the adj. दाविक, and I could not explain the second ā in दाविकाकूल. Now I see my doubt is unfounded. As one can see in the commentary to Pāṇini’s rule, the adj. दाविकाकूल comes directly from the Tatpuruṣa compound देविकाकूल n. (which may be translated as “bank of the देविका river”). When देविकाकूल as a whole is transformed into the adjective by an (absorbed) -a suffix (v. Whitney 1208.h), then the special rule in question takes effect, and दे- is changed to दा-, the rest of the word remaining unchanged. |
@gasyoun I am not aware I used the expression |
@funderburkjim re |
Re: 71. आज्ञाप्ति ― आज्ञप्ति No change. OCR error. I think this should be changed as an "OCR error" (typo), since MW has AjYapti but not AjYApti. |
Corrections now installed. |
This issue is about an analysis of the data contained in the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html,
generated by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG.
OCR error.
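For readers unfamiliar with the method, the following Python sketch shows one way candidate pairs of this kind could be found. It is not the code behind o_vs_O; it only illustrates the stated criterion (first word in one dictionary, second word in more than one) under the assumption that headwords are in SLP1 and that a pair counts as similar when the two spellings differ only in a/A, i/I, u/U, f/F, e/E or o/O.

```python
from collections import defaultdict
from itertools import permutations

# SLP1 vowel pairs that this sketch treats as equivalent when grouping.
# Only these six capitals are mapped, so aspirates such as K or G stay distinct.
LONG_TO_SHORT = str.maketrans("AIUFEO", "aiufeo")


def collapse(slp1):
    """Reduce a headword to a key that ignores the vowel distinctions above."""
    return slp1.translate(LONG_TO_SHORT)


def candidates(headwords):
    """headwords: dict mapping an SLP1 headword to the set of dictionaries
    containing it. Returns sorted (first, second) pairs where both words share
    a collapsed key, first occurs in exactly one dictionary, and second occurs
    in more than one."""
    groups = defaultdict(list)
    for word in headwords:
        groups[collapse(word)].append(word)
    found = []
    for words in groups.values():
        for first, second in permutations(sorted(words), 2):
            if len(headwords[first]) == 1 and len(headwords[second]) > 1:
                found.append((first, second))
    return sorted(found)
```

Each pair returned is only a candidate. As the comments in this issue show, a pair may turn out to be a factual error, an OCR error, an acceptable alternative form, or a false positive, and that decision still needs a human reader.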