o vs O Corrections in PWG, Part 2 #137

Closed
zaaf2 opened this issue Nov 3, 2015 · 22 comments

@zaaf2

zaaf2 commented Nov 3, 2015

#130 continued.

Checking the divergences detected by the o_vs_O method of highest probability (one dictionary in first word and more dictionaries in second word), as applied to PWG, at the file http://drdhaval2785.github.io/o_vs_O/output1/PWG.html.
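A minimal sketch of how such divergences might be detected, assuming the o_vs_O method pairs headwords whose SLP1 transliterations differ only in letter case (as in PWG maDukari vs MW maDukArI). The function name and the sample headwords are illustrative and are not taken from the SanskritSpellCheck code.

```python
from collections import defaultdict


def case_only_pairs(headwords):
    """Return pairs of SLP1 headwords that differ only in letter case."""
    # Group every headword under its lowercased spelling.
    groups = defaultdict(set)
    for word in headwords:
        groups[word.lower()].add(word)

    pairs = []
    for variants in groups.values():
        ordered = sorted(variants)
        # Every two distinct spellings in one group form a candidate pair.
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.append((first, second))
    return sorted(pairs)


# Hypothetical sample: two pairs discussed in this thread plus one unrelated word.
SAMPLE = ["maDukari", "maDukArI", "SarikA", "SaRikA", "rAma"]
candidates = case_only_pairs(SAMPLE)
```

Each candidate pair still needs the manual check against the printed text that the cases below illustrate, since many pairs turn out to be different words.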

@zaaf2
Author

zaaf2 commented Nov 3, 2015

103. रराठ्य → रराट्य
OCR error.
image

MW:
रराट [p= 868] : n. = ललाट, the forehead, brow
रराटी : f. id. BhP. [L=175290]

104. राजशूक ― राजशुक MW,PW,SHS,WIL,YAT
Typographical error in the PWG printed edition. Only शुक gives the appropriate sense.
PW *राजशुक [L=93533] [p= 5181-1] m. eine Papageienart mit rothen Streifen an Hals und Flügeln Râǵan.19,113.
PWG: शुक [L=100072] [p= 7-0235] … — a) Papagei
MW:
राज-शुक [p= 874] : m. a kind of parrot (with red stripes on the neck and wings) L. [L=176675]
शुक [p= 1079] : m. (prob. fr. √1. शुच्, and orig. " the bright one ") a parrot RV. &c [L=218749]
शूक [p= 1085] : mn. (g. अर्धर्चा*दि ; derivation doubtful) the awn of grain; mn. a bristle, spicule, spike (esp. the bristle or sharp hair of insects &c ) W. [L=220137]
शूका : f. scruple, doubt L. (…) f. the sting of an insect (cf. above ) , anything that stings or causes pain Suṡr. Car. [L=220145]; (…)

105. वाङ्घ् ― वङ्घ् (→वाङ्क्ष्)
OCR error, different words, change to वाङ्क्ष्
image

PWG: वाङ्घ् [L=89427] [p= 6-0886] वाङ्क्षति (काङ्क्षायाम्) Dhâtup. 17, 17. — Vgl. काङ्क्ष् und वाञ्छ्.
MW:
वङ्घ् cl.1 A1. वङ्घते, to go; to set out; to move swiftly; to blame or censure Dhātup. iv, 36.
वाङ्क्ष् [p= 936] : (connected with √ वाञ्च्क् cf. काङ्क्ष्) cl.1 P. वाङ्क्षति, to wish, desire, long for Dhātup. xvii, 17. [L=189801]

106. वार्वरक ― वर्वरक
No change. Different, connected words. A variant of बार्बरक
PWG: वार्वरक [L=90415] [p= 6-0951], (बार्बरक) adj. von वर्वर (बर्बर) gaṇa धूमादि zu P. 4, 2, 127.
MW:
वर्वरक [p= 926] : m. (more correct बर्ब्°) N. of a man Mudr. [L=187836]
बार्बरक [p= 728] : mfn. (fr. बर्बर) g. धूमा*दि. [L=144434]

@gasyoun
Member

gasyoun commented Nov 3, 2015

"No change" words will be included in a No change list. I guess it's time we started collecting them in a new TXT file. What do you say, @drdhaval2785?

@drdhaval2785
Contributor

drdhaval2785 commented Nov 4, 2015 via email

@zaaf2
Author

zaaf2 commented Nov 4, 2015

107. शक्राग्नी ― शक्राग्नि
No change. The form with -ī is the correct dual of m. words in -i. The form with -ĭ is mentioned “im comp.”
image
शक्राग्नि [p= 1045] : m. du. इन्द्र and अग्नि (lords of the नक्षत्र विशाखा) VarBṛS. [L=211134]

108. शरिका ― शणिका
No change. Different words.
PWG: शरिका [L=98150] [p= 7-0097] f. N. eines Palastes Schiefner, Lebensb. 305 (75).
MW: शणिका [p= 1048] : f. Crotolaria of various species L. [L=211796]

109. शुनकचञ्चूका ― शुनकचञ्चुका SKD,VCP
Insufficient elements to reach a conclusion.
It is the name of a plant. Both PWG and SKD are referring to the same source. According to PW, it is a manuscript:
image
चञ्चु f. and चञ्चू f. are variants in the sense “a beak, bill” (MW).
PWG शुनकचञ्चूका [L=100357] [p= 7-0258] f. ein best. Strauch, = क्षुद्रचञ्चु Râǵan. im Çkdr.
SKD शुनकचञ्चुका, स्त्री, (शुनकस्य चञ्चरिव । इवार्थे कन् ।) क्षुद्रचञ्चुक्षुपः । इति राजनिर्घण्टः ॥
MW:

  • क्षुद्र-चञ्चु [p= 330] : f. " having small points ", N. of a plant L. [L=59926]
  • चञ्चु (…) m. the castor-oil plant L.; (…) m. the plant गो-नाडीक (or नाडीच) L.; (…) m. the plant क्षुद्रचञ्चु; f. a beak, bill VarBṛS. Pañcat. Hit. [L=70802].
  • चञ्चू [p= 382] : f. a beak, bill Vop. iv, 31 [L=70822]
  • चञ्चूक [p= 382] : m. = °ञ्चु-पत्त्र Bhpr. [L=70825] ; m. pl. N. of a people (south-west of मध्य-देश) VarBṛS. xiv, 18. [L=70825.1]

110. षडस्र ― षडश्र
No change. A wrong reading, according to MW.
image
MW: षड्-अश्र [p= 1109] : (Cat. ) mfn. hexagonal (w.r. -अस्र &c ) [L=225117]

111. संधनाजित् → संधनजित्
Typographical error in the PWG printed text.
There is no धना; आजित् is unexemplified. √जि is frequently used with धन.
image
MW:
सं-धन-जित् [p= 1144] : mfn. (= धनसं-जित्) winning booty together, accumulating booty by conquest AV. [L=231582]

  • जित् a [p= 420] : mfn. ifc. (Pāṇ. 3-2, 61) winning, acquiring cf. गो- and स्वर्-जित्, स्वर्ग- &c [L=79195]; (…)
  • जि 1 [p= 420] : (…) to win or acquire (by conquest or in gambling) (…)
  • धन [p= 508] : n. the prize of a contest or the contest itself (lit. a running match, race, or the thing raced for ; हितं धा*नम्, a proposed prize or contest ; धनं- √जि, to win the prize or the fight).
  • धन-जित् [p= 508] : mfn. winning a prize or booty, victorious, wealth-acquiring RV. AV. VS. [L=99285].
  • धनंजय (…) N. of अर्जुन MBh. Hariv.
  • आ- √जि 2 [p= 133] : √ जि (p. -जयत् ; impf. 3. du. आ*जयताम्) to conquer, win RV. ii, 27, 15 AitBr. TāṇḍyaBr. : Desid. p. -जिगीषमाण, trying or desiring to win RV. i, 163, 7. [L=23128]
    आजि-तुर् [p= 133] : mfn. victorious in battles RV. viii, 53, 6. [L=23119]

112. समुद्ररसन ― समुद्ररशन
No change. Variant form.
PWG: समुद्ररसन [L=105923] [p= 7-0729] adj. (f. आ) meerumgürtet: die Erde Ragh. 15, 83. Varâh. Bṛh. S. 43, 32. f. आ die Erde H. 938, Schol.
MW: सम्-उद्र-रशन [p= 1167] : mf(आ)n. (also written -रस्°) sea-girdled (said of the earth) Hariv. Ragh. VarBṛS. [L=235294]

@gasyoun
Member

gasyoun commented Nov 4, 2015

I'm afraid documenting in such detail takes more time than needed. I guess even half of it would be twice as much as needed. @zaaf2, you are too good!

@zaaf2
Author

zaaf2 commented Nov 8, 2015

113. सांकुची ― सांकुचि
No change. PWG gives the f.
PWG: सांकुची [L=108029] [p= 7-0896] f. = संकोचमत्स्य Çabdar. im Çkdr. शां° und शंकोच° gedr.
MW: सांकुचि [p= 1198] : mf(°ची). (perhaps fr. सं-कुच, but cf. शङ्कुचि) a partic. aquatic animal Bhpr. [L=241201]

114. सारमीति ― सारमिति
Factual error. The ī is not justified (but it is better to get a second opinion).

image

MW: सार-मिति [p= 1208] : m. " measure of all truth ", N. of the वेद L. [L=242908]
In MW मिति comes under √मि. ― °ति is a primary suffix forming f. action nouns.
मिति 2 [p= 816] : f. (for 1. » [p= 815,3]) measuring, measure, weight. VarBṛS. ṠārṅgS. [L=164144]; f. accurate knowledge, evidence MāṇḍUp. [L=164145]
मि 1 [p= 815] : (cf. √3. मा and मी) cl.5. P. A1. (Dhātup. xxvii, 4) मिनोति, मिनुते (...), to fix or fasten in the earth, set up, found, build, construct RV. AV. ṠBr. ṠrS. ; to mete out, measure VarBṛS.; (…)
PW: मि — 1) in den Boden einsenken , befestigen ; gründen , aufrichten ; errichten , bauen ; construiren Çulbas.1,41. — 2) messen. — 3) ermessen , erkennen , wahrnehmen. (...)
[Cf. मी 1 (…) to lessen, diminish, destroy (A. and Pass. to perish, disappear, die) RV. AV. Br. Up. BhP. ― मी 2 [p= 818] : » मन्यु-मी. ― मी 3 [p= 818] : cl.1.10. P. मयति or माययति, to go, move Dhātup. xxxiv, 18 ; to understand]
MW: सार 2 (…) mn. the substance or essence or marrow or cream or heart or essential part of anything, best part, quintessence; (…) mn. the real meaning, main point MW.

According to PWG and PW, the meaning attributed to this word is the result of a misunderstanding, explained in Hemachandra’s Abhidhana Chintamani (1964 ed.), 248, but I could not find the passage. If I am not mistaken, the etymology quoted by PWG, सारं यथार्थं मीयते ज्ञायते ऽनेन, could be translated as: “The essence [सारम् i.e. the real thing] is truly measured [and] known through this [Veda]”.

@zaaf2
Author

zaaf2 commented Nov 8, 2015

115. सिद्धान्तलक्षणाक्रोड ― सिद्धान्तलक्षणक्रोड
Insufficient elements to reach a conclusion. (Cf. #130 case 46. लक्षणवादरहस्य ― लक्षणावादरहस्य)
The repetition of this same variation confirms that those cases should not be changed, as they may reflect different conceptions of the lexicographers.
MW: सिद्धा*न्त-लक्षण-क्रोड [p= 1216] : m. N. of wk. [L=244606]
MW:
लक्षण [p= 892] : mfn. indicating, expressing indirectly Vedântas. [L=180381] (…); n. (ifc. f(आ).) a mark, sign, symbol, token, characteristic, attribute, quality (…); n. a lucky mark, favourable sign (…); n. accurate description, definition, illustration (…)
लक्षणा b [p= 892] : f. aiming at, aim, object, view Hariv. [L=180434]; f. indication, elliptical expression, use of a word for another word with a cognate meaning (as of " head " for " intellect "), indirect or figurative sense of a word (…)

@zaaf2
Author

zaaf2 commented Nov 9, 2015

116. अजिनपत्री ← अजीनपत्री
Print error in SKD.

There is no अजीन.
MW:
अजिन n. the hairy skin of an antelope, especially a black antelope (which serves the religious student for a couch seat, covering &c )
पत्त्र a [p= 581] : n. (and m. Ṡāk. ; ifc. f(आ and ई). )(sometimes spelt पत्र) the wing of a bird, pinion, feather VS. ṠBr. &c [L=114848]

The word is out of Devanagari order in SKD:
image

@zaaf2
Author

zaaf2 commented Nov 9, 2015

@funderburkjim Since cases 116 to 182 are about “One dictionary in first word and one dictionary in second word”, one has to decide whether they should be worked on here or directly from the lists in which the other dictionaries come first. For example, case 142 (PWG मधुकरि x MW मधुकारी): will @gasyoun deal with it as he works through the MW list?

@drdhaval2785
Contributor

Treat the first dictionary only.
Ignore the second dictionary. It would be the first in its own place.
This is how the HTMLs were generated.

e.g. PWG maDukari - MW maDukArI
Concentrate on PWG only.

Those who intend to cover MW e.g. @gasyoun would handle it in MW, where it would be the first dictionary.

NOTE - To date we are concentrating on the 'Highest probability' ones only. If @zaaf2 ventures into the second slot of 'Medium probability', it would be great. Sooner or later, we will have to deal with them. But in my opinion, it should wait till we exhaust 'Highest probability' of all dictionaries. Then we take up the 'Medium' of all.
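The "first dictionary only" rule can be sketched as follows. This is an illustration under assumptions, not the actual generator of the HTML files: the tuple layout `(word1, dicts1, word2, dicts2)` and the function name `rows_for` are hypothetical.

```python
def rows_for(dictionary, pairs):
    """List (own_word, other_word) rows for one dictionary's report.

    A pair appears with word1 first in the report of every dictionary
    that has word1, and with word2 first in the report of every
    dictionary that has word2.
    """
    rows = []
    for word1, dicts1, word2, dicts2 in pairs:
        if dictionary in dicts1:
            rows.append((word1, word2))
        if dictionary in dicts2:
            rows.append((word2, word1))
    return rows


# The example from this comment: PWG maDukari vs MW maDukArI.
PAIRS = [("maDukari", {"PWG"}, "maDukArI", {"MW"})]
pwg_rows = rows_for("PWG", PAIRS)  # the reviewer of PWG judges maDukari
mw_rows = rows_for("MW", PAIRS)    # the reviewer of MW judges maDukArI
```

Under this layout the same pair is seen twice, once per dictionary, which is the duplication of effort discussed in the following comments.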

@gasyoun
Member

gasyoun commented Nov 9, 2015

@drdhaval2785 what you say does not make sense. It made sense a year ago, when there was only one person working, and even then only partly. I wanted to comment in detail, but let me do it later; better late than never. If @zaaf2 finds an MW error while working on PWG and ignores it, that does not make sense. His whole way of working is to document everything, and reinventing the wheel later does not sound great. When working quickly, 5 times quicker than I and 10 times quicker than @zaaf2 (because he is too good!), I agree: ignore everything and run. But we are slower, much slower. So benefiting each other does not hurt at all.

Sure, a system is needed. After stating "Print error in SKD", one should open a new thread dedicated to SKD; 38 threads in total. Whoever comes first starts first. It takes time to jump from thread to thread, but even more to reinvent and re-research.

I disagree that Highest of all dictionaries should come before Medium, first of all because I disagree with the arrangement of words. The general logic was inspired by me and Dhaval implemented it very well, but I'll have to add the sorting part, because without it hours are spent on shorter words that should really be treated last. SCH has thousands of Buddhist Sanskrit words; BHS, which was printed after it, has even more, so we need a linguist who reads Prakrit to finalize that list, e.g.
731 kolika kOlIka कोलिक कौलीक BHS,IEG,PE,SCH MW,MW72,PW,PWG,VE
The NO CHANGE list is wanted more than any other enhancement, I think.

Take Lowest probability (More than one dictionary in first word and it has dictionary under consideration) in http://drdhaval2785.github.io/o_vs_O/output1/SCH.html, which has
657 aBimardin aBimarDin अभिमर्दिन् अभिमर्धिन् BEN,GST,MW,PW,PWG,SCH MW72
The fact that aBimardin occurs in more than one dictionary does not make an error any less possible. Sure, it will turn up again if we open the MW72 file. But when will that be, if even MW, the most popular Sanskrit dictionary worldwide, has 5000 headwords with possible errors?

My understanding is that major dictionaries should come first: MW, PW, PWG, AP (without word endings, "cleaned"). High priority, Low priority and Medium come after, in this order (489 uttarasArasvAdinI uttarasArAsvAdinI उत्तरसारस्वादिनी उत्तरसारास्वादिनी SCH ACC shows that if we start with longer words the "easy" cases will be cleaned in months; the rest will take years anyway). Then the smaller ones with a specific language, like Buddhist Sanskrit in BHS and SCH. No idea how to clean PD; it must contain a few hundred mistakes of its own.

drdhaval2785 added a commit to drdhaval2785/SanskritSpellCheck that referenced this issue Nov 10, 2015
drdhaval2785 added a commit to drdhaval2785/drdhaval2785.github.io that referenced this issue Nov 10, 2015
@drdhaval2785
Contributor

@gasyoun
If you intend to take up the matter in one go,
the current format of Dictionarywise errors would not be productive.
http://drdhaval2785.github.io/o_vs_O/output3/composite1.html (highest)
http://drdhaval2785.github.io/o_vs_O/output3/composite2.html (medium)
http://drdhaval2785.github.io/o_vs_O/output3/composite3.html (lowest)

They are composite files.
Linking and decoration would take some time.

This would ensure that we are going one by one, ignoring the dictionary to be corrected.

@drdhaval2785
Contributor

This would also reduce the burden, and make sure we are not handling the same word again and again in different dictionaries.

@gasyoun
Member

gasyoun commented Nov 10, 2015

Good work. 3800 high, 1100 medium, 14000 low - low is far too big. Give me longer words first. No need to have them ordered alphabetically.
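A minimal sketch of the requested ordering, assuming each row of a composite list is a whitespace-separated line whose second field is the first word in SLP1 (as in the rows quoted earlier in this thread). The function name is illustrative.

```python
def longest_first(rows):
    """Sort report rows so that the longest first word comes first."""

    def first_word_length(row):
        # Field 0 is the serial number, field 1 the first SLP1 word.
        return len(row.split()[1])

    # sorted() is stable, so rows of equal length keep their input order.
    return sorted(rows, key=first_word_length, reverse=True)


ROWS = [
    "731 kolika kOlIka कोलिक कौलीक BHS,IEG,PE,SCH MW,MW72,PW,PWG,VE",
    "657 aBimardin aBimarDin अभिमर्दिन् अभिमर्धिन् BEN,GST,MW,PW,PWG,SCH MW72",
    "489 uttarasArasvAdinI uttarasArAsvAdinI उत्तरसारस्वादिनी उत्तरसारास्वादिनी SCH ACC",
]
ordered = longest_first(ROWS)
```

Measuring length on the SLP1 field rather than on the Devanagari field counts one character per sound, so conjuncts and vowel signs do not distort the ranking.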

drdhaval2785 added a commit to drdhaval2785/SanskritSpellCheck that referenced this issue Nov 11, 2015
@gasyoun
Member

gasyoun commented Nov 11, 2015

5 73888 citraSAkApUpaBakzyavikArakriyA <-citraSAkapUpaBakzyavikArakriyA
PW, PWG print error.

In PW print text चित्रशाकापूपभक्षविकारक्रिया is used instead of the wrong चित्रशाकपूपभक्ष्यविकारक्रिया.

citrasakapupabakzyavikarakriya

citrasakapupabakzyavikarakriya2

@zaaf2
Author

zaaf2 commented Nov 12, 2015

Re. 5 73888 citraSAkApUpaBakzyavikArakriyA ― citraSAkapUpaBakzyavikArakriyA

No change. Alternative forms.

PWG & PW give an alternative form, as is clear from the entry’s text.

image

PWG:
image

PW:
image

MW:

  • भक्ष [p= 742] : m. drinking or eating, drink or (in later language) food RV. &c &c (often ifc., with f(आ)., having anything for food or beverage, eating, drinking, living upon) [L=147535]
  • (H3) भक्ष्य [p= 742] : mfn. to be eaten, eatable, fit for food (…); n. anything eaten, food (…)
  • शाक 2 [p= 1061] : n. (…) a potherb, vegetable, greens GṛṠrS. Mn. MBh. &c [L=214666]
  • पूप [p= 641] : m. a cake, a sort of bread MBh. R. &c (cf. अपूप).
  • अपूप [p= 56] : m. (cf. पूप), cake of flour, meal, &c RV. &c

@zaaf2
Author

zaaf2 commented Nov 12, 2015

I consider my mission accomplished. Now it's up to @funderburkjim to review and install changes.

@gasyoun
Member

gasyoun commented Nov 12, 2015

Let there be Jim.

@funderburkjim
Contributor

Re '107. शक्राग्नी ― शक्राग्नि'

This is interesting when one contemplates using the PWG lexicon as a source for generating declension tables. For such a purpose, the given headword form 'SakrAgnI' is NOT the stem; the stem is rather the short-i form 'SakrAgni'.

This phenomenon of dual-form as the citation form was encountered in MW. There, we (Peter and Chandrashekar mostly) developed a list of such forms, and used that list in deriving the 'lexical grammar' for nominal forms. See DualPlural.txt.
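As a rough sketch of how such a list could be used when deriving stems: the code below assumes a plain set of headwords known to be cited in the dual, and handles only masculine nominative duals in -I, -U and -O (SLP1 for ī, ū, au). The set contents, the function name, and the ending table are assumptions for illustration; the actual format of DualPlural.txt is not shown in this thread.

```python
# Dual ending (SLP1) -> stem-final vowel, masculine stems only (simplified).
DUAL_TO_STEM = {"I": "i", "U": "u", "O": "a"}


def stem_for(headword, dual_citation_forms):
    """Return the declension stem for a headword.

    Headwords not listed as dual citation forms are returned unchanged,
    so an ordinary feminine in -I is left alone.
    """
    if headword in dual_citation_forms and headword[-1] in DUAL_TO_STEM:
        return headword[:-1] + DUAL_TO_STEM[headword[-1]]
    return headword


# Hypothetical list with the one form discussed here.
DUAL_FORMS = {"SakrAgnI"}
stem = stem_for("SakrAgnI", DUAL_FORMS)
```

Driving the conversion from an explicit list rather than from the ending alone avoids shortening genuine stems in -I.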

@funderburkjim
Contributor

@zaaf2 re '116. अजिनपत्री ← अजीनपत्री Print error in SKD.'

There IS an ajIna in PD:
अजीन¦ (a-jīna) adj. not subject to old age or decay अजिनमिति चर्म
अजीनमिति ज्यानिहीनम् TattvBi. 21. 8; Bhām. 355. 9 (on ii. 1. 14)

This was discovered from hwnorm1.

Thus, I would demur, and be disinclined to change SKD.

@funderburkjim
Contributor

Here are the harvested changes:

103. रराठ्य -> रराट्य : typo
104. राजशूक -> राजशुक : print error
105. वाङ्घ् -> वाङ्क्ष् : typo
111. संधनाजित् -> संधनजित् : print error
114. सारमीति -> सारमिति : print error

@zaaf2 might check the list and be sure I've not made mistakes.

Will now go about installation.
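For checking a harvested list like the one above mechanically, a small parser can turn each line into a record. This is only a sketch: the record keys and function name are illustrative, and it is not the script actually used to install the corrections.

```python
import re

# Matches lines such as "103. old -> new : reason".
CHANGE_LINE = re.compile(r"^(\d+)\.\s+(\S+)\s+->\s+(\S+)\s+:\s+(.+)$")

HARVESTED = """\
103. रराठ्य -> रराट्य : typo
104. राजशूक -> राजशुक : print error
105. वाङ्घ् -> वाङ्क्ष् : typo
111. संधनाजित् -> संधनजित् : print error
114. सारमीति -> सारमिति : print error
"""


def parse_changes(text):
    """Parse harvested change lines into a list of dicts."""
    changes = []
    for line in text.splitlines():
        match = CHANGE_LINE.match(line.strip())
        if match:
            case, old, new, reason = match.groups()
            changes.append(
                {"case": int(case), "old": old, "new": new, "reason": reason}
            )
    return changes


changes = parse_changes(HARVESTED)
```

Lines that do not match the pattern are skipped silently, so a count of parsed records against the number of cases marked for change is a quick sanity check.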

@funderburkjim
Contributor

Corrections installed. Can revise if @zaaf2 finds mistakes in processing.
Also closing issue.
