o vs O Corrections in MW #127

Closed
gasyoun opened this issue Oct 3, 2015 · 36 comments
@gasyoun
Member

gasyoun commented Oct 3, 2015

#45 continued with a one year break. http://drdhaval2785.github.io/o_vs_O/output1/MW.html Highest probability (One dictionary in first word and more dictionaries in second word) first.

  1. daRqAjinika -> dARqAjinika The words are grammatically related (base form daRq), and the vṛddhi supports that they are related; judging by the meaning, the vṛddhi is lacking in the original (printed) MW form, so this is a factual print error.

darqajinika

  2. dIpaKori -> dIpaKorI PWG quotes SKD, SKD links to dIpakUpI, where 2 dīpakhorī again is given, so with KorI and not Kori as in MW. MW quotes Lexicographers, that means Indian authors, and has quoted wrongly. The printed MW has it right, it's an OCR error.

dipakori

@zaaf2

zaaf2 commented Oct 3, 2015

@gasyoun a factual print error.

Do you mean an OCR error? दाण्डाजिनिक in the printed edition seems correct to me. The word comes from दण्डाजिन -- with a short first a -- (“n. sg. staff and dress of skin as mere outward signs of devotion, hypocrisy, deceit Pāṇ. 5-2, 76”), which is a Dvandva compound (no vṛddhi here), from दण्ड m. staff + अजिन “n. the hairy skin of an antelope, especially a black antelope (which serves the religious student for a couch seat, covering &c”). दाण्डाजिनिक is formed by secondary derivation with the suffix –ika, which requires the vṛddhi-strengthening of the initial syllable. Cf. Whitney’s Sanskrit Grammar (1204 and 1222 j):

image

(...)
image

@gasyoun
Member Author

gasyoun commented Oct 3, 2015

@zaaf2 thanks for the detailed answer with quoting, love the style. Do you know of https://en.wikisource.org/wiki/Page%3ASanskrit_Grammar_by_Whitney_p1.djvu/483 at https://en.wikisource.org/wiki/Sanskrit_Grammar/Chapter_XVII#418?
Yes, OCR, now I see it - I was looking and did not see it before. At http://drdhaval2785.github.io/o_vs_O/output1/MW.html you can see line 50 daRqAjinika dARqAjinika दण्डाजिनिक दाण्डाजिनिक MW AP,PW,PWG,SCH,SHS,VCP,WIL,YAT.

@gasyoun
Member Author

gasyoun commented Oct 3, 2015

3. dfptabAlaki -> dfptabAlAki 1st argument: MW was published after PWG and many words were "taken" from it, in many cases with the same mistakes as in the original or with new ones. 2nd argument: Dṛptabālāki is the more popular form https://www.google.ru/search?q=d%E1%B9%9Bptab%C4%81laki&ie=utf-8&oe=utf-8&gws_rd=cr&ei=-oUPVrrpCIHSyAO_vLHQBw#newwindow=1&q=d%E1%B9%9Bptab%C4%81l%C4%81ki and MW's form is not found outside MW.

dfptabalaki

@gasyoun
Member Author

gasyoun commented Oct 4, 2015

MW.html
4. deuliya -> deüliya (non o_vs_O) 1st, @funderburkjim, can we track all cases of two vowels following each other? I thought we did it before, but now I remember we did it only with 3 consecutive consonants. 2nd, the ü should be at least in key2, because it's there in the book and is lost. If all the umlauts are lost in the OCR, that's a sad story. 3rd, at the end of the article, there is the word Kshitïṡ that contains ï - an umlaut that is not there in the book, but that seems to be used there to denote MW sandhi markup. Is this system widely used, or only sparsely, any clue?

deuliya

P.S. Prakrit -> Prākr.
grāma -> Grāma (?)
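
The question above about tracking two vowels in a row can be sketched as a small scan over the SLP1 headwords. This is only an illustration: the names `SLP1_VOWELS`, `HIATUS` and `find_hiatus` are hypothetical, and the sample list simply reuses headwords mentioned in this thread. It relies on the fact that SLP1 writes every vowel, including the diphthongs ai (E) and au (O), as a single character, so two adjacent vowel characters signal a hiatus such as the one in deuliya.

```python
import re

# SLP1 writes each vowel (short, long, or diphthong) as exactly one character.
SLP1_VOWELS = "aAiIuUfFxXeEoO"
HIATUS = re.compile(f"[{SLP1_VOWELS}]{{2,}}")


def find_hiatus(headwords):
    """Return (headword, vowel run) pairs for keys containing adjacent vowels."""
    hits = []
    for key in headwords:
        match = HIATUS.search(key)
        if match:
            hits.append((key, match.group()))
    return hits


# Sample keys taken from this thread; a real run would read the key1 column.
sample = ["deuliya", "devadUtI", "dIpaKorI", "dvAdaSAra"]
print(find_hiatus(sample))
```

Every hit would still need a manual check against the scan, since the list only flags candidates where an umlaut may have been lost.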

@gasyoun
Member Author

gasyoun commented Oct 4, 2015

5. devadutI -> devadUtI OCR error

devaduti

@gasyoun
Member Author

gasyoun commented Oct 4, 2015

6. dyOSaMsita -> dyOsaMSita

dyosamsita

Should we ignore the original MW's and use just instead?

@gasyoun
Member Author

gasyoun commented Oct 4, 2015

7. dvadaSAra -> dvAdaSAra OCR error. Not only is the original dīrgha, it also carries an accent, which is totally lost, @funderburkjim?

dvadasara

@zaaf2

zaaf2 commented Oct 5, 2015

6. dyOSaMsita -> dyOsaMSita

By the sense of the word (from √शो) there is clearly an error in the printed edition. But the correct form should be dyOzaMSita, since स् becomes ष् after vowel (except a/ā) if followed by vowel, त् थ् न् म् य् or व्

Does the digital edition output make no distinction between ṉ and ṃ in the original? I find under SaMsita:

SaMsita [p= 1044] : mfn. (often confounded with saM-Sita » saM- √So) said, told, praised, celebrated Pañcat. praiseworthy ib. [L=210855]

The distinction is clear only in the printed edition:

image

Is there no way to make the distinction in the digital edition, even when the output is in Devanagari?

@gasyoun
Member Author

gasyoun commented Oct 5, 2015

@zaaf2 "Is there no way to make the distinction in the digital edition" - some coding might help, but I see no single-click solution, because I'm afraid MW was not guided by bulletproof logic here; it's based on etymology, I guess. "Does the digital edition output make no distinction between ṉ and ṃ in the original?" - it seems not, and that's sad indeed.

7. In vyoma 2 [p= 1041,3] [L=210423] What does the * mean in daśā*rha? Why not just Daśārha with some additional markup, no note that the ā is one of 4 MW sandhi type marked. (H1) vyoman 2 [p= 1041,2] [L=210350] m. (for 1. » [p= 1029,1] ; accord. to Uṇ. iv, 150 fr. √ vye accord. to others fr. vi- √av or √ ve) heaven, sky, atmosphere, air m. -> n.
Wilson: n. sky
Bopp: n. coelum
PWG: n. Himmel
PWK: n. Himmel
MW 1872: n. sky
Apte: n. sky
Macdonell: n. sky

vyoma

@funderburkjim
Contributor

  • Regarding 'daRqAjinika -> dARqAjinika ' Agree this is a typo in digitization.
    Appreciated @zaaf2 's explanation of the reason 'dA...' occurs. Using hwnorm1 display confirms that the 'dA...'
    spelling occurs in several dictionaries.
  • dIpaKori -> dIpaKorI Agree this is typo.
  • dfptabAlaki -> dfptabAlAki . Agree with change, and that this is MW print error.
    Another confirm of Gasyoun's copying theory is
    • MW has m. N. of a man with the patr. gArgya ṠBr.
    • PWG m. N. pr. eines Mannes mit dem patron. Gârgja Çat. Br. 14, 5, 1, 1.
      So MW text is essentially a translation to English of PWG.
  • deuliya -> deüliya This does not have a good solution with the current coding of MW.
    Here is the underlying record of MW in the mysql database:
<H1><h><hc3>110</hc3><key1>deuliya</key1><hc1>1</hc1><key2>deuliya</key2></h>
<body> <lex>n.</lex> <p><as0 type="ns">Pra1krit</as0><as1>Prakrit</as1>
~for~<s>devakulya</s>?</p> <c>N._of_a_<as0>Gra1ma</as0><as1><s>grAma</s></as1></c> <ls>Kshiti7s3.</ls> 
</body>
<tail><mul/> <MW>061671</MW> <pc>492,2</pc> <L>95498</L></tail>
</H1>
  • key2 is coded with SLP1. There is no representation of umlaut in SLP1
  • I added 'deüliya' to the text of the record. This is a partial solution.
  • Then there is the question of the literary source reference, which as you see in the record is
    spelled Kshiti7s3. Now as I recall, that '7' in 'i7' was generally used in MW by Thomas to
    indicate, in Sanskrit words, that the print showed a circumflex over the vowel: î . This
    situation occurs notably in literary source abbreviations, as here, in the <ls> tag.
    MW used this notation in his IAST to indicate two things:
    • the vowel is a long vowel, so in terms of Sanskrit IAST spelling, this would normally be shown
      with the macron ī.

    • The circumflex also indicates that this long vowel is not just any long vowel, but it is a long
      vowel resulting from vowel sandhi combining (long-or-short-vowel x)+(long-or-short-vowel same x)

      However, elsewhere Thomas uses x7 to indicate that x has the umlaut diacritic, as in German
      words.

  • So, in general, there is a question as to how, in a display, an 'x7' (coded in Thomas Anglicized
    Sanskrit) should be displayed.
  • The manner of display, in the current Cologne displays, of 'x7' is governed by details located in
    the as_roman.xml transcoder file which the particular display uses. Before this discussion,
    the as_roman.xml file used by MW for displaying the <ls> tag transcodes 'x7' to the unicode
    for 'x-umlaut'.
  • As an experiment, I changed the as_roman.xml to make 'i7' = i-circumflex. This now changes
    the display of Kshiti7s3 to have a circumflex over the 'i'. This change just affects the MW
    displays.
  • I hope this change doesn't break anything elsewhere in MW displays. Probably not.
  • regarding grāma -> Grāma (?) in the same headword. Note the record coding above:
<as0>Gra1ma</as0><as1><s>grAma</s></as1>
  • For such Sanskrit words, displayed in IAST in print, Thomas coded them in AS (Anglicized
    Sanskrit), such as Gra1ma, with the capitalization preserved. At some point in the process of
    'improving' the markup, I added to these original codings an 'slp1' translation, such as grAma.
  • The displays for MW are written to render the 'slp1' translation into the user's choice of output.
    When that choice is IAST, the display becomes grāma, with loss of capitalization.
  • It would be
    possible to revise the display program to use as_roman.xml based upon the <as0> contents
    when displaying in Roman Unicode. I am not eager to undertake this program revision, but
    if someone wants to revise the code, I would likely be glad to install it at Cologne.
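
The transcoding choice discussed in the bullets above can be sketched as follows. This is not the as_roman.xml transcoder, whose format is not shown in this thread; `as_to_roman`, `AS_MARKS` and `SEVEN_CHOICES` are hypothetical names, and the digit-to-diacritic mapping is inferred only from the examples quoted here (Gra1ma, Kshiti7s3). The point is that a single switch decides whether 'x7' displays as a circumflex or an umlaut, and that working from the `<as0>` text keeps the capitalization.

```python
import unicodedata

# Assumed mapping, inferred from examples in this thread only.
AS_MARKS = {
    "1": "\u0304",  # combining macron: a1 -> ā
    "3": "\u0307",  # combining dot above: s3 -> ṡ
}
SEVEN_CHOICES = {
    "circumflex": "\u0302",  # i7 -> î (sandhi-marked long vowel)
    "umlaut": "\u0308",      # i7 -> ï (as in German words)
}


def as_to_roman(text, seven="circumflex"):
    """Turn AS digit codes into diacritics, keeping the original capitalization."""
    marks = dict(AS_MARKS)
    marks["7"] = SEVEN_CHOICES[seven]
    out = []
    for ch in text:
        # A digit only counts as a diacritic code directly after a letter.
        if ch in marks and out and out[-1].isalpha():
            out.append(marks[ch])
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))


print(as_to_roman("Gra1ma"))
print(as_to_roman("Kshiti7s3"))
print(as_to_roman("Kshiti7s3", seven="umlaut"))
```

A per-tag or per-language setting for `seven` would be one way to keep the circumflex for Sanskrit and the umlaut for German words, but that is a design guess, not something the thread decides.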

@funderburkjim
Contributor

  • devadutI -> devadUtI OCR error Agree
  • dvadaSAra -> dvAdaSAra Agree. Key2 also changed to show accent: dvA/daSAra

@funderburkjim
Contributor

re vyoma 2 [p= 1041,3] [L=210423] What does the * mean in daśā*rha ?

This is closely related to the discussion of 'i7', and also the discussion of 'Gra1ma'.
Here is the database coding in question.

<as0>Das3a7rha</as0><as1><s>daSA<srs/>rha</s></as1>
  • In <as0>, we see the AS coding 'a7' of the textual a-circumflex
  • In <as1>, this 'a7' has been coded as <srs/> (srs = simple replacement sandhi, or some such
    acronym)
  • In the display, the content of the <as1> element is used. And, the current display renders
    <srs/> as an asterisk, whatever the output choice of the user.
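
As a rough illustration of the rendering step described above, the sketch below converts the SLP1 content of an `<as1>` element to IAST and handles `<srs/>` in one of two ways: as the asterisk the current display uses, or as a circumflex on the preceding long vowel, as in print. The function names and the `srs` parameter are hypothetical, and this is not the Cologne display code.

```python
import unicodedata

# SLP1 -> IAST, one SLP1 character at a time.
SLP1_TO_IAST = {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au", "M": "ṃ", "H": "ḥ",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}


def slp1_to_iast(text):
    """Convert SLP1 to IAST; unknown characters pass through unchanged."""
    converted = "".join(SLP1_TO_IAST.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFC", converted)


def render_as1(content, srs="asterisk"):
    """Render <as1> content; srs is 'asterisk' or 'circumflex' (hypothetical)."""
    parts = content.split("<srs/>")
    out = []
    for i, part in enumerate(parts):
        text = slp1_to_iast(part)
        if i < len(parts) - 1:  # this piece is followed by an <srs/> marker
            if srs == "circumflex" and text:
                # Swap the macron of the last vowel for a circumflex.
                last = unicodedata.normalize("NFD", text[-1])
                last = last.replace("\u0304", "\u0302")
                text = text[:-1] + unicodedata.normalize("NFC", last)
            else:
                text += "*"
        out.append(text)
    return "".join(out)


print(render_as1("daSA<srs/>rha"))
print(render_as1("daSA<srs/>rha", srs="circumflex"))
```

Note that this route still loses the capital D of Daśārha, because the SLP1 form carries no capitalization; restoring it would need the `<as0>` content.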

@funderburkjim
Contributor

re dyOSaMsita -> dyOsaMSita Agree with the change.

  • The print agrees with dyOSaMsita, so it's not an OCR error.
  • MW : (dyO-) mfn. impelled or incited by heaven AV. x, 3, 25
  • dyOSaMsita only occurs in MW
  • dyOsaMSita occurs in PW, PWG
  • PWG (of dyOsaMSita): (द्यौ = द्यो + सं°) adj. vom Himmel getrieben Av. 10, 5, 25.
  • Google translate of pwg: driven from the sky
  • @zaaf2 reasoning summary:
    • saMSita from sam+So, to urge, excite, speed. make ready, prepare RV. AV.
    • SaMsita from SaMs : to praise
    • saMSita is closer to 'driven' or 'incited or impelled'
    • Thus, saMSita must be correct
  • Comparison to PWG suggests MW copied from PWG , with this error.
  • minor question re citation detail: mw AV 10,3,25 and PW 10,5,25 - Did MW copy this
    wrong also?
  • Does anyone know how to consult a version of Atharva Veda, to see which was really used in
    the cited verse? This could provide basis for answering is it 3 or 5.
  • Regarding, by sandhi, it should be 'dyO-zaMSita' - Given the PWG spelling, my suspicion is there
    is some special sandhi reasoning that supports 'dyO-saMSita'. Finding AV reference would also
    provide evidence.
  • This probably deserves an entry in corrections_factual.

@funderburkjim
Contributor

Regarding ṉ and ṃ : In the Cologne digitization, I think this distinction is lost - the two are treated the same, as anusvAra.

I'm not aware of this distinction in Devanagari. Was this distinction introduced by European scholars?

@funderburkjim
Contributor

I'll install the above corrections tomorrow.

@gasyoun
Member Author

gasyoun commented Oct 6, 2015

@funderburkjim what about deüliya and similar cases, distinction lost as well? This is restorable, I guess and I would do it, if you agree to implement.

@zaaf2

zaaf2 commented Oct 6, 2015

@funderburkjim Was this distinction introduced by European scholars?

The signs ṉ and ṃ are used by MW to mark the phonetic distinction between what he calls a True Anusvára and a Substitute Anusvára (a distinction not made in Devanagari). The first is a nasalized vowel, with no accompanying consonantal closure; the second represents, by mere substitution, the five Sanskrit nasal consonants. V. MW Grammar (6 a, b):

image

image

Whitney also makes this distinction (Sanskrit Grammar 73.c):

image

@zaaf2

zaaf2 commented Oct 6, 2015

@funderburkjim Does anyone know how to consult a version of Atharva Veda, to see which was really used in the cited verse?

The word is not found in 10.3.25.

In 10.5.25 there is पृथिवीसंशित, which MW has as: "mfn. impelled by the earth AV. [L=128517]", but which in the consulted edition is translated as "praised on the earth".

I find द्यौसंशित at AV 10.5.27, which is there translated as “praised in the heavenly region”:

image
image
(https://archive.org/stream/ATHARVAVEDAVOL1OF2/ATHARVA-VEDA-VOL-1-OF-2#page/n773/mode/2up)

In fact, the word संशित is repeated in all the verses from 10.5.25 to 10.5.35 and in the consulted edition has been consistently translated as “praised”, the sense best adapted to all instances (so it seems to me): “praised on earth” (10.5.25), “praised on the atmospheric region” (10.5.26), “praised in the heavenly region” (10.5.27), “praised in the regions” (10.5.28), “praised in your desirable enterprise” (10.5.29), “praised in the attainment of Rigvedic Knowledge” (10.5.30), “praised in the performance of Yajna” (10.5.31), “praised in the advancement of medical affairs” (10.5.32), “praised in the waters” (10.5.33), “praised in agriculture” (10.5.34), “praised in vitality” (10.5.35). In all these instances it would hardly be possible to translate संशितः as “impelled by”.

In this case, according to MW (L=210855), the word should have been written as शंसित. I conclude that MW’s द्यौशंसित (from √शंस्) is the correct form of the word as it is used in the Atharva Veda, but that the meaning “impelled or incited by heaven” (translated from PWG) is incorrect.

@zaaf2

zaaf2 commented Oct 6, 2015

@funderburkjim Regarding, by sandhi, it should be 'dyO-zaMSita' - Given the PWG spelling, my suspicion is there is some special sandhi reasoning that supports 'dyO-saMSita'.

I was wrong. The rule I mentioned about the change of स् to ष् applies to internal Sandhi. The formation of compound words follows the general rules for external combination (v. Whitney 1249). But in the Vedic language the change of स् to ष् occurs frequently even in compounds (MacDonell, Vedic Grammar 67 a, b):

image
image

@gasyoun
Member Author

gasyoun commented Oct 6, 2015

"confirm of Gasyoun's copying theory" - I did not invent it, it's Zgusta's theory (http://yadi.sk/d/h8ALxcCb8sY9w - @zaaf2 has not seen the file yet, so it might be interesting to him). As for "essentially a translation to English of PWG" - in most cases it is exactly as you stated.
As for "There is no representation of umlaut in SLP1" - let's add one. Can we ask Peter if he can approve a solution? I need these umlauts back for my Dictionary, so I guess it's a regex question. Partial solutions for a few records will not do. This is an easy fix and all I ask is your approval.
Oh, so the Gra1ma is there. Your improvement seems suspicious to me :) As for "When that choice is IAST, the display becomes grāma, with loss of capitalization." - got it. But IAST is the default mode in printed MW and there it's capitalized, so this does not make much sense to me.
"simple replacement sandhi" - never heard of it before, good to know. "current display renders <srs/> as an asterisk" - oh, so maybe add a popup to the asterisks explaining how to understand them?

@funderburkjim
Contributor

Regarding dyOSaMsita -> dyOsaMSita. I am leaving the correction to dyOsaMSita - @zaaf2 's finding of the word in AV clinches the deal for me as to spelling.

For the same reason, I'll also change 10-3-25 to 10-5-25 in MW.

The choice of interpretation ('praised in heaven' or 'impelled by heaven' ) seems like a separate question, and one which we don't need to answer to justify the MW correction. I wonder how Indian scholars treat what seems to be the confusion between the two interpretations. This question might be the tip of a very big iceberg regarding translation and interpretation of Indian sacred literature.

@funderburkjim
Contributor

Regarding reference to AV: One thing both Thomas and Peter have mentioned is the desire to have
links from the literary source references of MW, PWG, and other lexicons to digital editions of the references. This example shows some of the values that such links could provide.

But this facility is still beyond current abilities, despite the greater availability of digitized texts now
than 10-15 years ago.

However, it might be possible to resolve the references for, say, AV. This would be a good
research project for someone to undertake.

@funderburkjim
Contributor

Regarding the 10-3-25 error in MW. Since literary sources are identifiable in both MW and PWG, it would be possible to write a program to do at least a partial comparison. Likely other errors in MW would turn up. This would also be a good, probably relatively small, research project.
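
A minimal sketch of such a comparison, under loud assumptions: the citations are taken to be already extracted as plain strings for the same headword in both dictionaries, and the source abbreviations are matched by simple lowercasing. Real data would need an abbreviation table (MW's ṠBr. against PWG's Çat. Br., for instance). All function names here are hypothetical.

```python
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}


def roman_to_int(numeral):
    """Convert a lowercase or uppercase roman numeral such as 'x' or 'xiv'."""
    values = [ROMAN[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (iv = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def parse_citation(text):
    """Split 'AV. x, 3, 25' into ('av', (10, 3, 25))."""
    abbrev, _, rest = text.partition(".")
    numbers = []
    for part in rest.split(","):
        part = part.strip().rstrip(".").lower()
        if part.isdigit():
            numbers.append(int(part))
        elif part and all(ch in ROMAN for ch in part):
            numbers.append(roman_to_int(part))
    return abbrev.strip().lower(), tuple(numbers)


def differing_positions(mw_citation, pwg_citation):
    """Return indexes where the numbers differ, or None if not comparable."""
    mw_src, mw_nums = parse_citation(mw_citation)
    pwg_src, pwg_nums = parse_citation(pwg_citation)
    if mw_src != pwg_src or len(mw_nums) != len(pwg_nums):
        return None
    return [i for i, (a, b) in enumerate(zip(mw_nums, pwg_nums)) if a != b]


# The pair discussed in this thread: only the middle number differs.
print(differing_positions("AV. x, 3, 25", "Av. 10, 5, 25."))
```

A difference in a single position, as here, is the kind of candidate worth checking by hand against the cited text.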

@zaaf2

zaaf2 commented Oct 6, 2015

dyOSaMsita -> dyOsaMSita

Perhaps the most prudent solution would be to leave it as it is. Reasons:

  • The form द्यौशंसित is not incorrect;
  • it is adapted to the context;
  • it could hardly have been the result of a typographical error, since it involves the change and transposition of three signs (including ṉ, which is not used before स्);
  • the common confusion between संशित and शंसित has been pointed out by MW s.v.;
  • lastly, one cannot exclude the possibility that MW or one of his collaborators found this form in another (perhaps more correct) edition of the AV.

@funderburkjim
Contributor

@zaaf2 I'm leaving the correction in place. It has been mentioned in corrections_factual, which in turn mentions this issue thread.

With present knowledge, the saMSita spelling seems most useful, since it leads to both PWG and at least one version of AV. At least that's the way it looks to me now.

@funderburkjim
Contributor

Corrections now installed.

@gasyoun Glad you revisited Issue #45. I'll leave it to you to close that issue, or not.

The only item left among the many mentioned in this issue is the umlaut under deuliya headword.

As you can see from current display of MW for deuliya, the umlaut version shows within the entry.

I can make similar changes to other sanskrit-umlaut cases in MW, if you find them.

@funderburkjim
Contributor

I think this issue can be closed, but will leave that to @gasyoun , since he opened.

@zaaf2

zaaf2 commented Oct 7, 2015

@funderburkjim dyOSaMsita -> dyOsaMSita

I think this proves you are right. There seems to be no other original source with the reading द्यौशंसित

image

(...)

image

(from: Atharva-veda Saṁhitā by William Dwight Whitney, Charles Rockwell Lanman, https://archive.org/stream/atharvavedasahi05lanmgoog#page/n126/mode/2up)

@zaaf2

zaaf2 commented Oct 7, 2015

A suggestion: the digital display should point to a factual error detected in the printed edition, with a link to the reasons for the correction. In this way, a comparison with the scanned page would not force the user to go through the same process to discover which is right and which is wrong, and he would be alerted to interesting corrections such as this.

@gasyoun
Member Author

gasyoun commented Oct 7, 2015

@zaaf2 This is why we make screenshots here - we add them here so as not to have to open the same page again. What you want is like http://www.kolchose.org/simon/ajaximagemapcreator/ or http://stackoverflow.com/questions/18560097/how-to-make-a-section-of-an-image-a-clickable-link and would be a good idea in 2018-2022 - after the basic headword proofreading is over. If you'll help with that, I'll see how to code it.

@zaaf2

zaaf2 commented Oct 7, 2015

@gasyoun What I mean may be best explained by an example.

After the correction dyOSaMsita -> dyOsaMSita, we now have:

(H3) dyO-saMSita [p= 500] : (dyO-) mfn. impelled or incited by heaven AV. x, 5, 25. [L=97387]

I propose something like this:

(H3) dyO-saMSita {dyO-SaMsita in the printed edition} [p= 500] : (dyO-) mfn. impelled or incited by heaven AV. x, {5}, 25. [L=97387]

The remarks {...} would at the same time be clickable links showing the user a summary of the reasons for the correction adopted. I don’t think this would be difficult, considering that this information is already available at corrections_factual.

I would also suggest that a search for the old reading dyOSaMsita would automatically lead to the corrected article under dyOsaMSita, instead of showing no result, as now.
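
The two suggestions above (a visible remark on the corrected entry, and a redirect from the old reading) can be sketched with a small lookup. Everything here is hypothetical: the format of corrections_factual is not shown in this thread, so `CORRECTIONS` and `ENTRIES` are stand-in dictionaries filled with the one example discussed.

```python
# Stand-in for correction records (assumed shape, not corrections_factual itself).
CORRECTIONS = {
    "dyOSaMsita": {
        "corrected": "dyOsaMSita",
        "note": "dyO-SaMsita in the printed edition",
    },
}

# Stand-in for the dictionary entries, keyed by corrected headword.
ENTRIES = {
    "dyOsaMSita": "(dyO-) mfn. impelled or incited by heaven AV. x, 5, 25.",
}


def lookup(key):
    """Return (headword, entry text, note) or None; old readings are redirected."""
    note = None
    if key in CORRECTIONS:
        note = CORRECTIONS[key]["note"]
        key = CORRECTIONS[key]["corrected"]
    text = ENTRIES.get(key)
    if text is None:
        return None
    return key, text, note


def display(key):
    """Format an entry, adding the {...} remark only when a redirect happened."""
    found = lookup(key)
    if found is None:
        return "no result"
    headword, text, note = found
    remark = f" {{{note}}}" if note else ""
    return f"{headword}{remark} : {text}"


print(display("dyOSaMsita"))
```

In this sketch the remark appears only when the user searched for the old reading; showing it on every view of the corrected entry, as proposed above, would mean indexing the notes by corrected headword as well.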

@funderburkjim
Contributor

@zaaf2 Your suggestions regarding display enhancements are good ones.

  • Suggest you make a separate issue, tagged as 'enhancement', in which you essentially copy the
    comments you made above. Then, this current issue o vs O Corrections in MW #127 can be closed, as it deals with many
    other things. And the new issue can remain open as a reminder
  • In terms of implementing it, there would be many steps. If we establish a sanskrit-lexicon development server, maybe someone (not necessarily me!) can work to make this a reality. If you have the inclination, you could learn programming and do it yourself!
  • Of similar interest is @gasyoun 's suggestion regarding an alternate to the output of MW, wherein
    Gra1ma would be displayed as capitalized IAST.

Also really appreciate the cross-referencing to other sources that you are coming up with, such as
Whitney's Atharva veda.

@gasyoun
Member Author

gasyoun commented Oct 7, 2015

The current issue can't be closed, as there are at least 332 issues to be covered. So not yet, Jim.
There are two new Russian coders, @masted and @juhnowski whom I wanted to introduce to you.
2nd task will be finishing sanskrit-lexicon/COLOGNE#45, after - who knows.

@funderburkjim
Contributor

@gasyoun Think other chunks of the 332 should be posted in additional issues
("o vs O Corrections in MW, Part 2" etc.), just to make the size of issues manageable.

Russian coders have the reputation of being highly skilled, so it will be good if there is a way for them to help with the sanskrit-lexicon project.

@gasyoun
Member Author

gasyoun commented Oct 8, 2015

"o vs O Corrections in MW, Part 2" - so be it, in that case it's closed.
These coders are not only skilled, but are willing to help. My task
is to guide them where most help is wanted. For now - whatever
may help the Reverse Dictionary comes first.

@funderburkjim
Contributor

Corrections installed.
