MWS accent correction, continue phase 3 #142

funderburkjim · 2022-10-21T00:34:45Z

In #141, the last comments began what was called phase 3. This is a page-by-page, column-by-column comparison of the scanned images with the Cologne digitization of mw.txt. The comparison focuses primarily on the accents in the metaline and headline portion (before the broken bar) of the digitization.

This issue continues that task.

Ref: sanskrit-lexicon/MWS#142

Ref: #142

funderburkjim · 2022-10-21T01:29:47Z

Recall from #141 that the changes for pages 1-59 are in issue141/change_mw_6.txt, of 10-17-2022.

The current work is done in the mwissues/issue142 directory.

This table is a log of progress.

date	page range	change-file
10-17-2022	0001-0059	issue141/change_mw_6.txt
10-20-2022	0060-0130	change_mw_01.txt
10-24-2022	0131-0220	change_mw_02.txt
10-27-2022	0221-0299	change_mw_03.txt
10-31-2022	0300-0399	change_mw_04.txt
11-06-2022	0400-0499	change_mw_05.txt
11-12-2022	0500-0599	change_mw_06.txt
11-15-2022	0600-0699	change_mw_07.txt
11-19-2022	0700-0799	change_mw_08.txt
11-25-2022	0800-0899	change_mw_09.txt
11-28-2022	0900-0999	change_mw_10.txt
12-05-2022	1000-1099	change_mw_11.txt
12-13-2022	1100-1199	change_mw_12.txt
12-20-2022	1200-1308	change_mw_13.txt

Andhrabharati · 2022-11-12T01:32:25Z

So far about 48800 changes in 499 pages, i.e. almost 100/page on average [in this issue alone].

Guess @funderburkjim feels this exercise is worth his time, and will be allotting further time to continue the work on the remaining pages.

funderburkjim · 2022-11-12T01:38:00Z

Yes, currently at page 617. Almost half-way. Probably 5-7 weeks to end (about 20 pages/day).
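
The estimate can be checked with a little arithmetic. The sketch below assumes the review ends at page 1308 (the last page reported later in this thread) and a steady 20 pages/day; the function name is hypothetical.

```python
import math

def days_remaining(current_page, last_page, pages_per_day):
    """Whole days needed to review the pages after current_page."""
    return math.ceil((last_page - current_page) / pages_per_day)

# Assumed inputs: at page 617, ending at page 1308, about 20 pages/day.
days = days_remaining(617, 1308, 20)
weeks = days / 7
print(days, weeks)
```

Under these assumptions the remaining work is about 35 working days, at the low end of the 5-7 week range once days off are allowed for.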

Andhrabharati · 2022-11-12T04:31:34Z

Today's correction file (06) is dated as 20th, instead of 12th, by error.

gasyoun · 2022-11-12T19:03:57Z

Probably 5-7 weeks to end

So until end of 2022. When should we plan for our yearly call? Right after?

funderburkjim · 2022-11-15T17:48:08Z

Today's correction file (06) is dated as 20th, instead of 12th, by error.

Corrected.

Andhrabharati · 2022-12-17T02:48:07Z

@funderburkjim

You forgot to add the last commit (1100-1199) in the 'progress log table' above.

Also pl. see my post at sanskrit-lexicon/SKD#16 (comment) reg. the annexure pages.

Hope you would be finishing this commendable task before this Christmas; this indeed is the most worthy exercise (in my view) in the last 25+ years of MW work at CDSL.

funderburkjim · 2022-12-17T04:49:10Z

Updated progress log table. Thanks for mentioning.

funderburkjim · 2022-12-17T04:54:25Z

Regarding annexure pages accent review, I was hoping you would cover that when I finish the body text accent review.

However, if you decide not to do that, then I will consider it later.

Andhrabharati · 2022-12-17T04:59:44Z

Sure, I can resume my unfinished task (referred above) once you're done with your work and give me the updated iast file.

gasyoun · 2022-12-17T20:49:13Z

Sure, I can resume my unfinished task (referred above) once you're done with your work and give me the updated iast file.

Good to know.

funderburkjim · 2022-12-20T18:30:46Z

accent review completed.

Review ends at page 1308.
Hurray! I plan to deal with a few (~100) cases noticed along the way before preparing a version for perusal by @Andhrabharati.

funderburkjim · 2022-12-20T18:33:58Z

two accents.

There are relatively few cases where an entry headword is marked with two accents.
two_accents.txt is the current list of 177.

These should be reviewed manually sometime.
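
A list like two_accents.txt could be produced along the following lines. This is only a sketch, not the script actually used: it assumes each metaline carries the headword in a <k2> field and that accents are marked with /, ^ or \ (the characters matched by the grep pattern quoted further down in this thread). The function names and the sample metalines are hypothetical.

```python
import re

# Assumption: the headword with accents is the text after <k2>, up to the next tag.
K2_FIELD = re.compile(r"<k2>([^<]*)")
# Assumption: these three characters are the accent marks in mw.txt.
ACCENT_MARKS = "/^\\"

def accent_count(metaline):
    """Number of accent marks in the <k2> field of one metaline."""
    match = K2_FIELD.search(metaline)
    if match is None:
        return 0
    return sum(match.group(1).count(mark) for mark in ACCENT_MARKS)

def multi_accent_metalines(lines):
    """Metalines whose headword carries two or more accent marks."""
    return [line for line in lines if accent_count(line) >= 2]

# Hypothetical sample metalines, for illustration only.
sample = [
    "<L>1<pc>1,1<k1>agnIvaruRO<k2>agnI/-va/ruRO<e>3",
    "<L>2<pc>1,1<k1>agni<k2>agni/<e>1",
    "<L>3<pc>1,1<k1>aMSa<k2>aMSa<e>1",
]
for line in multi_accent_metalines(sample):
    print(line)
```

Only the first sample line is printed, since it is the only one with two accent marks in its headword.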

funderburkjim · 2022-12-21T03:09:50Z

a few extra

Based on notes made during the accent review, several additional entries were reviewed.
Some entries were changed, some were identified as open questions, and some were identified as no change required.
The commit above (b42..) can be used to review the changes.
readme_extra.txt may be consulted.

gasyoun · 2022-12-21T15:28:05Z

Review ends at page 1308.

I bow to the way you deal with issues.

two_accents.txt is the current list of 177.

agnī/-va/ruṇau should be read as agnī́-varuṇau and agnī-váruṇau or agnī́-váruṇau

The commit above (b42..) can be used to review the changes.

@Andhrabharati would you be willing to look at the 200 lines?

Andhrabharati · 2022-12-21T16:08:09Z

@funderburkjim hasn't made room for my stepping in, @gasyoun ; he wants to do something still (look at his prev. post).

funderburkjim · 2022-12-21T22:37:17Z

A crude statistic shows that the primary difference between the original and final versions of MW in this exercise
is due to removal of 'inherited' accents in compounds.

$ grep -E "<k2>[^<]*[\/^]" temp_mw_00.txt | wc -l
BEFORE: 107142 metalines with accents

$ grep -E "<k2>[^<]*[\/^]" temp_mw_extra.txt | wc -l
AFTER: 48852 metalines with accents

107142 - 48852 = 58290 metalines whose accents were removed

------------------------------------------------------------------------
<e>[34] are, roughly, the compounds.

$ grep -E "<k2>[^<]*[\/^].*?<e>[34]" temp_mw_00.txt | wc -l
BEFORE: 73303 metalines (of compounds) with accents

$ grep -E "<k2>[^<]*[\/^].*?<e>[34]" temp_mw_extra.txt | wc -l
AFTER: 15001 metalines (of compounds) with accents

73303 - 15001 = 58302 metalines (of compounds) whose accents were removed
------------------------------------------------------------------------

NOTE: Since 58302 is almost equal to 58290, the net accent removals are predominantly
attributable to removal of accents in compounds.
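
The same counts could be reproduced in Python roughly as follows. This is a sketch that mirrors the two grep patterns above rather than the procedure actually used; the function names and the sample metalines are hypothetical, and only the four totals are taken from this comment.

```python
import re

# Same tests as the grep commands: an accent mark (/, ^ or \) in the <k2> field,
# and, for compounds, a later <e>3 or <e>4 on the same metaline.
ACCENTED_K2 = re.compile(r"<k2>[^<]*[\\/^]")
ACCENTED_COMPOUND = re.compile(r"<k2>[^<]*[\\/^].*<e>[34]")

def count_accented(lines):
    """Return (accented metalines, accented compound metalines)."""
    total = compounds = 0
    for line in lines:
        if ACCENTED_K2.search(line):
            total += 1
            if ACCENTED_COMPOUND.search(line):
                compounds += 1
    return total, compounds

def removed(before, after):
    """Element-wise difference between two (total, compounds) pairs."""
    return tuple(b - a for b, a in zip(before, after))

# Hypothetical metalines before and after a review, for illustration only.
sample_before = [
    "<L>1<pc>1,1<k1>agni<k2>agni/<e>1",
    "<L>2<pc>1,1<k1>agnikarman<k2>agni/-karman<e>3",
    "<L>3<pc>1,1<k1>aMSa<k2>aMSa<e>1",
]
sample_after = [
    "<L>1<pc>1,1<k1>agni<k2>agni/<e>1",
    "<L>2<pc>1,1<k1>agnikarman<k2>agni-karman<e>3",
    "<L>3<pc>1,1<k1>aMSa<k2>aMSa<e>1",
]
print(removed(count_accented(sample_before), count_accented(sample_after)))

# The totals reported above: (all metalines, compound metalines).
print(removed((107142, 73303), (48852, 15001)))
```

To run it on the real files, pass an open file object (for example temp_mw_00.txt opened with encoding="utf-8") to count_accented in place of the sample lists.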

gasyoun · 2022-12-21T23:08:28Z

due to removal of 'inherited' accents in compounds

15001 changes?

funderburkjim · 2022-12-22T01:28:02Z

See revision of comment above. Roughly 58000+ metalines changed.

funderburkjim · 2022-12-22T02:18:19Z

Closing this issue.
#145 continues this review.

Andhrabharati · 2022-12-22T02:57:47Z

two_accents.txt is the current list of 177.

agnī/-va/ruṇau should be read as agnī́-varuṇau and agnī-váruṇau or agnī́-váruṇau

The commit above (b42..) can be used to review the changes.

@Andhrabharati would you be willing to look at the 200 lines?

@gasyoun

See what pwk says on this word--

@funderburkjim

I would again request you to make a page like https://sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/pwkvn/03/ (option 3 of https://sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/pwkvn/03/), for MW, PWG and pwk+pwkvn; this would definitely make it easier to track such queries, as MW depends heavily on those two works.
[I had asked for this some time back, and it might have escaped your notice.]

funderburkjim · 2022-12-22T03:17:30Z

Such a display with MW is not as easy as the pwkvn/03/ page, because there are differences in spelling conventions between MW and Boehtlingk. e.g. kar (pw) vs kf (mw) [slp1 spelling of root 'to do']. Without handling these spelling differences, the pwkvn/03 display could be adapted, but would sometimes stumble.
Another approach is https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/sample/dalglob1.php.
This handles the spelling differences well, but is visually less useful.

So getting a perfected display is non-trivial.

My solution when working with the mw accents and occasionally consulting pw or pwg has been to
open the simple-search list displays (input=slp1) in two tabs (one for pw, one for mw), and open servepdf (for mw) in a third tab (or these views could be opened in separate windows). This configuration makes it fairly easy to consult PW when necessary. This setup can be facilitated in windows-11 using a collection.

gasyoun · 2022-12-22T08:22:56Z

MW, PWG and pwk+pwkvn; this definitely would be helpful to easily track such queries, as MW is heavily depending on those two works.

@funderburkjim kar (pw) vs kf (mw) - with the accented words it will not become an issue at all. @Andhrabharati is not asking for a universal tool for all cases.

Another approach is https://www.sanskrit-lexicon.uni-koeln.de/scans/csl-apidev/sample/dalglob1.php.

Yeah, it's not even reachable from homepage. Hope it can get some love in 2023.

two tabs (one for pw, one for mw), and open servepdf (for mw) in a third tab (or these views could be opened in separate windows)

Three open tabs is 2 tabs too much for me.

See what pwk says on this word--

Thanks, so the last one. Is it still a single pada?

Andhrabharati · 2022-12-22T09:24:35Z

Thanks, so the last one. Is it still a single pada?

yes, it is a dvandvasamAsa.

gasyoun · 2022-12-22T15:32:01Z

it is a dvandvasamAsa

But why only a small part of them have two accents at once? Archaic ones?

Andhrabharati · 2022-12-22T15:43:42Z

it is a dvandvasamAsa

But why only a small part of them have two accents at once? Archaic ones?

Almost every 'dual' category entry that I came across is with double accent; just wait till I finish reading through the MW entries.

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Oct 21, 2022: MW accent update pages 0060-0130. (802c86d)
funderburkjim added a commit that referenced this issue Oct 21, 2022: MW accent update pages 0060-0130. (4fb4839)
gasyoun added the cleanup label Oct 21, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Oct 24, 2022: MW accent update pages0131-0220. (45796e5)
funderburkjim added a commit that referenced this issue Oct 24, 2022: MW accent update pages 0131-0220. (23a8977)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Oct 27, 2022: MW accent update pages 0221-0299. (f3f99c6)
funderburkjim added a commit that referenced this issue Oct 27, 2022: MW accent update pages 0221-0299. (61c526a)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 1, 2022: MW accent update pages 0300-0399. (7ef6462)
funderburkjim added a commit that referenced this issue Nov 1, 2022: MW accent update pages 0300-0399. (3c163dd)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 6, 2022: MW accent update pages 0400-0499. (c7f6fd6)
funderburkjim added a commit that referenced this issue Nov 6, 2022: MW accent update pages 0400-0499. (9d8329f)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 10, 2022: MW accent update pages 0500-0599. (5b3f451)
funderburkjim added a commit that referenced this issue Nov 10, 2022: MW accent update pages 0500-0599. (c55b5b8)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 15, 2022: MW accent update pages 0600-0699. (98bf4cb)
funderburkjim added a commit that referenced this issue Nov 15, 2022: MW accent update pages 0600-0699. (1841489)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 19, 2022: MW accent update pages 0700-0799. (7746d86)
funderburkjim added a commit that referenced this issue Nov 19, 2022: MW accent update pages 0700-0799. (a21303a)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 25, 2022: MW accent update pages 0800-0899. (2e7e8a4)
funderburkjim added a commit that referenced this issue Nov 25, 2022: MW accent update pages 0800-0899. (c981b58)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Nov 28, 2022: MW accent update pages 0900-0999. (5a66cc3)
funderburkjim added a commit that referenced this issue Nov 28, 2022: MW accent update pages 0900-0999. (274ded4)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 5, 2022: MW accent update pages 1000-1099. (caa3864)
funderburkjim added a commit that referenced this issue Dec 5, 2022: MW accent update pages 1000-1099. (911ed95)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 14, 2022: MW accent update pages 1100-1199. (02b2a0e)
funderburkjim mentioned this issue Dec 15, 2022: MW Scan review #144 (Closed)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 20, 2022: MW accent update pages 1200-1308. (917e16c)
funderburkjim added a commit that referenced this issue Dec 20, 2022: MW accent update pages 1200-1308. (39e1fa1)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 21, 2022: MW accent update . see readme_extra.txt. (b42685b)
funderburkjim added a commit that referenced this issue Dec 21, 2022: MW accent review. extra changes. (ee3a3cc)
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 22, 2022: MW correction noticed during IAST conversion. (360db2b)
funderburkjim added a commit that referenced this issue Dec 22, 2022: convert latest MW to IAST for Andhrabharati. #142 (49bfc9d)
funderburkjim mentioned this issue Dec 22, 2022: MWS accent correction, continue, phase 4 #145 (Closed)
funderburkjim closed this as completed Dec 22, 2022
funderburkjim mentioned this issue Dec 22, 2022: mw:112078 sanskrit-lexicon/csl-orig#1038 (Closed)
