ACC extra headword. #352

Closed
funderburkjim opened this issue May 8, 2017 · 4 comments

funderburkjim commented May 8, 2017

This issue was discovered in the course of working on another issue.

A correction needs to be made to ACC, but preliminary work is required so that the change
can be made while keeping the L-numbers stable.

cintAmaRi obstacle and L-numbers

During examination of the cases mentioned above, the following headword error was noticed:

acc.txt
<HI>{#tattvacintAmaRi#}¦ or fully {#nyAyatattvacintAmaRi,#} often called
<HI>{#cintAmaRi#}¦ or merely {#maRi#} by Gaṅgeśa or Gaṅge-     <<< ERROR
<>śvara. Divided into four books: Pratyakṣa, Anu-

While this cintAmaRi should possibly be classified as an alternate headword, it definitely should not
be a normal headword.

One aspect of correcting this is simple:

20531 old <HI>{#cintAmaRi#}¦ or merely {#maRi#} by Gaṅgeśa or Gaṅge-
20531 new <>{#cintAmaRi#}¦ or merely {#maRi#} by Gaṅgeśa or Gaṅge-

However, if this change flows through the current system, then we'll have a shift of L-numbers for
all the thousands of headwords following this dropped headword. This is because in the current
system for acc, the L-numbers are determined dynamically based on the <HI> sequence number in
acc.txt.
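
To illustrate the shift described above, here is a minimal Python sketch of dynamic numbering. It assumes that an L-number is simply the 1-based count of `<HI>` lines seen so far and that the headword sits between `{#` and `#}`. The function name `assign_dynamic_lnums` and the `nextHeadword` entry are hypothetical, invented for the example; this is not the actual code used for acc.

```python
def assign_dynamic_lnums(lines):
    """Map each <HI> headword to its 1-based position among the <HI> lines."""
    lnums = {}
    count = 0
    for line in lines:
        if line.startswith("<HI>"):
            count += 1
            # Assumption: the headword is the text between "{#" and "#}".
            start = line.index("{#") + 2
            end = line.index("#}", start)
            lnums[line[start:end]] = count
    return lnums


# Abbreviated sample lines; the last entry is hypothetical.
before = [
    "<HI>{#tattvacintAmaRi#}¦ or fully ...",
    "<HI>{#cintAmaRi#}¦ or merely ...",
    "<>śvara. Divided into four books: ...",
    "<HI>{#nextHeadword#}¦ hypothetical following entry",
]

# Apply the correction: demote the erroneous headword line from <HI> to <>.
after = list(before)
after[1] = after[1].replace("<HI>", "<>", 1)

old = assign_dynamic_lnums(before)
new = assign_dynamic_lnums(after)

# Every headword after the demoted line moves down by one.
print(old["nextHeadword"], "->", new["nextHeadword"])
```

With fixed L-numbers stored alongside each entry, the same edit would leave the numbers of all later headwords unchanged.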

We've decided that fixed L-numbers are better than dynamic L-numbers. This is a goal for all dictionaries. But currently this goal is implemented only for SCH (recently) and MW.

We should adapt the SCH scheme to ACC before making this correction and other corrections to acc.txt.

We should think about some of the details of this before jumping into code changes.
A discussion of this is in #130.

Once the details of headword coding have been decided in #130 and implemented in acc.txt,
it will be time to return and make this cintAmaRi correction.

@funderburkjim

The cintAmaRi error mentioned above is corrected. It is now one of the extra headwords associated with tattvacintAmaRi.

gasyoun commented May 25, 2017

> It is now one of the extra headwords associated with tattvacintAmaRi

Do we have a single .txt file where all these additional words are intermingled with the original ones? Not on the web, but in a text or XML document, so that I can get a full combined list of them, Jim?

@drdhaval2785

@gasyoun is asking, in a nutshell, to regenerate sanhw1.txt and sanhw2.txt. :-)

@funderburkjim

For acc, the file acchw.txt has all the headwords, normal and alternate.

As mentioned, these are also in sanhw1/2, which have been regenerated.
