Disambiguation of multiword and singleword lemmas #649
Low priority for me. Distinguishing between words is quite hard; you should manage it in your input.
On 25 May 2017 at 10:18, Michael Stenskjær Christensen <notifications@github.com> wrote:
… Take this nonsensical example (still not disambiguated):
\documentclass{article}
\usepackage{reledmac}
\begin{document}
\beginnumbering
\autopar
per causam tamen scire
\edtext{causam}{\lemma{causam}\Bfootnote{fnote}} est
\edtext{per causam}{\lemma{per causam}\Bfootnote{causam rei B}} cognoscere
\edtext{causam}{\lemma{causam}\Bfootnote{fnote}}.
\endnumbering
\end{document}
This results in this output:
With this apparatus:
Problems:
• "causam" occurs four times.
• Two occurrences of "causam" are the only word in their \edtext.
• One occurrence of it is inside a multiword lemma.
• The multiword lemma "per causam" is itself ambiguous, since the phrase occurs twice.
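To make the counts concrete, the occurrences in the example line can be tabulated from left to right. This is only an illustrative sketch derived from the example above, not output produced by reledmac, and the column headings are chosen purely for illustration.

```latex
% Occurrences in the example line, counted from left to right.
% "--" means the row contains no new occurrence of that phrase.
\begin{tabular}{lcc}
  Occurrence in the line  & $n$th ``causam'' & $n$th ``per causam'' \\
  \hline
  per causam (plain text) & 1 & 1  \\
  causam (lemma)          & 2 & -- \\
  per causam (lemma)      & 3 & 2  \\
  causam (lemma)          & 4 & -- \\
\end{tabular}
```

Read this way, the three lemmas are the second "causam", the second "per causam", and the fourth "causam".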
How would we annotate this?
This especially concerns how to annotate such multiword lemmas.
Solution 1
My initial idea was to annotate the multiword lemma as if it was one word like this:
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Afootnote{note}}
That actually works on this reduced example:
\sameword{per causam} tamen scire
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere causam.
Which gives:
The idea here is that it's the whole expression "per causam" that is numbered (although that is not really clear).
Problem: How do we then also disambiguate single word lemmas?
My intuition was:
\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
But that breaks with this error:
ERROR: Use of \\pstart doesn't match its definition.
--- TeX said ---
***@***.*** ...***@***.*** =#1\def ***@***.*** {
#2}\def ***@***.*** {#3}\f...l.19 \sameword{per \sameword{causam}}
tamen scire
Clearly, \sameword{}s can't be nested.
Solution 2
Another idea would then be only to mark one of the words of a multiword lemma (either the first or last) like this:
per \sameword{causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{per \sameword[1]{causam}}{\lemma{per \sameword{causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
But this gives the following output:
I guess this could work, but my problem is with the "per causam" entry. If we interpret the "3" as a reference to the whole phrase "per causam", it is clearly incorrect.
Furthermore, when there is only one instance of "per causam" in the line, that note does not need disambiguation at all. Marking "causam" inside it would still attach a number to the lemma, resulting in a confusing (even ambiguous!) disambiguation.
What I would like
I know there might be problems with this, but my wish would be to be able to produce this apparatus:
1 causam² ] note
1 per causam² ] note
1 causam⁴ ] note
That was my expectation from the nested annotation, which looked like this:
\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
If we then had a situation with only one instance of "per causam", it would just look like this:
1 causam² ] note
1 per causam ] note
1 causam⁴ ] note
Very clear to me.
But as I said, that does not work. I see how such nesting (as it would be able to go to any depth) is a lot more complex, but can it be done?
Conclusion
Any thoughts on how to best annotate multiword lemmas? Could my suggestion be done?
The problem is that I don't have any idea how to make such distinctions in the input in a way that is neither ambiguous nor fails to compile. What would you (or anybody else) suggest?
Sorry, I am too tired to look at this in detail, but why not use \lemma with \sameword inside when needed, and without it when not needed?
No. The problem is that you get a conflict between lemmas with more than one word, where one of those words also needs disambiguation in single-word lemmas. Take this example:
\sameword{causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{per \sameword{causam}}{\lemma{per causam}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
There is only one "per causam", so there is no problem, as that does not need disambiguation. Then I can also mark the "causam" inside the \edtext.
But then see this:
\sameword{per causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
But the first "causam1" should be "causam2" because "causam" is also part of the phrase "per causam". And the second ("causam2") should be "causam4". To correct this, I would annotate it like this:
\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per \sameword{causam}}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.
But that gives this error:
I can't see how simply leaving \sameword out of the \lemma would solve this and still keep the numbering correct.
Hmm, \sameword should be applied to an individual word, not to a group of words, because there is an ambiguity in your reference in the second example.
As a reader, I understand that you mean "causam" number 2 and not "per causam" number 2, and I think you cannot get a good count of "causam" this way. At the least, you should warn the reader. Maybe I will be able to solve this issue, but I don't think it is a good idea.
The branch "issue649" should allow you to get nested \sameword.
Here is the link: eef3070
If you can test this branch, I would like to publish a release branch this week. If you have no time, I will publish only a bugfix branch. Please let me know.
The solution in the branch gives the result I was expecting. That is really great, I think!
But as you see it, you would say that an annotation that results in this apparatus is clearer?
It's certainly unambiguous! I don't know. I just feel that it can be presented simpler.
And I certainly see that it is unclear whether "causam2" in "per causam2" refers only to "causam" or also to "per". But the lemma as such is the whole phrase "per causam", and that is what you would have to look for in the text. Looking only for "causam" from a lemma that says "per causam" does not really make sense, does it? So in that way I find it unambiguous.
I don't know, maybe I am too conditioned by the notion of "sameword". And maybe I have too mathematical a view. For me,
"per causam2" means "per + causam * 2" and not "(per + causam) * 2".
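Written with explicit precedence, the two readings of the superscript look as follows. This is only a typeset restatement of the sentence above, not reledmac syntax, and it assumes the amsmath package for \text.

```latex
% Two readings of the apparatus entry "per causam" followed by a superscript 2.
% Left: the number binds to "causam" only. Right: it binds to the whole phrase.
% Requires \usepackage{amsmath} for \text.
\[
  \text{per} + (\text{causam} \times 2)
  \qquad\text{versus}\qquad
  (\text{per} + \text{causam}) \times 2
\]
```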
I will add a note in the handbook and release a new version tonight.
If anyone else is reading along, I would love to hear what you think about this problem. Maybe I'm just completely off here.
I understand your challenge as the maintainer: being asked to implement a solution that you do not yourself support can be frustrating. Thanks a lot for listening to my questions and suggestions anyway!
The main challenge was to understand your need, especially as I was busy last weekend.
Really appreciate it. Cheers!
For this point or in general?
I was mostly thinking of this point. I thought it would be natural to help close it since I brought it up :) If you need proofreading of the documentation more generally, I am willing to help out as much as I can. But I can't promise that I will be available when something needs to get pushed quickly.