Disambiguation of multiword and singleword lemmas #649

Closed
stenskjaer opened this issue May 25, 2017 · 13 comments

Comments

@stenskjaer
Contributor

Take this nonsensical example (not yet disambiguated):

\documentclass{article}
\usepackage{reledmac}

\begin{document}
\beginnumbering
\autopar

per causam tamen scire \edtext{causam}{\lemma{causam}\Bfootnote{fnote}} est
\edtext{per causam}{\lemma{per causam}\Bfootnote{causam rei B}} cognoscere
\edtext{causam}{\lemma{causam}\Bfootnote{fnote}}.

\endnumbering
\end{document}

This results in this output:
[screenshot: typeset text]

With this apparatus:
[screenshot: critical apparatus]

Problems:

  • "causam" occurs four times.
  • Two occurrences of "causam" are the only word in their \edtext.
  • One occurrence is part of a multiword lemma.
  • The multiword lemma "per causam" is itself ambiguous.

How would we annotate this?
This especially concerns how to annotate such multiword lemmas.

Solution 1

My initial idea was to annotate the multiword lemma as if it were a single word, like this:

\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Afootnote{note}}

That actually works on this reduced example:

\sameword{per causam} tamen scire \edtext{\sameword[1]{per causam}}{\lemma{\sameword{per
      causam}}\Bfootnote{causam rei B}} cognoscere causam.

Which gives:
[screenshot: typeset text and apparatus]

The idea here is that it's the whole expression "per causam" that is numbered (although that is not really clear).

Problem: How do we then also disambiguate single word lemmas?

My intuition was:

\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam
    rei B}} cognoscere \edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

But that breaks with this error:

ERROR: Use of \\pstart doesn't match its definition.

--- TeX said ---
\kernel@ifnextchar ...rved@d =#1\def \reserved@a {
                                                  #2}\def \reserved@b {#3}\f...
l.19 \sameword{per \sameword{causam}}
                                      tamen scire

Clearly, \sameword{}s can't be nested.

Solution 2

Another idea would be to mark only one of the words of a multiword lemma (either the first or the last), like this:

per \sameword{causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{per \sameword[1]{causam}}{\lemma{per \sameword{causam}}\Bfootnote{causam
    rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

But this gives the following output:
[screenshot: typeset text and apparatus]

I guess this could work, but my problem is with the "per causam" entry. If we interpret the "3" as a reference to the whole phrase "per causam", it is clearly incorrect.
Furthermore, when "per causam" occurs only once in the line, that note does not need disambiguation at all. Numbering it anyway results in a confusing (even ambiguous!) disambiguation.

What I would like

I know there might be problems with this, but my wish would be to be able to produce this apparatus:

1 causam² ] note
1 per causam² ] note
1 causam⁴ ] note

That was my expectation from the nested annotation, which looked like this:

\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam
    rei B}} cognoscere \edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

If we then had a situation with only one instance of "per causam", it would just look like this:

1 causam² ] note
1 per causam ] note
1 causam⁴ ] note

Very clear to me.

But as I said, that does not work. I see how such nesting (as it would be able to go to any depth) is a lot more complex, but can it be done?

Conclusion

Any thoughts on how to best annotate multiword lemmas? Could my suggestion be done?

@maieul
Owner

maieul commented May 26, 2017 via email

@stenskjaer
Contributor Author

The problem is that I have no idea how to make such distinctions in the input in a way that is neither ambiguous nor fails to compile. What would you (or anybody else) suggest?

@maieul
Owner

maieul commented May 26, 2017 via email

@stenskjaer
Contributor Author

No.

The problem is that you get a conflict between lemmas with more than one word, where one of those words also needs disambiguation in single-word lemmas.

Take this example:

\sameword{causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{per \sameword{causam}}{\lemma{per causam}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

There is only one "per causam", so there is no problem, as that does not need disambiguation. Then I can also mark the "causam" inside the \edtext{per causam} without any problem. Everything is good here. This is what you suggest.

But then see this:

\sameword{per causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per causam}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

The result is:
[screenshot: typeset text and apparatus]

But the first "causam1" should be "causam2" because "causam" is also part of the phrase "per causam". And the second ("causam2") should be "causam4". To correct this, I would annotate it like this:

\sameword{per \sameword{causam}} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
\edtext{\sameword[1]{per \sameword{causam}}}{\lemma{\sameword{per causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

But that gives this error:

! Use of \\pstart doesn't match its definition.
\kernel@ifnextchar ...rved@d =#1\def \reserved@a {
                                                  #2}\def \reserved@b {#3}\f...
l.43 \sameword{per \sameword{causam}}
                                      tamen scire

I can't see how just not annotating a \sameword in the \lemma should solve this and still keep correct numbering.

@maieul
Owner

maieul commented Jun 3, 2017 via email

maieul added a commit that referenced this issue Jun 3, 2017
@maieul
Owner

maieul commented Jun 3, 2017

The branch "issue649" should allow you to nest \sameword. But please try complex cases. I still think that the word, not the group of words, should be the basic unit. In your example, I would personally annotate each individual word (eef3070).
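
For illustration, here is a minimal sketch of what annotating each individual word might look like for the example from the opening post. It is not taken from the commit; it only reuses the non-nested \sameword pattern already shown in this thread (optional level argument in the first argument of \edtext, none in \lemma), and the numbering it produces has not been verified here.

```latex
\documentclass{article}
\usepackage{reledmac}

\begin{document}
\beginnumbering
\autopar

% Every occurrence of "per" and of "causam" gets its own \sameword,
% including the occurrences that are not inside any \edtext.
\sameword{per} \sameword{causam} tamen scire
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}} est
% The multiword lemma marks each of its words separately instead of nesting.
\edtext{\sameword[1]{per} \sameword[1]{causam}}{\lemma{\sameword{per}
    \sameword{causam}}\Bfootnote{causam rei B}} cognoscere
\edtext{\sameword[1]{causam}}{\lemma{\sameword{causam}}\Bfootnote{fnote}}.

\endnumbering
\end{document}
```

With this approach each word of the multiword lemma would carry its own number, so no nesting is involved. As with other reledmac output, expect to compile more than once before the numbers settle.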

@maieul
Owner

maieul commented Jun 3, 2017

Here is the link: eef3070

@maieul
Owner

maieul commented Jun 4, 2017

If you can test this branch, I would like to publish a release branch this week. If you have no time, I will publish only a bugfix branch. Please let me know.

@stenskjaer
Contributor Author

The solution in the branch gives the result I was expecting. That is really great, I think!

But as you see it, would you say that an annotation that results in this apparatus is clearer?
[screenshot: apparatus]

It's certainly unambiguous! I don't know. I just feel that it could be presented more simply.

And I certainly see that it is unclear whether "causam2" in "per causam2" refers only to "causam" or also to "per". But the lemma as such is the whole phrase "per causam", and that is what you would have to look for in the text. Looking only for "causam" from a lemma that says "per causam" does not really make sense, does it? So in that way I find it unambiguous.

If anyone else is reading along, I would love to hear what you think about this problem. Maybe I'm just completely off here.

I understand your challenge as the maintainer: being asked to implement a solution that you do not yourself support can be frustrating. Thanks a lot for listening to my questions and suggestions anyway!

@maieul
Owner

maieul commented Jun 4, 2017 via email

@maieul maieul closed this as completed in e4e5bf8 Jun 4, 2017
@stenskjaer
Contributor Author

Really appreciate it.
Let me know if you need proof-reading or anything on the documentation.

Cheers!

@maieul
Owner

maieul commented Jun 4, 2017 via email

@stenskjaer
Contributor Author

I was mostly thinking of this point. Thought it would be natural to help close it when I brought it up :)

If you need proof reading on the documentation more generally, I am willing to help out as much as I can. But I can't promise that I will be available when something needs to get pushed quickly.
