Ellipsis in v2 #396
Comments
Yes, that is correct. The reason behind this is that one of our core principles is that basic UD trees should always be strict surface syntax trees and we would have to give up this principle if we allowed empty nodes in the basic representation. Can you say more about why this makes it harder for tools to deal with the enhanced UD graphs? Or do you mean that it makes things more complicated if we have different treatments of ellipsis in the basic and the enhanced representation? |
Thank you @sebschu. Well, I have just learned on the mailing list that the page I mentioned is not part of the documentation but only an archived discussion. We also have a similar confusion with the two pages about the format: About the tools, what I mean is that most tools will not pay much attention to the DEPS field. I suspect that we must treat this field, with the enhanced dependency graph, as an alternative, but the corpora must always have the best possible analysis in the basic dependencies, right? |
Anything that starts with http://universaldependencies.org/v2/ is only archival. Sorry for the confusion, we will figure out a way to make it clear. Please only consult the documentation that is at the top level directly under http://universaldependencies/. |
Moving here e-mail discussion about precedence of orphans:
@jnivre :
|
@jnivre : I guess that the obliqueness hierarchy, if employed, would look something like this (not sure about the placement of the clausal dependents but they are more likely to have arguments of their own, so placing them lower might make the result more readable):
|
In sentences where the conjunct elements are present in both clauses (the head/ROOT of the sentences), I'd prefer to relate them and then attach all the orphans to the second element of the conjunction: "The total value is 50 million and the deficit, 40 million" (translated from a Portuguese example). How does it sound? This is a problematic sentence because it is a copular sentence, yet we still have an ellipsis here... I'm asking myself how to treat those cases. |
A conj link from "million1" to "million2" is fine, but there should be no "orphan" link here, because the omitted copula is not the root of the clause. The analysis should be: nsubj(million1, value) |
We usually use the But even if I reverse your notation, I would not do what you propose. This is a non-verbal predicate situation, which can occur (cross-linguistically) with or without copula. I understand that using copula is the norm in Portuguese and that it has been elided here, but it is just a missing function word. Its omission does not change anything on the fact that both millions are predicates and value resp. deficit are their subjects. Therefore I would do nsubj(million1, value) |
tks a lot @dan-zeman and @jnivre I was thinking in having an |
It is a trade-off. There are lots and lots of things that could potentially be useful, but in order to satisfy its goal of being a cross-linguistically consistent easily understandable syntactic annotation, UD cannot include all of them but has to put priority on basic syntactic relations. But you can always add it yourself in the MISC field or using standoff annotation. |
So should we add the |
Sounds good to me. |
In #396 it was suggested that the head promotion priorities for predicate ellipsis are `nsubj > obj > iobj > obl > advmod > csubj > xcomp > ccomp > advcl`. Also, I think examples of incorrect annotation should be clearly marked, e.g. with red color for the wrong edges.
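To make the priority order concrete, here is a minimal sketch of how a promoted head could be picked among the dependents of an elided predicate. The function name `promote_orphans`, the `(form, deprel)` input shape, the stripping of relation subtypes, and the tie-breaking by surface order are all assumptions made for illustration; none of them come from the UD guidelines or any UD tool.

```python
# Priority order as quoted in this thread; earlier relations are promoted first.
PROMOTION_ORDER = [
    "nsubj", "obj", "iobj", "obl", "advmod",
    "csubj", "xcomp", "ccomp", "advcl",
]


def promote_orphans(dependents):
    """Pick the dependent to promote when the predicate is elided.

    dependents: list of (form, deprel) pairs in surface order (hypothetical
    input shape). Returns (promoted_form, [(form, "orphan"), ...]).
    """
    if not dependents:
        raise ValueError("an elided predicate needs at least one dependent")
    rank = {rel: i for i, rel in enumerate(PROMOTION_ORDER)}

    def key(i):
        # Assumption: subtypes such as "obl:tmod" rank like their base relation,
        # unlisted relations rank last, and ties go to the leftmost dependent.
        base = dependents[i][1].split(":")[0]
        return (rank.get(base, len(PROMOTION_ORDER)), i)

    best = min(range(len(dependents)), key=key)
    promoted = dependents[best][0]
    # Every remaining dependent attaches to the promoted one as "orphan".
    orphans = [(form, "orphan") for i, (form, _) in enumerate(dependents) if i != best]
    return promoted, orphans


# Second clause of "He likes tea, and she coffee": "she" outranks "coffee".
head, orphans = promote_orphans([("she", "nsubj"), ("coffee", "obj")])
```

With this ordering the subject is promoted and the object hangs off it as `orphan`; how the promoted word attaches to the rest of the sentence (here, as `conj` of the first predicate) is left out of the sketch.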
This might be too late now and I don't feel too strongly about it, but what if we used a different hierarchy so that we end up with constructions that are more parallel to copular constructions (and that way potentially avoid a catastrophe in languages where copulas can optionally be omitted)? In practice, this would mean that we put the So either
or
That way, "She is a professor" and the second clause in "He likes tea, and she coffee" have a more parallel structure:
and
|
I can see the point of this, but I am not sure the advantages are strong enough to motivate an apparently ad hoc exception to the obliqueness hierarchy. Also, the whole point of the "orphan" relation is to have a warning flag signaling "don't trust this structure to reflect real dependencies", so I am not sure it is only an advantage if the structure looks plausible. And, yes, I think it may be too late now. :) |
From http://universaldependencies.org/v2/ellipsis.html and http://universaldependencies.org/format.html it seems that null nodes can only be referenced in the enhanced dependencies field of CoNLL-U files. Is that right? Why can't we mention them in the regular HEAD field? We liked the solution of null fields for ellipsis, but we would like to avoid making it harder for tools to deal with the enhanced dependencies.
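For readers who want to see what this looks like in a file, here is a small sketch that separates the basic tree (HEAD column) from the enhanced graph (DEPS column) and checks that no basic HEAD points at a null node. It relies only on the ten-column CoNLL-U layout, in which null nodes carry decimal IDs such as `5.1`; the sample sentence, its annotation, and the helper names `split_conllu` and `heads_referencing_empty_nodes` are illustrative assumptions, not taken from a treebank or a UD tool.

```python
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# Illustrative annotation of "He likes tea and she coffee" (an assumption).
ROWS = [
    ["1", "He", "he", "PRON", "_", "_", "2", "nsubj", "2:nsubj", "_"],
    ["2", "likes", "like", "VERB", "_", "_", "0", "root", "0:root", "_"],
    ["3", "tea", "tea", "NOUN", "_", "_", "2", "obj", "2:obj", "_"],
    ["4", "and", "and", "CCONJ", "_", "_", "5", "cc", "5.1:cc", "_"],
    ["5", "she", "she", "PRON", "_", "_", "2", "conj", "5.1:nsubj", "_"],
    ["5.1", "likes", "like", "VERB", "_", "_", "_", "_", "2:conj", "_"],
    ["6", "coffee", "coffee", "NOUN", "_", "_", "5", "orphan", "5.1:obj", "_"],
]
SAMPLE = ["\t".join(row) for row in ROWS]


def split_conllu(lines):
    """Return (basic heads, null-node ids, enhanced edges) for one sentence."""
    basic, empty, enhanced = {}, [], {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.rstrip("\n").split("\t")
        node_id, head, deps = cols[0], cols[6], cols[8]
        if "-" in node_id:
            continue  # multiword token range, not a syntactic node
        if "." in node_id:
            empty.append(node_id)  # null node: exists only in the enhanced graph
        else:
            basic[node_id] = head
        if deps != "_":
            enhanced[node_id] = [tuple(d.split(":", 1)) for d in deps.split("|")]
    return basic, empty, enhanced


def heads_referencing_empty_nodes(basic):
    """Word ids whose basic HEAD is a null node; expected to be empty."""
    return [wid for wid, head in basic.items() if "." in head]


basic, empty, enhanced = split_conllu(SAMPLE)
```

A tool that reads only `basic` still gets a complete tree, with `orphan` flagging the ellipsis, while a tool that reads `enhanced` sees the null node `5.1` as the head of both "she" and "coffee".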