The "compound" relation for nominals: what, why and when? #761
I would like to comment here on a point from this post by @amir-zeldes, from #756.
I really do not agree with this interpretation of such word strings, at least not for Italian. Actually, they are not rare at all, on the contrary; in my personal experience, they are even expanding. They are also the standard for signage and newspaper headlines: it is a matter of telegraphic vs. articulated style. There is absolutely no difference between centro trasfusione sangue (ca. 'blood transfusion center') and centro di/per la trasfusione del sangue, lit. 'center of/for the transfusion of the blood'. The structure is the same, but some connectors become "implicit" and relations underspecified, because in context or from a pragmatic point of view they are clear. On the contrary, the French case looks very different, in that the two elements are coreferential: it is a vote and at the same time it is a sanction. Some might see an argument for |
Some remarks on the first post by @Stormur:
|
(Thanks @dan-zeman for your patience.) Right, I have to restrict the issue to nominal compounds (I am changing the title). The field of verbal compounds, light verbs and so on is too vast to fit in here! As for the last point, I thought I had read it on the |
@Stormur I agree that if we didn't have
is a very typical expression of what many morphologists assume is a prototypical characteristic of compounds, for example: "... two important aspects of compound meaning. The first of these is the idea that there is an underspecified semantic relation between the constituents..." (Bell & Schäfer 2016, "Modelling semantic transparency". Morphology 26, 157–199) Removing compounds in UD would add a major difference between treebank analyses and common practices in linguistics, and I'm not sure what the advantage would be. If we just want to find all cases of noun modification, isn't it still easy to do so using the current scheme? For Italian, I'd like to point out that even if these constructions are understood as telegraphic paraphrases, they are still syntactically distinct from full NP+PP in that we can mix and match determiners or add modifiers, just as in other languages discussed in the above threads:
But not:
etc. From a practical perspective, I think having compounds annotated even for languages like Italian can be helpful, though of course individual language guidelines are developed separately and with all sorts of higher level considerations. For example, you said you have the feeling that this type of telegraphic construction is expanding in Italian -- labeling such cases differently would make it much easier to study whether this is the case, and in what contexts / what kinds of lexical items it tends to appear. |
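To make the retrieval point concrete, here is a minimal sketch of how all noun modification could be collected under the current scheme, by querying both `nmod` and `compound` at once. The two-sentence CoNLL-U sample is a toy annotation written for illustration only (it is not taken from any treebank), and the names `conllu_line`, `SAMPLE` and `noun_modifiers` are hypothetical helpers, not part of any UD tool.

```python
from collections import Counter


def conllu_line(idx, form, upos, head, deprel):
    """Build one 10-column CoNLL-U token line, leaving unused columns as '_'."""
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Toy sample (an assumption for illustration, not real treebank data).
SAMPLE = "\n".join([
    "# text = criminal defense attorney",
    conllu_line(1, "criminal", "ADJ", 2, "amod"),
    conllu_line(2, "defense", "NOUN", 3, "compound"),
    conllu_line(3, "attorney", "NOUN", 0, "root"),
    "",
    "# text = tail of the dog",
    conllu_line(1, "tail", "NOUN", 0, "root"),
    conllu_line(2, "of", "ADP", 4, "case"),
    conllu_line(3, "the", "DET", 4, "det"),
    conllu_line(4, "dog", "NOUN", 1, "nmod"),
    "",
])


def noun_modifiers(conllu_text, relations=("nmod", "compound")):
    """Return (dependent, head, deprel) triples whose universal relation is in `relations`."""
    hits = []
    for block in conllu_text.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip sentence-level comments
            cols = line.split("\t")
            if not cols[0].isdigit():
                continue  # skip multiword ranges (1-2) and empty nodes (1.1)
            tokens[int(cols[0])] = (cols[1], int(cols[6]), cols[7])
        for form, head, deprel in tokens.values():
            # Compare on the universal part so that subtypes are included too.
            if deprel.split(":")[0] in relations and head in tokens:
                hits.append((form, tokens[head][0], deprel))
    return hits


hits = noun_modifiers(SAMPLE)
print(hits)
# [('defense', 'attorney', 'compound'), ('dog', 'tail', 'nmod')]
print(Counter(deprel for _, _, deprel in hits))
```

Passing `relations=("compound",)` narrows the same query to the compound-like cases only, which is the kind of count one would need to check whether the telegraphic construction is spreading.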
Haha, you are right, it also occurred to me while writing it! 🙂 Just to make it clear: I am absolutely not contesting the existence of a compound construct, but only its (in my opinion) undue representation/annotation as a separate First thing, some more notes about the Italian "telegraphic style", just to put things in perspective (it really is a very interesting phenomenon with many ramifications!).
I am not sure of the validity of this argument to distinguish these two kinds of constructions. That is, I think that the point of view should be reversed: the telegraphic paraphrase of a sequence of nominal modifiers is only possible when some requirements are satisfied, like absence of determiners, genericity versus specificity, context, and so on. So I would rather say that it is not that such constructs do not admit those things, but that they arise when the right conditions are present. I mean, conversely it would not make much sense to say something like Consorzio di grandi crediti per alcune opere pubbliche... I think it is semantically driven. This and the already stated complete parallelism between telegraphic and articulated forms is just to remark again the arbitrariness of the line between "normal" nominal modifiers and compound-like ones: there is a sort of transition.
So, my suggestion would be not to remove a relation for compounds altogether. I agree with and am very sensitive to the practical problem of retrieving particular constructions, and I always give it some thought during annotation. But here we have the mirror-image issue: were I doing research on noun modifiers, instead of focusing only on So, since we kind of seem to agree that I see a major parallel with the core/oblique vs. complement/adjunct distinction here: how can we consistently and cross-linguistically decide when something is a "required argument" of a predicate? It has way too much semantic variation in it. So we just have an overarching So, in my opinion a similar decision for |
I agree with @Stormur that everything is confusing with The definition is very unclear to me: "It is used for any kind of X0 compounding". No idea what this means and there is no reference. It looks like something coming from X-bar syntax, not from dependency syntax. The name: It is very important to recall that all relations in UD are syntactic relations corresponding to particular syntactic constructions. But for The confusion is increased by the fact that In other words, I suggest that the |
Agree that |
I'm not particularly opposed to renaming noun compound to nmod:compound, but I'd like to give what I think is some relevant background, as well as issues from other languages which may make this difficult in practice. First of all, I don't think it's totally arbitrary and unrelated that these things got named For noun compounds, this depends greatly on the language, but I think in virtually every language it means only one determiner for the whole thing (in many languages also: only one number, gender, case etc.) @sylvainkahane rightly pointed out that In terms of phrasal verbs being "one word", this is mainly meaning based in English (e.g. "pick" and "pick up" mean rather different things), but in other Germanic languages, the situation is more complicated. In German, phrasal verbs are literally analyzed as single words depending on tense and sentence position. So we have:
As these examples illustrate, it's tempting to treat the first case as the 'unusual' one, where a complex verb is split in context, but really it's a compound verb with a single lemma, meaning "rise". And in fact, the third example suggests that the complex verb is even nominalizable as a single item, retaining the idiosyncratic meaning and behaving like a single stem in a compound (single gender, case, number and definiteness as well). So does this mean we can't call the nominal case
In the first case, we have a denominal verb derived from an actual noun compound. Should we not label "finger" as So while for English a conversion to nmod:compound would be fairly straightforward (since we already distinguish |
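For the English case, the relabeling discussed in this thread could look roughly like the sketch below. It assumes, purely for illustration, that only the plain `compound` label would be renamed to `nmod:compound` and that existing subtypes such as `compound:prt` would be left alone; it does not check the part of speech of the head, so it says nothing about the harder cases raised for other languages. The function name `relabel_compound` and the toy input are hypothetical.

```python
def relabel_compound(conllu_text, new_label="nmod:compound"):
    """Rename plain `compound` in the DEPREL column; subtypes are left untouched."""
    out = []
    for line in conllu_text.split("\n"):
        cols = line.split("\t")
        # Token lines have exactly 10 tab-separated columns; DEPREL is the 8th.
        if len(cols) == 10 and cols[7] == "compound":
            cols[7] = new_label
            line = "\t".join(cols)
        out.append(line)
    return "\n".join(out)


def token(idx, form, upos, head, deprel):
    """Build one CoNLL-U token line with unused columns set to '_'."""
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Toy input (an assumption for illustration, not real treebank data).
toy = "\n".join([
    token(1, "defense", "NOUN", 2, "compound"),
    token(2, "attorney", "NOUN", 0, "root"),
    "",
    token(1, "pick", "VERB", 0, "root"),
    token(2, "up", "ADP", 1, "compound:prt"),
])

converted = relabel_compound(toy)
print([line.split("\t")[7] for line in converted.split("\n") if line])
# ['nmod:compound', 'root', 'root', 'compound:prt']
```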
what about |
The definition of |
Some recent discussions about the use of the `compound` relation in English (#753, #756, #757) have acted as springboards for a wider discussion on the definition and use of this relation in the annotation of UD treebanks, which deserves its own space.

My main point is that, at least as it stands now in the guidelines, `compound` seems poorly or unclearly defined and also not well justified, in that it just appears as a variant of `nmod`/`amod`. What I see:

- […] `compound` or `nmod` based e.g. on the position with respect to the head or presence/absence of morphological traits: […] (`compound`) vs. tail of the dog (`nmod`) vs. the dog's tail (`nmod`?)
- It should represent "an absence of internal structure", but very often constructions using `compound` show a clear hierarchy, a kind of "nesting": `amod(defense,criminal) nmod(attorney,defense) root(attorney)`

So, in the end, I would be inclined to consider most of such uses as simple `*mod`s, as the most transversal relation which can account for the variability, and some degree of arbitrariness, of strategies in different languages. I doubt the real usefulness of a `compound` relation, arguing that its possible peculiar traits are better seen as a correlation with other factors (word order, morphological features...) under the umbrella of "modifiers".

Probably some particular cases remain, bordering on what is traditionally called "apposition", such as Latin Deus pater 'God father', which I don't know if some approach sees as a compound. But here the status of `compound` with respect to `flat` or `appos` needs to be clarified, since these already seem capable of handling such expressions (in Latin, for example, we are using `flat`, as for name + title).

In general, knowing that for Latin we tried to conceive a possible use for `compound` but ended up discarding it, and having discussed the phenomenon at length for English, it would be interesting to gather the experience of other treebanks about the use of this relation, to get a clearer picture! One of the reasons I am interested in this is to understand if `compound` does find its place in the annotation for Latin.