Tough English sentence with "tough" adjective #308

Open
nschneid opened this issue May 28, 2016 · 71 comments

Comments

@nschneid
Contributor

nschneid commented May 28, 2016

In the English syntax notes, the discussion of "tough" adjectives gives an example (112) for "This problem was hard for me to solve".

  • What about "This was a hard problem for me to solve"?
  • In (112), "me" is arguably the subject of "solve" rather than the modifier of "hard" as in (111)—though it may actually be subtly ambiguous. A clearer example is: "It is rare for students to graduate debt-free" (= "It is rare that students graduate debt-free").

[Note by CDM: I think the reference here is to the v1 en specific syntax page; there isn't yet a v2 version.]

@nschneid
Contributor Author

@sebschu

@nschneid
Contributor Author

An even cleaner example that arose in our semantic annotation of prepositions: "I contracted for HDS to deliver the furniture." ("I contracted for HDS." cannot have this meaning unless it is construed as ellipsis.)

@dan-zeman
Member

dan-zeman commented Jun 13, 2016

In (112), I would maybe make nsubj(solve, me) a secondary dependency in the enhanced representation, but on the surface level I would keep nmod(hard, me).

In “This was a hard problem for me to solve”, I would do
nsubj(problem, This)
cop(problem, was)
det(problem, a)
amod(problem, hard)
nmod(hard/problem, me)
case(me, for)
xcomp(hard/problem, solve)

The subtle distinction with the two "hard/problem" heads depends on whether we understand it that it was “hard for me to solve” (and this entire phrase modifies the problem), or that it was a “problem for me to solve”, and BTW it was also hard. I believe that the first reading is more likely but, being a non-native speaker of English, I may be mistaken. And I do not think the semantic difference is too strong, so it may not be so important which of the two options wins.

@nschneid
Contributor Author

> The subtle distinction with the two "hard/problem" heads depends on whether we understand it that it was “hard for me to solve” (and this entire phrase modifies the problem), or that it was a “problem for me to solve”, and BTW it was also hard. I believe that the first reading is more likely but, being a non-native speaker of English, I may be mistaken.

I agree: the more likely reading in my view is that it is a problem that is "hard for me to solve".

> In (112), I would maybe make nsubj(solve, me) a secondary dependency in the enhanced representation, but on the surface level I would keep nmod(hard, me).

What about "I contracted for HDS to deliver the furniture."? nmod(contracted, HDS) would be a bit weird because I wouldn't say "I contracted for HDS"; it is the use with the infinitival that licenses "for". (Otherwise it would probably be "with": see PropBank examples of contract.02.)

@dan-zeman
Member

This sentence is outside of the limited English language model in my head :) so it is definitely for the English team to answer. I can see your point but if the weirdness means that we should attach "for HDS" to the infinitive, then I am not sure whether the borderline will be clear in other cases.

@nschneid
Contributor Author

@ngiordani, any comment? We're struggling over for-infinitivals in our data.

Another possibility that we've considered is to treat the for-PP as an argument of the adjective, and the embedded verb as an xcomp to indicate that it shares an argument:

This was a hard problem for me to solve
amod(problem, hard)
nmod(problem, me)
case(me, for)
xcomp(problem, solve)
mark(solve, to)

But in "I contracted for HDS to deliver the furniture.", it really does seem like "for" just marks the infinitival subject:

I contracted for HDS to deliver the furniture
ccomp(contracted, deliver)
nsubj(deliver, HDS)
case(HDS, for)
mark(deliver, to)

@ngiordani
Contributor

ngiordani commented Jul 14, 2016

Sorry for the delayed response, @nschneid! The guidelines that we used in the EWT for tough-movement are given in this paper, pp. 2-3. By the guidelines, the analysis, updated for UD, would be as below... but keep reading, I suggest modifying this. Also, I'd ask if @tdozat has anything to add, since he's thought a lot about tough-movement.

PREVIOUS ANALYSIS
This was a hard problem for me to solve
nmod(hard, me)
case(me, for)
nsubj(hard, This)
amod(problem, hard)
ccomp(hard, solve)

(The paper explains why "for me" is considered an nmod, not an nsubj, and why ccomp, not xcomp, is used. I guess it's not that nsubj is ruled out, but this is probably an nmod.)

After this paper, we decided not to use ccomp under nonverbal predicates. In general, this decision is a little bit arbitrary and I won't say we are married to it, but in this case I think advcl actually does make more sense, because clearly the clausal modifier is completely optional. So my recommendation would be:

PROBABLY BETTER ANALYSIS
This was a hard problem for me to solve
nmod(hard, me)
case(me, for)
nsubj(hard, This)
amod(problem, hard)
advcl(hard, solve)

In the other case I would argue for mark, and that is what we did in the EWT:

I contracted for HDS to deliver the furniture
ccomp(contracted, deliver)
nsubj(deliver, HDS)
mark(deliver, to)
mark(deliver, for)

I understand that the argument for case would be that "for" licenses the overt subject, and that is certainly a good point.

One argument against is that we don't expect prepositioned subjects in other English constructions, so this would be the only place where we see an nsubj with a case dependent. (There is also the fact that "deliver" doesn't seem to select a prepositioned argument and that this PP seems to have a fixed position, but I admit these would be objections to labeling "HDS" with an nmod more than to labeling "for" with case.)

The other (related and maybe stronger) argument for mark is that the appearance of "for" is tied to the infinitival form of the embedded predicate. That would be a strong argument for it to be understood as a dependent of that predicate.
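
To make the difference between the two candidate analyses of "I contracted for HDS to deliver the furniture" concrete, here is a minimal Python sketch that parses the `rel(head, dependent)` notation used in this thread and diffs the two annotations. The helper name `parse_triples` and the regular expression are illustrative assumptions, not part of any UD tool.

```python
import re

# Hypothetical helper: matches lines such as "nsubj(deliver, HDS)".
TRIPLE = re.compile(r"^(\w[\w:]*)\((.+?),\s*(.+?)\)$")

def parse_triples(text):
    """Parse 'rel(head, dep)' lines into a set of (rel, head, dep) tuples."""
    triples = set()
    for line in text.strip().splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            triples.add(match.groups())
    return triples

# Analysis with "for" as a case marker on the embedded subject.
case_analysis = parse_triples("""
ccomp(contracted, deliver)
nsubj(deliver, HDS)
case(HDS, for)
mark(deliver, to)
""")

# Analysis with "for" as a second marker on the infinitive.
mark_analysis = parse_triples("""
ccomp(contracted, deliver)
nsubj(deliver, HDS)
mark(deliver, to)
mark(deliver, for)
""")

shared = case_analysis & mark_analysis
only_case = case_analysis - mark_analysis
only_mark = mark_analysis - case_analysis

print("shared:", sorted(shared))
print("case analysis only:", sorted(only_case))
print("mark analysis only:", sorted(only_mark))
```

The two annotations agree on everything except the attachment and label of "for", which is the whole of the disagreement discussed above.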

@nschneid
Contributor Author

nschneid commented Jul 14, 2016

@ngiordani, thanks! It looks like your recommendations parallel the cleft examples (108) and (109) of the English syntax notes: (108) uses nmod/case, and (109) uses nsubj/mark. [Updated URL for link: en v1 syntax specific constructions.]

@ngiordani
Contributor

That's true. Although, 112, which is basically the sentence we are talking about here, uses xcomp. I'm not sure how that happened, but it's wrong...

@nschneid
Contributor Author

It does seem to fit pretty well with the definition at xcomp. With the current analysis in (112), the xcomp makes it clear that "solve" should get a semantic argument from "problem" (in fact, it should get two!).

The paper linked by @ngiordani comments that it doesn't fit the classical definition of xcomp in LFG, but I wonder if the practical appeal of making it apparent that an argument should be inferred from a higher predicate should outweigh the historical boundaries of the term.

According to CGEL (p. 542): "For the most part, complements in AdjP structure are optional elements: they qualify as complements by virtue of being licensed by the head rather than being obligatory." One of their examples of a complement is "He's [happy to leave it to you]." So I think advcl would be less appropriate.

@ngiordani
Contributor

I don't agree about xcomp. Open complements are defined to take their external argument from (the lowest argument of) the higher clause, which as you point out yourself, doesn't apply to the relation between "problem" and "solve". This isn't just a historical tie to LFG; it's a generalization that allows us to make inferences to identify arguments of predicates labeled xcomp. If you want to say that an xcomp complement inherits some core argument from the higher clause, then we'd have a much harder time coming up with the rules for deciding which argument comes from where and making inferences. (Let me know if you think there is a simple way of designing such rules.) I don't think that's a good move, because as far as I can see, we would lose the robustness of the generalization.

With respect to advcl, well basically the idea is that advcl is the clausal equivalent of nmod. So, being an advcl doesn't necessarily mean the dependent is not an argument, it means it's not a core argument; the premise is that adjectives don't take core arguments (which I guess is pretty similar to saying they don't assign Case, in the language of GB, for example). So, even without committing to a decision about argument/adjunct, what you brought up is not incompatible with advcl. But this is a bit murky at the moment because there is an ongoing discussion about the whole idea of core arguments and how to define them in a way that makes sense crosslinguistically.
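
The inference argument can be illustrated with a small Python sketch. It assumes a deliberately simplified control rule (the controller of an xcomp is the obj of the governing predicate if there is one, otherwise its nsubj); the function name `infer_xcomp_subjects` and the "persuaded" example are hypothetical, and this is not the actual enhanced-dependencies converter.

```python
def infer_xcomp_subjects(triples):
    """For each xcomp(head, dep), propose nsubj(dep, controller).

    Simplifying assumption: the controller is the obj of the head if one
    exists, otherwise the nsubj of the head.
    """
    inferred = []
    for rel, head, dep in triples:
        if rel != "xcomp":
            continue
        objs = [d for r, h, d in triples if h == head and r == "obj"]
        subjs = [d for r, h, d in triples if h == head and r == "nsubj"]
        candidates = objs or subjs
        if candidates:
            inferred.append(("nsubj", dep, candidates[0]))
    return inferred

# Ordinary object control: "I persuaded her to leave" (illustrative example).
control = [
    ("nsubj", "persuaded", "I"),
    ("obj", "persuaded", "her"),
    ("xcomp", "persuaded", "leave"),
]

# Tough construction annotated with xcomp, as in example (112).
tough = [
    ("nsubj", "hard", "This"),
    ("nmod", "hard", "me"),
    ("xcomp", "hard", "solve"),
]

print(infer_xcomp_subjects(control))
print(infer_xcomp_subjects(tough))
```

Under this rule the control sentence gets the expected subject for "leave", while the tough sentence gets "This" as the subject of "solve", even though "This" is the understood object and "me" the understood subject. A rule set that handled both would need to look at more than the xcomp label.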

@nschneid
Contributor Author

> If you want to say that an xcomp complement inherits some core argument from the higher clause, then we'd have a much harder time coming up with the rules for deciding which argument comes from where and making inferences. (Let me know if you think there is a simple way of designing such rules.)

In any case there would need to be a special rule to infer the semantic arguments for these for-infinitivals, right? I was thinking that calling it xcomp would make it clearer that something needs to be inferred, and the rules would have to distinguish the different kinds of xcomps based on the syntactic environment. But I haven't tried to write such rules before, so perhaps there's too much ambiguity to do this deterministically from the Basic dependencies anyway.

@dan-zeman
Member

Tentatively closing. Feel free to reopen if unresolved issues remain in UD v2.

@nschneid
Contributor Author

I would expect to see "tough" constructions and for-infinitivals documented at complex clauses, but I don't, so reopening.

@nschneid nschneid reopened this Apr 29, 2018
@dan-zeman
Member

@nschneid could you please propose the documentation, perhaps as a summary from the discussion in this thread?

@mcdm
Contributor

mcdm commented May 1, 2018 via email

@nschneid
Contributor Author

nschneid commented May 2, 2018

> We had some suggestions for analyzing “tough” adjective constructions in our Depling paper 2013 (http://ufal.mff.cuni.cz/depling13/proceedings/pdf/W13-3721.pdf). Of course these can be revisited!

Yes, @ngiordani made reference to the paper in this post, saying that there was a later decision not to use ccomp with nonverbal heads.

Is that decision still valid? I see a lot of adjective-headed ccomps in the data: "It is clear that...", "I am sure that", "I was able to", etc., quite apart from tough adjectives. Those certainly look to me like complement clauses, and in fact the new ccomp guidelines mention that the head should be a verb or adjective.

So should we go ahead with the 2013 paper analysis, namely:

This was a hard problem for me to solve

nmod(hard, me)
case(me, for)
nsubj(hard, This)
amod(problem, hard)
ccomp(hard, solve)

I contracted for HDS to deliver the furniture

ccomp(contracted, deliver)
nsubj(deliver, HDS)
mark(deliver, to)
mark(deliver, for)

?

@jnivre
Contributor

jnivre commented May 2, 2018

The status of adjectives as predicates (and the implications for labelling their dependents) has never been fully resolved in UD. My own feeling is that, as long as we use "amod" rather than "acl", we treat the adjective as a modifier word, not as a predicate, and should therefore not apply the core-oblique distinction to its dependents (which implies using "advcl" rather than "ccomp"). However, if we do use "ccomp", then it seems that we should also use "obl" (rather than "nmod") for "(for) me".

@nschneid
Contributor Author

nschneid commented May 2, 2018

I'm no syntactician, but there are clear parallels with verb complement clauses:

  • Verb head:
    • They were informed of the problem.
    • They were informed that it was a problem.
  • Adj head:
    • They were aware of the problem.
    • They were aware that it was a problem.

From an annotator's perspective, "that" + non-relative clause is a nice recognizable pattern; it would be convenient if it always signaled ccomp.

Can't we just say that predicative adjectives are predicates, whereas attributive adjectives (amod) are not?

> if we do use "ccomp", then it seems that we should also use "obl" (rather than "nmod") for "(for) me".

Good point. The guidelines call for nmod only for modifiers of a noun/NP. In EWT the only ADJ-headed nmods are "many of" and "most of".
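
A check like the one behind the "ADJ-headed nmods" observation can be sketched in a few lines of Python over CoNLL-U text. The function name `adj_headed_nmods` and the four-token sample sentence are hand-made assumptions for illustration; the sample is not taken from EWT, and only the plain `nmod` label (no subtypes) is matched.

```python
def adj_headed_nmods(conllu_text):
    """Return (head form, dependent form) pairs where an nmod attaches to an ADJ."""
    results = []
    for sentence in conllu_text.strip().split("\n\n"):
        rows = [
            line.split("\t")
            for line in sentence.splitlines()
            if line and not line.startswith("#")
        ]
        # Keep plain integer IDs only (skip multiword ranges and empty nodes).
        tokens = {row[0]: row for row in rows if row[0].isdigit()}
        for row in tokens.values():
            head = tokens.get(row[6])  # column 7 (index 6) is HEAD
            if row[7] == "nmod" and head is not None and head[3] == "ADJ":
                results.append((head[1], row[1]))
    return results

# Hand-made sample in CoNLL-U column order:
# ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
sample_rows = [
    ["1", "Most", "most", "ADJ", "_", "_", "4", "nsubj", "_", "_"],
    ["2", "of", "of", "ADP", "_", "_", "3", "case", "_", "_"],
    ["3", "them", "they", "PRON", "_", "_", "1", "nmod", "_", "_"],
    ["4", "left", "leave", "VERB", "_", "_", "0", "root", "_", "_"],
]
sample = "# text = Most of them left\n" + "\n".join("\t".join(r) for r in sample_rows)

print(adj_headed_nmods(sample))
```

Run over a full treebank file, the same function would list every adjective that currently heads an nmod, which is the set that would need relabeling if obl were adopted for dependents of predicative adjectives.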

@dan-zeman
Member

> if we do use "ccomp", then it seems that we should also use "obl" (rather than "nmod") for "(for) me".
>
> Good point. The guidelines call for nmod only for modifiers of a noun/NP. In EWT the only ADJ-headed nmods are "many of" and "most of".

It could be read as an NP whose head noun has been elided, and the adjective many/most has been promoted to the head position.

@jnivre
Contributor

jnivre commented May 2, 2018

You will never get ccomp for all that-clauses as long as we use acl in constructions like "the fact that ...". Note that UD doesn't have a complement-modifier distinction, it has a core-oblique distinction, and (at least according to the current guidelines), this only applies at the clause level. So the key issue here is whether the adjective is a clausal predicate or not, which it clearly is in "they were aware that ...". The fact that something switches from being ccomp to being advcl or acl when a construction is nominalised is a regular and expected feature of the annotation. Compare, for example:

She realised that ... ccomp
Her realisation that ... acl

@amir-zeldes
Contributor

I'm actually all for "realization ... that X" being ccomp. Treating it as acl removes the distinction between:

The realization that I was made aware of (true acl)
The realization that ice cream is delicious (a nominalized ccomp)

I understand the core/oblique distinction idea, but I'm not sure why we would want to not make the above distinction, which to me looks syntactically very clear (in one 'that' is substitutable by 'which', in the other not). This seems like a loss of information.

@jnivre
Contributor

jnivre commented May 2, 2018

That distinction is already made by having the first be acl:relcl. Using ccomp for dependents of nouns would require us to systematically distinguish complements from adjuncts in this position, which is even trickier than with verbs. I doubt that the benefits would outweigh the costs.

@nschneid
Contributor Author

nschneid commented May 20, 2018

@adam-przepiorkowski, the coordination examples are interesting. To clarify: with

  • He told me an idea and that he thought it was viable. — NP+clause coordination
  • He told me about an idea and that he thought it was viable. — PP+clause coordination (*He told me about that he thought it was viable.) [I'm avoiding "ask" because both "ask whether" and "ask about whether" are valid, so the constituent structure is not obvious.]

Are you arguing that the first that-clause should be considered core because it's coordinated with an NP (direct object), whereas the second that-clause should be non-core because it's coordinated with a PP (oblique)? What is the basis for assuming that a core dependent can only be coordinated with another core dependent?

(Granted, the way UD handles coordination means that we aren't marking coreness on the second conjunct anyway....)

@nschneid
Contributor Author

nschneid commented May 20, 2018

Oh, I think I see—you're assuming that the coordinated material forms a phrase which has to be either core or non-core with respect to the verb? Because of the way UD handles coordination, the answer might be that the coreness of the phrase is determined by the first conjunct. Unless you're arguing that unlike phrase coordination is evidence that UD's approach to coordination should be overhauled. :) These cases seem different from canonical coordination anyway—for example, I don't know if the clause can come first (??He told me (that) he liked my outfit and about a friend with the same shirt).

@adam-przepiorkowski
Contributor

Yes, we are assuming that we want to avoid the situation where one conjunct is classified as core and the other as non-core, as then it is not clear how to treat the whole coordinated dependent. As to ‘inheriting’ from the first conjunct, Sag et al. 1985 (the classical “Coordination and How to Distinguish Categories”) give the following minimal pair:

• I didn't remember until it was too late John's inability to get along with Pat, and that he had no background in logic.
• I didn't remember until it was too late that John had no background in logic, and his inability to get along with Pat.

(In fact, they try to explain the latter away, as their theory does not predict it, but the fact remains that both are acceptable.)

It is true that sometimes only one word order is possible (when such a coordinated NP + CP phrase is a dependent of a preposition; relevant examples are again given in Sag et al. 1985), but in the usual case I think both orders are often fine when the NP is sufficiently heavy to follow the CP.

Of course we can call such examples ‘non-canonical’ and sweep them under the rug, but – at least in some languages – they are quite common. For example, in the largest Polish valency dictionary, Walenty, over 12% of valency schemata contain a position which has morphosyntactically different realisations which may be coordinated.

@amir-zeldes
Contributor

amir-zeldes commented May 20, 2018

Hi all, thanks for the discussion! Just two thoughts from me:

  1. about @jnivre 's example with 'litar på / litar på att': I think this is what we sign up for the moment we give up prepositional heads, and more specifically labels like Stanford's pcomp. The reason for this is not that the verb has two complementation structures, rather we can interpret it to say it takes the same complement in both cases: 'på'. Then it's the preposition that can take an NP or CP. I'd also like to point out that languages like German, where we have a correlate structure, are already following the ccomp analysis as follows:
ich zähle darauf, dass du kommst
I count thereupon, that you come
ccomp(zähle,kommst)
advmod(zähle,darauf)

We can argue about whether we think 'darauf' is an advmod, but in any case they are already in line with @manning 's suggestion.

  2. I'm not sure what @adam-przepiorkowski 's example is showing: is it a problem that unlike coordination can occur? If I understood @nschneid correctly then I agree with him: it's the same as any case of coordination, where the left-most conjunct gets 'first dibs' in determining the relation. How is this different from the following:
Her coming and John's behavior annoy me (csubj + nsubj -> csubj(annoy,coming))
I want to stay home or at least in the city (advmod + obl -> advmod(stay, home))

If the point is that one conjunct is core and one non-core, then even prepositional dative coordination is problematic (I gave John a book and chocolates to Mary). I'm also not sure how this would mix with the idea that prepositionally marked clauses are necessarily oblique, what do we do when they're subjects?

For Kim to go is dangerous 
csubj?(dangerous,go)
advcl?(dangerous,go)

And of course these can be coordinated too:

Declaring war is dangerous and for Kim to go instead too.

@amir-zeldes
Contributor

I'm just seeing @adam-przepiorkowski 's last response: yes, exactly, so why enforce that both conjuncts must have the same coreness? I think that doesn't work out empirically, at least in corner cases.

I think the real way to solve it, if we have the resources, is using enhanced non-trees which mark the second (or subsequent) conjuncts separately when needed.

@LarsAhrenberg
Contributor

I'd like to disagree with @amir-zeldes in his suggestion that taking the preposition as head in the constructions 'litar på / litar på att' would help. Although there are many problems with the UD analysis of prepositions as consistent case markers, in these cases the verb is critical for the selection of complements. There are other verbs that select prepositions that don't take clausal complements, such as 'hälsar (på)':

Hon hälsar på alla hon möter / She greets (on) everyone she sees
and others that accept a clausal complement but not an infinitival one (lita på; tvivla på / doubt) and vice versa. In all cases the preposition lacks independent meaning.
At the same time, I agree that a relation such as 'ccomp:obl' or 'cobl' could help make the parallelism between NP objects and Clausal complements clearer. In particular 'obj' and 'ccomp' (unmodified) would be clear correspondents.

@adam-przepiorkowski
Contributor

@nschneid and @amir-zeldes: Yes, an important assumption of the argument I gave is that coreness is a property of a whole dependent, so marking a part of it as core and another part as non-core is incoherent. I can see now that this assumption is not necessarily universally accepted. But if it is not, it is not clear (to me) that coreness means anything substantial in UD.

The difference between ‘unlike coreness’ and ‘unlike category’ is clear. These days linguists do not assume that the category of the whole coordinate structure must be the same as the category of each conjunct. In fact, some believe that the category of the whole coordinate structure is the same as that of the first conjunct (e.g., Peterson 2004 in NLLT – within LFG, Zhang 2009 in her CUP book – within Minimalism), so UD may be on the right track here. But I think that – with the exception of the typologically and lexically very limited construction known as lexico-semantic or hybrid coordination, which receives rather different analyses than the usual coordination – nobody believes that two conjuncts bear different grammatical functions with respect to the governing head. For example, nobody – I think, although I am less sure now – would want to say that in “Pat remembered the appointment and that it was important to be on time” the conjunct “the appointment” is the direct object but the conjunct “that it was…” bears a different grammatical function. And nobody – I think – would want to say that the grammatical function of this coordinated dependent varies with the order of the conjuncts. But this is exactly what UD is saying at the moment in the case of “He asked her for a kiss and to go on a date with him” (etc.): here one conjunct (“for a kiss”) is oblique and the other (“to go on a date with him”) is core, so they are analysed as having different grammatical functions. Isn't that a conceptual problem?

@sylvainkahane
Contributor

@adam-przepiorkowski: UD is an annotation scheme, not a linguistic model. And UD relations are not grammatical functions. I hope that nobody believes that nsubj and csubj correspond to two different functions. In French, an obj, an xcomp and a ccomp can very often fill the same grammatical function:

j'aime le chocolat 'I like chocolate'
j'aime lire 'I like to read'
j'aime que tu lises 'I like you to read', lit. I like that you read

Maybe it would have been better to develop an annotation scheme based on grammatical functions, but that is definitely not the case for UD.

@jnivre
Contributor

jnivre commented May 21, 2018

I think it is too strong to say that UD is not based on grammatical functions. It is rather based on grammatical functions cross-classified with structural properties of the head or dependent. This is what is meant by a "mixed structural-functional system" in the guidelines (http://universaldependencies.org/u/overview/syntax.html).

@adam-przepiorkowski
Contributor

@sylvainkahane: Sure, but there is a connection between grammatical functions (in the linguistic sense) and UD's core/non-core distinction, isn't there? Doesn't the sameness of grammatical functions imply the sameness of the core/non-core status? If so, the reasoning is:
• conjuncts have the same grammatical function with respect to the governing head (common linguistic assumption)
• hence, they should all be core or all non-core.
If this reasoning does not go through, then I am not sure what the intensional content of the core/non-core distinction in UD is (as opposed to the extensional set of guidelines: “if it is a bare NP, it is core”, “If it is a PP, it is non-core”, etc.).

@amir-zeldes
Contributor

amir-zeldes commented May 21, 2018

@LarsAhrenberg : you're right, certainly not all prepositions can take NP and CP complements freely, and this can be constrained by the verb. I'm also not advocating a return to pcomp, I'm just pointing out that in a pcomp analysis, the problem doesn't arise, since the verb governs the preposition as prep either way. I think ccomp:obl is an elegant solution, since it delineates the categories clearly, doesn't introduce a new main label for people who might not want to make the distinction or are growing weary of major changes, and is easy to automatically induce from plain ccomp (just check if there's an ADP child with mark).
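
The "ADP child with mark" check can be sketched in Python as follows. The `Token` class, the function name `induce_ccomp_obl`, and the annotation of the Swedish sentence "Hon litar på att han kommer" (with "på" attached as an ADP `mark` on the embedded verb) are assumptions made for illustration; `ccomp:obl` is a proposed subtype from this thread, not an established UD label.

```python
from dataclasses import dataclass

@dataclass
class Token:
    idx: int
    form: str
    upos: str
    head: int
    deprel: str

def induce_ccomp_obl(tokens):
    """Relabel ccomp as ccomp:obl when the clause has an ADP child labeled mark."""
    relabeled = []
    for tok in tokens:
        deprel = tok.deprel
        if deprel == "ccomp":
            has_adp_mark = any(
                child.head == tok.idx
                and child.deprel == "mark"
                and child.upos == "ADP"
                for child in tokens
            )
            if has_adp_mark:
                deprel = "ccomp:obl"
        relabeled.append(Token(tok.idx, tok.form, tok.upos, tok.head, deprel))
    return relabeled

# Assumed annotation of "Hon litar på att han kommer".
swedish = [
    Token(1, "Hon", "PRON", 2, "nsubj"),
    Token(2, "litar", "VERB", 0, "root"),
    Token(3, "på", "ADP", 6, "mark"),
    Token(4, "att", "SCONJ", 6, "mark"),
    Token(5, "han", "PRON", 6, "nsubj"),
    Token(6, "kommer", "VERB", 2, "ccomp"),
]

# A plain that-clause, which should stay ccomp: "I know that cats like boxes".
english = [
    Token(1, "I", "PRON", 2, "nsubj"),
    Token(2, "know", "VERB", 0, "root"),
    Token(3, "that", "SCONJ", 5, "mark"),
    Token(4, "cats", "NOUN", 5, "nsubj"),
    Token(5, "like", "VERB", 2, "ccomp"),
    Token(6, "boxes", "NOUN", 5, "obj"),
]

print([t.deprel for t in induce_ccomp_obl(swedish)])
print([t.deprel for t in induce_ccomp_obl(english)])
```

Because the subtype is derived from information already in the tree, treebanks that do not want the distinction could keep plain ccomp and lose nothing.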

@adam-przepiorkowski : I agree that like-function coordination is overwhelmingly more common, but in real data, unlike-coordination absolutely does occur, so it's not a hard constraint. The most prominent example is sylleptic zeugma, but those often sound constructed:

  • their cakes make money and us fat (obj+xcomp)
  • he made his apologies and for the door (obj+obl)

Real corpus examples are often more subtle, and I think this is perfectly natural, e.g. from GUM:

  • He washed his neck and behind his ears (obj+obl: annotated as obj->conj->case in GUM)

I see how in the latter example one might want to say 'behind his ears' is somehow converted to an NP (=[place] behind his ears), but I think if it weren't for the first conjunct, most people would annotate this as obl. Maybe the 'first conjunct is more important'-tendency is reflected in the preference to order these <core,non-core>?

@adam-przepiorkowski
Contributor

@amir-zeldes: I don't know about sylleptic zeugma :-), but in the case of “He washed his neck and behind his ears”, both conjuncts are best analysed as direct objects, I think. Such PP direct objects and subjects are extensively discussed in Ewa Jaworska's 1986 Journal of Linguistics paper, where she gives examples such as the following (and many more, from English and Polish, mainly locative and temporal), convincingly arguing for their direct object / subject status:

• They considered after the holidays to be too late for a family gathering.
• After the holidays was considered to be too late for a family gathering.

@amir-zeldes
Contributor

I fully agree that 'behind his ears' can fulfill the same thematic role as a theme phrase, but I think most UD annotators would probably label "he washed behind the ears" as obl + case, not obj + case, if only for consistency in not having obj + case in English... Because basic UD is just a dependency graph, these phrasal 'double duty' considerations have to come up short on one end or the other. In constituents you could do something like (NP (PP behind the ears)), but in pure dependencies we don't have that option.

Either way, I think the restriction on not having core + non-core coordination is an interesting and probably very strong tendency in language. But I don't think it can be a hard and fast annotation guideline, since natural language does contain intentional semantic zeugmas that defy this rule, and also the more mundane examples of the 'washing behind the ears' type.

@manning
Contributor

manning commented May 27, 2018

I agree that the Swedish example cited by @jnivre 6 days before this comment is quite compelling for needing clausal obliques. (This made me curious about what you do with these at the moment in the Swedish treebank – if I successfully managed with the Turku search tool and my non-existent Swedish, it seems that you label them as advcl. Right?) So maybe it would indeed be better to rename advcl to cobl after all, notwithstanding my earlier objection to that? (Though I am still somewhat sympathetic to @nschneid's initial comments that largely support my earlier position…. Thanks!) The other mentioned alternative of using ccomp:obl is possible in UD v2 and doesn't require giving up the familiar advcl (both good things!), but on the other hand, it is conceptually rather problematic, since we would then have a 3 way distinction for clausal dependents of clauses (core argument, oblique argument, adjunct), which we most deliberately avoided having for nominal dependents, so that must surely not be the perfect solution.

@jnivre
Contributor

jnivre commented May 27, 2018

Yes, we analyse them as advcl, because we essentially interpret advcl as cobl. I have a hard time making up my mind about this, because I do think advcl is a familiar and intuitive name for most people. On the other hand, if we really intend it to mean cobl, one could argue that the familiarity is misleading.

@jnivre
Contributor

jnivre commented May 27, 2018

I completely agree that we don't want a three-way distinction core-oblique-adjunct for clauses as long as we don't have it for nominals.

@nschneid
Contributor Author

This thread has gone on for long enough that I'm not sure where things stand anymore, but here's my current thinking:

I'm very happy with the core/oblique distinction as implemented for nominal dependents of verbs in English, because there's an easy rule: if it has a preposition, it's oblique; otherwise it's core. Rather than ask the annotators to think about the valency of the verb (which of the current arguments are optional, which arguments could be added or substituted), the test is based solely on the marking (preposition or not).
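
The marking-based test for English nominal dependents is simple enough to state as code. This is a minimal sketch using the `(rel, head, dependent)` triples from earlier in the thread; the function name `coreness` and the example sentence "I gave a book to Mary" are illustrative assumptions.

```python
def coreness(dependent, triples):
    """Classify a nominal dependent of a verb by marking alone.

    The rule as stated for English: a preposition (a case child) means
    oblique, and no preposition means core.
    """
    has_preposition = any(
        rel == "case" and head == dependent for rel, head, _ in triples
    )
    return "oblique" if has_preposition else "core"

# Illustrative analysis of "I gave a book to Mary".
triples = [
    ("nsubj", "gave", "I"),
    ("obj", "gave", "book"),
    ("det", "book", "a"),
    ("obl", "gave", "Mary"),
    ("case", "Mary", "to"),
]

for word in ("I", "book", "Mary"):
    print(word, coreness(word, triples))
```

The function never consults the verb's valency, which is the point of the rule: the annotator (or the script) only has to look at the dependent's own marking.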

Can we define core/non-core for clausal dependents similarly? The test for core clausal dependents could be, does the clause exhibit a marking strategy (specific to the language) that primarily/canonically alternates with core nominal dependents of verbs? For English, I think non-relative that-clauses are such a marking strategy—they canonically alternate with objects (I know [many facts about cats]/[that cats like boxes]) or subjects ([My cat]/[That cats like boxes] drives me crazy.) I think on the basis of this overt marking, it would be reasonable to extend this notion of coreness to that-clauses headed by adjectives, and perhaps even nouns. For Swedish, which can have a preposition plus att, perhaps the language-internal criteria would be different.

For to-infinitivals I think it would be helpful from a downstream usability perspective to distinguish I want to eat (complement) vs. I arrived to eat (purpose adjunct). But I concede that this is more of an argument/adjunct distinction, and wouldn't object to declaring all to-infinitivals core and using subtypes (as others have proposed) to distinguish argument/adjunct. Thus: I want to eat would be xcomp:arg and I arrived to eat (currently advcl) would become xcomp:adjunct.

Subordinating conjunctions seem to be the main way of marking non-core clausal dependents in English, so those would be advcl, cobl, or whatever label we want to use.

@nschneid
Contributor Author

Logistical aside: Assuming that lots more discussion of these issues is needed, will enough people be attending NAACL or COLING that we can hold such a discussion in person? (I'll be at both.)

@jnivre
Contributor

jnivre commented May 29, 2018

Just a quick note to say that @dan-zeman is visiting us in Uppsala this week and we are hoping to put together some kind of proposal about this. I think it will be roughly consistent with what you suggest (although it is important to also think about how it generalises to other languages).

Unfortunately, I won't be at either NAACL or COLING this year, only ACL and EMNLP.

@amir-zeldes
Contributor

amir-zeldes commented May 29, 2018

Yes, I also think either an in person meeting, or if need be maybe video chat could be useful here.

A few quick replies to the above:

  • @nschneid - I think the Swedish att problem applies to English cases like "we agreed on us ordering pizza" too, which I think is clearly an argument

  • Giving up xcomp : advcl - we currently maintain this distinction in the Stanford Dependencies of GUM so it's easy for us to keep it in UD too. I agree that it is absolutely an argument/adjunct distinction, but I think it's syntactically testable - we just have annotators try to insert 'in order to':

    • I arrived in order to eat
    • * I want in order to eat

I know we've been assuming that we don't want the argument/adjunct distinction on account of it being murky for obl in some cases, but it's starting to feel to me like we want to throw out the baby with the bathwater here. So I guess I'm thinking maybe we should go with optional :arg and :adjunct or something similar for all of these cases?
