Validation of reduced form of "had better/best" #803

Closed
manning opened this issue Aug 2, 2021 · 25 comments

@manning
Contributor

manning commented Aug 2, 2021

English has a periphrastic semi-auxiliary "had better/best" meaning "should": You had better leave now.

We currently analyze that as:
leave --aux--> had/AUX
had --advmod--> better/ADV
which seems to me appropriate. (You could suggest instead fixed or something rather than advmod, but there seems no particular reason to regard 'better' as anything other than an ADV, which is all that is relevant below.)

However, this form is often reduced in conversational English: You better leave now.
By the normal UD rules of ellipsis, 'better' is promoted and what you get is:
leave --aux--> better/ADV

But this isn't allowed by the tools validator: rel-upos-aux insists that an aux must be of PoS AUX. Could/should this test be relaxed to allow ADV? I think that's my preferred analysis.
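
A minimal sketch of the kind of check that rejects this tree. It is a simplified stand-in, not the actual validator code; the function names and the reduced column handling are assumptions made for illustration, and LEMMA/FEATS are left unspecified on purpose.

```python
# Simplified illustration only; this is NOT the code of the UD validator.
# CoNLL-U for "You better leave now" with 'better' promoted to aux.
CONLLU = """\
1\tYou\t_\tPRON\t_\t_\t3\tnsubj\t_\t_
2\tbetter\t_\tADV\t_\t_\t3\taux\t_\t_
3\tleave\t_\tVERB\t_\t_\t0\troot\t_\t_
4\tnow\t_\tADV\t_\t_\t3\tadvmod\t_\t_
"""


def parse_conllu(text):
    """Return a list of token dicts from a single-sentence CoNLL-U string."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        tokens.append({
            "id": cols[0],
            "form": cols[1],
            "upos": cols[3],
            "head": cols[6],
            "deprel": cols[7],
        })
    return tokens


def check_aux_upos(tokens):
    """Report tokens attached as aux (or a subtype) whose UPOS is not AUX."""
    errors = []
    for tok in tokens:
        base = tok["deprel"].split(":")[0]
        if base == "aux" and tok["upos"] != "AUX":
            errors.append(
                f"token {tok['id']} ({tok['form']}): aux with UPOS {tok['upos']}"
            )
    return errors


print(check_aux_upos(parse_conllu(CONLLU)))
# → ['token 2 (better): aux with UPOS ADV']
```

Relaxing the test for English would amount to accepting ADV alongside AUX in the comparison above, at least for this one lexical item.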

The other apparent alternative would be to make the PoS of 'better' and 'best' AUX here, and then to add 'better' and 'best' as recognized AUX for English. This is apparently what Sag et al. suggest in their paper Lessons from the English Auxiliary System, but it makes 'better' a very defective auxiliary, and, to my mind, all the things it cannot do, such as inversion in questions and tag question formation, are much better explained by the analysis that assumes elision of 'had'.

CC @amir-zeldes @nschneid

@nschneid
Contributor

nschneid commented Aug 2, 2021

Apparently "have to" meaning 'must' is analyzed compositionally as a VERB + infinitive marker, though its function resembles a modal auxiliary. Is there an argument to be made that in "had better", "had" is a full VERB and "better" is an advmod of the complement?

Some observations:

  1. Morphology: "had" is fixed in form; it cannot be "have better" or "has better". So it resembles an AUX in this way.
  2. Tense: "(had) better" is finite similarly to "should". The following verb is bare:
    • You (had) better be careful!
  3. Even with "had", it is more frozen than other auxiliaries in its syntactic behavior:
    • *Had better we arrive early?
    • *Had we better arrive early?
    • ?We had better arrive early, hadn't we?
  4. Combinability with other auxes: for those of us who don't have multiple modals, it seems to fill the modal slot.
    • *We must had/have better arrive early.
  5. Subjectless imperative: if "had" is omitted "better" can be subjectless, unlike other modals:
    • Better arrive early! (Do arrive early!)
    • *Had better arrive early!
    • *Must arrive early!
  6. Negation: "not" goes strictly after "better".
    • You had better not arrive late! (You must not arrive late!)
    • *You had not better arrive late!

N.B. There is also "(had) best", with the same meaning as far as I know.

dan-zeman added this to the v2.9 milestone Aug 3, 2021
@dan-zeman
Member

Interesting. I never thought of an auxiliary when I saw better in sentences like You better leave. I would have attached it as an advmod to the main verb. Can't we simply do that? It would seem easier to grasp for non-linguists, and it would not cause any validation issues.

I was not aware of the literature that suggests that better or had better is a semi-auxiliary in English. But there is still the option to attach the two words as siblings to the main verb (we do this with multiple auxiliaries), and then I'd prefer to leave better as advmod rather than making it aux.

FWIW, Czech has a similar construction with raději “better”, and it is an ADV/advmod, whether it occurs together with an auxiliary or not (but the situation is slightly different because modals are not treated as auxiliaries in Czech). Měl raději přijít. “He had better come. / He should have come.” Měl by raději přijít. “He had/should better come.” Raději přijď. “Better come. / You better come.” Raději přijde. “He would rather come.”

@amir-zeldes
Contributor

amir-zeldes commented Aug 3, 2021

I agree with @dan-zeman - I think the construction has a modal meaning, but the word "better" is still doing its regular job of saying that something would be better. I'm fine with regarding it as ellipsis of the auxiliary, but that should not cause promotion under UD, since the auxiliary is a leaf. If we just went by modal semantics, we might consider things like "better to have loved and lost" to begin with AUX, but I don't think that's right.

If we really want to indicate the missing auxiliary, I guess this goes back to @nschneid 's question in #801 and we could reconstruct it somehow. But I think non-marking of modal semantics using the AUX tag is probably widespread. In German you can get examples like

  • Ginge er wieder hinaus, so täte er es so, dass jeder wusste, wer er war - Lit. "go.subjunctive he out again, so do.subjunctive he it so, that everyone knew who he was", i.e. "if he were to go out, he would do it in such a way that.." (Vengeance: Das Ende vom Anfang/Ankaria Kison)

Here both modal verbs are lexical heads and should not be AUX, despite being hypothetical, and I think the "better" case after ellipsis is no different.

@manning
Contributor Author

manning commented Aug 16, 2021

Hi everyone, thanks for the thoughts, sorry that I got distracted with other things for a while.

I think it is unquestionable that "had better" is a semi-auxiliary in English and this "better" is not behaving like a regular adverb. You'd have "He has written it better", not *"He has better written it". And I agree that our current treatment of English semi-auxiliaries is overall a bit iffy and varied.

Nevertheless, I can certainly see the argument that @dan-zeman and @amir-zeldes make that we could just regard the two words as modifiers of the verb – that's normally what we do when there are multiple auxiliaries – and then it probably makes more sense to keep better as an advmod despite its modal meaning and its use as a semi-auxiliary here. So, I'll go with that for the moment, and make it an advmod and problem solved.
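
A small sketch of the sibling analysis described above, printed in the arrow notation used earlier in this thread. The tuple layout and the `arrows` helper are assumptions for illustration only.

```python
# Each tuple is (id, form, upos, head, deprel); head 0 marks the root.
# Both "had" and "better" attach to the main verb as siblings.
FULL = [
    (1, "You", "PRON", 4, "nsubj"),
    (2, "had", "AUX", 4, "aux"),
    (3, "better", "ADV", 4, "advmod"),
    (4, "leave", "VERB", 0, "root"),
    (5, "now", "ADV", 4, "advmod"),
]

# Reduced form: nothing is promoted, "better" simply stays advmod.
REDUCED = [
    (1, "You", "PRON", 3, "nsubj"),
    (2, "better", "ADV", 3, "advmod"),
    (3, "leave", "VERB", 0, "root"),
    (4, "now", "ADV", 3, "advmod"),
]


def arrows(sentence):
    """Render each non-root dependency as 'head --deprel--> dependent/UPOS'."""
    forms = {tid: form for tid, form, _, _, _ in sentence}
    out = []
    for _, form, upos, head, deprel in sentence:
        if head == 0:
            continue
        out.append(f"{forms[head]} --{deprel}--> {form}/{upos}")
    return out


for sentence in (FULL, REDUCED):
    print("\n".join(arrows(sentence)))
    print()
```

With this attachment the reduced sentence contains no aux relation at all, so the rel-upos-aux test is never triggered.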

@jheinecke
Contributor

Just a question from a non-native speaker. I'm well aware of the colloquial way of saying "you better leave" instead of "you had better leave". What would the colloquial form be for the third person: "He better leaves [before they find out]" or "he better leave ..."?
If it is "He better leaves", how would this be annotated?

@nschneid
Contributor

nschneid commented Aug 20, 2021

Always the bare form of the verb, though for me this construction feels most natural with "you" and less so with the 3rd person singular. (It is more prominent in certain dialects.)

If you want to see examples in the wild: Google n-grams for "(pronoun) better not", which should be unambiguous.

@nschneid
Contributor

nschneid commented Jan 2, 2022

I've been thinking more about this after reading A Student's Introduction to English Grammar which is a textbook giving the highlights of CGEL. They give a crisp definition of English auxiliaries: verbs that 1) are targets of subject-auxiliary inversion, without do-support, and 2) take negation without do-support (especially with n't forms).

By these criteria, "had better" seems like AUX+ADV:

  • Simple: We had better leave.
  • Subj-aux inversion: Had we better leave? (sounds a bit odd/formal to my ears, but not terrible)
  • Negated subj-aux inversion: Hadn't we better leave? (also a bit formal)
  • Negation: We had/We'd better not leave. Cf. We had definitely not left at midnight.

Granted, "had" here is defective—it cannot be "has better" etc. And the had+better combination is idiomatic in meaning.

But I think omitting "had" shows that "better" is partially grammaticalized as an auxiliary in its own right, but defective in disallowing a question form:

  • Simple: You better watch out.
  • Negation: You better not cry.
  • Subj-aux inversion: *Better you watch out? *Do you better watch out?
  • Negated subj-aux inversion: *Better not you watch out? *Don't you better watch out?

Perhaps the phonologically reduced "'d" form of "had" stopped being salient and people started dropping it in the not-question forms, which would explain why plain "better" cannot (yet) form questions.

However, I don't think "better" is really a modal aux (yet):

  • *We can and better leave.
  • ?We can and had better leave.
  • We can and should leave.

Like the perfect, this construction cannot combine with a modal:

  • *We had might eat.
  • *We had better might eat.

But unlike the perfect have, and like a modal, "(had) better" takes a bare infinitival complement (without to). If we think this morphosyntactic property is important to capture in the dependency structure, it could be an argument for fixed(had, better), but that would obscure the fact that they separate in questions. I guess we have a special sense of "had" that requires "better" plus a bare infinitival complement.

Personally I don't love the elided "had" analysis, because I think if that were the case it would license a tag question:

  • You had better watch out, hadn't you?
  • ??You better watch out, hadn't you?

Also, as I noted earlier in the thread, an imperative omitting "you" can occur with plain "better", which is not possible with "had":

  • You had better leave early! *Had better leave early or you'll regret it!
  • You better leave early! Better leave early or you'll regret it!

This could be evidence that it is acquiring verblike properties (like the other auxes). Cf. "Come eat breakfast!", "Help me eat breakfast!"

TL;DR On balance, I would lean toward analyzing "better" as ADV/advmod when it follows "had", but as an AUX when it occurs without "had".

@nschneid
Contributor

nschneid commented Jan 3, 2022

Also interesting: "had better" seems to resist licensing a than-PP:

  • ??We had better leave now than stay.
  • ??We had better leave now than later.
  • We had better leave now rather than stay/later.

Compare:

  • We are better off leaving than staying.
  • We are better off leaving now than later.

Discussion of "had better/best" for English learners

@amir-zeldes
Contributor

TL;DR On balance, I would lean toward analyzing "better" as ADV/advmod when it follows "had", but as an AUX when it occurs without "had".

I disagree - I think the distributional effects you pointed out are basically the result of ellipsis: the construction without "had" just behaves as if "had" is there, and therefore has the same verb-like distribution, so what you are saying about the tags is basically the same as saying that upos should be promoted along the same lines as deprel. But if that were the policy, then this would be correct too:

John/PROPN likes/VERB coffee and Mary/VERB tea

Which I think is wrong. POS tags are in my opinion for the traditional parts of speech, which are perhaps not 100% morphologically determined (esp. Eng. ADJ vs. ADV), but lean very heavily on morphological distinctions. AUX in English is assigned to verbs (modal or regularly inflecting); the word "better" is a comparative adjective, used here adverbially (effectively a suppletive comparative to the adverb "well"). It inflects for degree, i.e. it alternates with "best", which for me puts it firmly in the ADJ/ADV domain. I also think the lemma of this word is "well", which would be very strange for an auxiliary reading. I think a user searching for comparatives, or the lemma "well", would be surprised not to find this one because it has been tagged AUX (and note that if you wanted to keep the Degree=Cmp you would need to allow that on an AUX)

For all of these reasons, I think it should be tagged as ADV.
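
A rough sketch of the search scenario mentioned above. The token tuples are invented for illustration: the lemma "well" follows the suggestion in this comment, and the lemma and empty FEATS on the AUX variant are hypothetical placeholders, since it is unclear what an AUX reading would actually carry.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Degree=Cmp' into a dict."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


# Tuples are (form, lemma, upos, feats) for "You better leave".
TAGGED_ADV = [
    ("You", "you", "PRON", "_"),
    ("better", "well", "ADV", "Degree=Cmp"),
    ("leave", "leave", "VERB", "_"),
]

# Hypothetical alternative in which "better" is tagged AUX.
# Lemma and FEATS here are placeholders, not a proposal.
TAGGED_AUX = [
    ("You", "you", "PRON", "_"),
    ("better", "better", "AUX", "_"),
    ("leave", "leave", "VERB", "_"),
]


def find_comparatives(tokens):
    """Return forms of ADJ/ADV tokens marked Degree=Cmp."""
    return [
        form
        for form, _lemma, upos, feats in tokens
        if upos in {"ADJ", "ADV"} and parse_feats(feats).get("Degree") == "Cmp"
    ]


def find_lemma(tokens, lemma):
    """Return forms of all tokens with the given lemma."""
    return [form for form, tok_lemma, _upos, _feats in tokens if tok_lemma == lemma]


print(find_comparatives(TAGGED_ADV), find_lemma(TAGGED_ADV, "well"))
print(find_comparatives(TAGGED_AUX), find_lemma(TAGGED_AUX, "well"))
```

Under the ADV tagging both queries return "better"; under the placeholder AUX tagging both come back empty, which is the surprise for users described above.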

@nschneid
Contributor

nschneid commented Jan 4, 2022

the construction without "had" just behaves as if "had" is there, and therefore has the same verb-like distribution

Not entirely true—the subjectless imperative is a clear case where "had" does not work, at least for me:

  • Better leave early or you'll regret it!
  • *Had better leave early or you'll regret it!

It seems clear that "had better" started out as compositional AUX+ADV and underwent a shift, with some syntactic/semantic properties changing in idiosyncratic ways, and probably there is variation by dialect/register.

This construction is so interesting that I went ahead and did some digging in COCA, with interesting results. More on that soon.

@dan-zeman
Member

Not entirely true—the subjectless imperative is a clear case where "had" does not work, at least for me:

  • Better leave early or you'll regret it!

But we cannot use the subject-auxiliary inversion test for such imperatives, so we cannot show that better has anything verbal or auxiliary to it. I would find it natural to say that leave is the root and better is its advmod.

@amir-zeldes
Contributor

I certainly agree it's an interesting construction, but "better leave" is not in the imperative mood IMO. It's just an ellipsis of "you had better leave early or you'll regret it!", which is not imperative, since the morphological imperative form of the matrix verb (from a surfacy, non-UD perspective), is "have" - as in "have some coffee". It is certainly a directive, but in the same sense that you can use a non-imperative gerund in "no loitering!".

@nschneid
Contributor

nschneid commented Jan 4, 2022

I certainly agree it's not imperative with "had", because it would have to be "have". It is similar to how modals block imperatives:

*Must leave tomorrow! (cannot mean "You must leave tomorrow!")

But "Better leave early!" feels different somehow from "Probably/definitely leave early!" The latter sounds fragmentary, whereas "Better leave early!" is highly idiomatic colloquial English, construed as a suggestion (similar to "Let's" or "Why not"). And "better" has different constraints regarding person; I think it is typically 2nd person, though could mean "we" as well. Compare:

  • A: Should they leave early or stay?
    B: Definitely leave early. (Or: "Leave early, definitely.")
  • A: Should they leave early or stay?
    B: ??Better leave early. *Leave early, better. ("(It's) better to leave early" or "(It's) better if they leave early" is fine.)
  • A: Should we leave early or stay?
    B: Definitely leave early. Leave early, definitely. Better leave early.

"They (had) better leave early" is fine, but without the subject I don't think the 3rd person makes sense.

I don't know if "imperative" is the right label here, but it seems to me that "better" is not acting entirely like a normal adverb when there's no subject.

@amir-zeldes
Contributor

Actually "definitely leave early!" sounds like a normal imperative to me. As for "Better leave early!", it is certainly idiomatic, but it is still an ellipsis for the also idiomatic "You had better leave early!". Is there any fact about "better leave early" that is not captured by the ellipsis analysis? Also, is this just a hypothetical discussion or do you actually want to tag it as AUX? If so, what would you suggest to use as the lemma and feats?

@nschneid
Contributor

nschneid commented Jan 4, 2022

"Definitely leave early" is imperative when referring to a third party?

@amir-zeldes
Contributor

No, with the exclamation point I meant it as a second person, regular imperative. But I'm not sure it's relevant for the "better" forms, I think they can only stand in places where the version with the subject can stand, and more generally, I think allowing AUX on this word just because of ellipsis of a canonical auxiliary is too big a disruption to the English upos and feat ecosystems, which don't really account for promotion (again, I think it would mean you'd have to give it a VerbForm, remove the Degree, lemmatize it to... I don't even know what...)

@nschneid
Contributor

nschneid commented Jan 4, 2022

I think I'm arguing that there's a "Better VP!" sentence construction which can only be interpreted as a directive toward a 1st or 2nd person party, and goes beyond a compositional adverb use of "better". That may not be relevant to the POS tag; I'm trying to work out what's going on with "(had) better" in general.

Given our "POS follows lexical form" precedent from the deverbal connectives discussion (UniversalDependencies/UD_English-EWT#179), I think I'm persuaded that ADV is OK for "better" across the board.

But because there is something special going on morphosyntactically with "had better" licensing a bare infinitival, which is distinct from the perfect use of "had", it makes me uncomfortable to treat plain "better" as advmod. ("They better be/*are on time".)

Why not say "better" is functioning like an aux there given that its presence affects the verb? It passes the negation test (no "do" required: "You better not..."). The SAI test is arguably indeterminate because "They better..." cannot directly be transformed into a question via SAI (and "Hadn't they better..." sounds awfully formal to me; not the kind of thing that would be uttered in the same context as "They better").

From a COCA search it turns out that "PRON better _v" is about twice as frequent as with "had" or "'d" (same with the "best" variant though that is much rarer overall). So it feels presumptuous to assume that in 20k+ attestations without "had", speakers are somehow deleting it. Even if we were to call it had-ellipsis, there would be no way to annotate that even in the Enhanced layer because empty nodes are added only for elided predicates, not elided auxiliaries.

Another point is that the comparative meaning is highly reduced in this construction. It is sort of comparative in the extended modal sense of one state of affairs being preferred over another (like "should"). Unlike ordinary comparatives, though, it doesn't really license than-PPs; *had much better is ungrammatical for me (it is attested in COCA, but rare); and AFAICT, "had best" does not differ in meaning from "had better" (it is not to a greater degree as the morphology would suggest).

Anyway, the "(had) better" construction is clearly weird and resembles modal auxiliary marking in some ways. Given that it makes sense to analyze "had" as AUX/aux, I don't mind treating subsequent "better" like ADV/advmod, but I hesitate to annotate a tree with no trace of an auxiliary if there is no "had" and "better" is what signals the auxiliary-like function to the hearer.

@nschneid
Contributor

nschneid commented Jan 4, 2022

Oh BTW I figured I should look up "had better" in CGEL, which says (p. 113):

[Two images of the CGEL passage and its footnote; not reproduced in this text.]

So they acknowledge in the footnote that "better" is grammaticalizing as an aux, but don't directly say how to analyze the "You better go now" case. I don't see a mention of the subjectless version.

@dan-zeman
Member

because there is something special going on morphosyntactically with "had better" ..., it makes me uncomfortable to treat plain "better" as advmod

There is something special about it when seen from either side; better does not behave as other adverbs, so some people may be uncomfortable with advmod, and it does not behave as other auxiliaries, so some people may be uncomfortable with aux. We don't have a special relation just for better, so it has to end up in one of the not-so-perfectly-matching categories. Since in other contexts the word is used as an adjective or adverb, I suppose that advmod is overall less surprising for users, even if not a perfect match in the idiomatic reading, so I would keep the status quo.

@amir-zeldes
Contributor

they acknowledge in the footnote that "better" is grammaticalizing as an aux

Actually I read their note to say that this is non-standard and more common in children's speech... meaning the normal reading of this is still not an AUX.

If we had frequent examples like "bettern't" then I guess that would be a bit different - for me, that is totally ungrammatical. But as it stands, splitting the reduced version from "had better", such that "had" is AUX if it appears but "better" is AUX if it doesn't appear, is less desirable IMO, because we would be splitting what is essentially a single construction into two conflicting analyses. And I still believe the ellipsis analysis captures the facts in a uniform way - the fact that UD is not great for expressing and distinguishing different kinds of ellipsis (except RNR etc.) is something to think about, but this seems to me like the wrong solution here.

@nschneid
Contributor

nschneid commented Jan 6, 2022 via email

@amir-zeldes
Contributor

I certainly wouldn't object to it as a principle; perhaps the only danger is that it might be done inconsistently (some ellipses would be addressed, leading users to think all such ellipses are, when many aren't). And this inconsistency could manifest within and across datasets. However, at present it is already the case that very high-end enhancements, such as ellipsis tokens and fully analyzed orphan constructions, are only found in a handful of treebanks, and may well not be exhaustive even in those, so arguably this would not so much change anything as join an existing trend.

@dan-zeman
Member

I would also be in favor of allowing some more situations where missing nodes may be reconstructed in the enhanced graph. One other situation where this would be useful is when a pro-drop language does not have an overt subject and it is thus not possible to add a nsubj:xsubj relation and show coreference of the dropped subjects in control verb constructions.

(In fact, empty nodes are already used for a phenomenon other than gapping in Chukchi, I think it is one of the options discussed in #701. But once again, this is just silently tolerated without any support in the guidelines.)
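
For concreteness, a purely hypothetical sketch of what a reconstructed "had" could look like as an empty node in the enhanced graph of "You better leave". As noted earlier in the thread, the guidelines currently reserve empty nodes for elided predicates, so this is an illustration of the idea and not sanctioned annotation; the decimal ID, the unspecified HEAD/DEPREL, and the DEPS column follow the CoNLL-U convention for empty nodes, and everything else is an assumption.

```python
# Hypothetical annotation: empty node 1.1 stands in for the elided "had".
# Empty nodes leave HEAD and DEPREL as "_" and appear only in DEPS.
ENHANCED = """\
1\tYou\t_\tPRON\t_\t_\t3\tnsubj\t3:nsubj\t_
1.1\thad\t_\tAUX\t_\t_\t_\t_\t3:aux\t_
2\tbetter\t_\tADV\t_\t_\t3\tadvmod\t3:advmod\t_
3\tleave\t_\tVERB\t_\t_\t0\troot\t0:root\t_
"""


def split_nodes(text):
    """Separate regular tokens from empty nodes (IDs containing a dot)."""
    regular, empty = [], []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if "." in cols[0]:
            empty.append(cols)
        else:
            regular.append(cols)
    return regular, empty


regular, empty = split_nodes(ENHANCED)
for cols in empty:
    # Columns: 0=ID, 1=FORM, 3=UPOS, 8=DEPS
    print(f"empty node {cols[0]}: {cols[1]}/{cols[3]} with enhanced deps {cols[8]}")
```

The basic tree is untouched by the empty node, so "better" remains ADV/advmod there while the enhanced layer records the auxiliary.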

@Stormur
Contributor

Stormur commented Jan 11, 2022

Not entirely true—the subjectless imperative is a clear case where "had" does not work, at least for me:

  • Better leave early or you'll regret it!

But we cannot use the subject-auxiliary inversion test for such imperatives, so we cannot show that better has anything verbal or auxiliary to it. I would find it natural to say that leave is the root and better is its advmod.

I agree, and one possible reanalysis that has taken place here (and in other cases with had better) might be tracing it back to a "standard" copula ellipsis. So now one interprets it as

  • [It is] better [to] leave early!

I don't know how correct it is and if this could explain some other behaviours of this construction, just a musing. Anyway, this would not change that better stays eminently adjectival/adverbial.

@nschneid
Contributor

I think we all agree now that "had better" should not be fixed? If no objections I'll remove it from EWT and https://universaldependencies.org/en/dep/fixed.html#need-discussion

nschneid added a commit to UniversalDependencies/UD_English-EWT that referenced this issue Sep 11, 2022
amir-zeldes added a commit that referenced this issue Jan 8, 2023
based on decision from #803