Make numbered figures, chapters etc. flat in dep #69

amir-zeldes · 2020-10-23T13:50:39Z

Currently phrases like "Figure 1" have many analyses:

nummod
dep
flat
amod

Choose one for consistency (prob. flat?)

https://corpling.uis.georgetown.edu/annis/#_q=ImVudGl0aWVzIg&_c=R1VN&cl=5&cr=5&s=0&l=10
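One way to size up the inconsistency is to tally which relation each postnominal number currently receives. The following is a minimal sketch, not the project's actual tooling: it assumes plain 10-column CoNLL-U input, and the sample sentences, `read_sentences`, and `count_label_deprels` are all hypothetical names made up for illustration.

```python
from collections import Counter

# Hypothetical two-sentence sample in CoNLL-U format (10 tab-separated columns).
SAMPLE = """\
1\tSee\tsee\tVERB\tVB\t_\t0\troot\t_\t_
2\tFigure\tfigure\tNOUN\tNN\t_\t1\tobj\t_\t_
3\t1\t1\tNUM\tCD\t_\t2\tnummod\t_\t_

1\tSee\tsee\tVERB\tVB\t_\t0\troot\t_\t_
2\tTable\ttable\tNOUN\tNN\t_\t1\tobj\t_\t_
3\t2\t2\tNUM\tCD\t_\t2\tdep\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            if rows:
                yield rows
                rows = []
        elif not line.startswith("#"):
            cols = line.split("\t")
            # Skip multiword-token ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                rows.append(cols)
    if rows:
        yield rows


def count_label_deprels(text):
    """Count deprels of NUM tokens attached to the noun right before them."""
    counts = Counter()
    for sent in read_sentences(text):
        by_id = {int(r[0]): r for r in sent}
        for r in sent:
            if r[3] != "NUM" or not r[6].isdigit():
                continue
            tid, head = int(r[0]), int(r[6])
            if head == tid - 1 and head in by_id and by_id[head][3] in ("NOUN", "PROPN"):
                counts[r[7]] += 1
    return counts


deprel_counts = count_label_deprels(SAMPLE)
print(deprel_counts)
```

Restricting the check to a number directly after its nominal head is a simplification; it would miss "Figure no. 1" and similar.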

amir-zeldes · 2020-10-30T18:03:16Z

Going with dep due to cases like "Chapters 10-15", which don't work as flat.

amir-zeldes · 2020-10-30T18:13:39Z

Note that these are nummod in EWT, which I don't think is right... adding @sebschu @nschneid - maybe change in EWT as well? Known lemmas in this construction:

section
part
chapter
volume
method
table
figure
listing (like a numbered code listing, occurs in GUM academic)
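A relabeling pass over the lemma list above could look roughly like the sketch below. It is an illustration under assumptions, not the script used for GUM or EWT: rows are assumed to be 10-column CoNLL-U token lists, and `LABEL_HEAD_LEMMAS` and `relabel_numbered_labels` are hypothetical names. Only numbers that follow their head are touched, so counting uses like "3 chapters" keep nummod.

```python
# Lemmas taken from the list above.
LABEL_HEAD_LEMMAS = {
    "section", "part", "chapter", "volume",
    "method", "table", "figure", "listing",
}


def relabel_numbered_labels(rows, new_deprel="dep"):
    """Return a copy of one sentence with postnominal number labels relabeled."""
    lemma_by_id = {r[0]: r[2].lower() for r in rows}
    out = []
    for r in rows:
        r = list(r)  # copy so the input sentence is left unchanged
        tid, head = r[0], r[6]
        if (
            r[3] == "NUM"
            and tid.isdigit()
            and head.isdigit()
            and int(head) < int(tid)  # the number follows its head: "chapter 3"
            and lemma_by_id.get(head) in LABEL_HEAD_LEMMAS
        ):
            r[7] = new_deprel
        out.append(r)
    return out


# Hypothetical examples: a label use and a counting use.
chapter_3 = [
    ["1", "Chapter", "chapter", "NOUN", "NN", "_", "0", "root", "_", "_"],
    ["2", "3", "3", "NUM", "CD", "_", "1", "nummod", "_", "_"],
]
three_chapters = [
    ["1", "3", "3", "NUM", "CD", "_", "2", "nummod", "_", "_"],
    ["2", "chapters", "chapter", "NOUN", "NNS", "_", "0", "root", "_", "_"],
]

relabeled_label = relabel_numbered_labels(chapter_3)
relabeled_count = relabel_numbered_labels(three_chapters)
print(relabeled_label[1][7], relabeled_count[0][7])
```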

nschneid · 2020-10-30T19:45:34Z

Recall UniversalDependencies/docs/issues/654—nummod seems to be the policy for now.

What to do with non-quantity modifier numbers also falls under the broader discussion of nominal-nominal relations, I suppose (#71).

amir-zeldes · 2020-10-30T20:30:36Z

Does anyone really like that? I'm mainly seeing you and other people being sympathetic to changing nummod in cases where nothing is being counted. Otherwise what does it really contribute beyond the NUM tag which is already available from POS?

amir-zeldes · 2020-10-30T20:32:01Z

TBC if we endorse this guideline, we are saying that syntactically "3 houses" and "house 3" are the same construction in UD

* Totally reviewed entity and coreference information
* Added discourse dependency annotations
* Moved Typo from MISC to FEATS
* Issues addressed:
* amir-zeldes/gum#71
* amir-zeldes/gum#69
* amir-zeldes/gum#66
* amir-zeldes/gum#65
* UniversalDependencies/UD_English-EWT#101
* UniversalDependencies/UD_English-EWT#99
* UniversalDependencies/UD_English-GUM#5
* UniversalDependencies/UD_English-GUM#4


nschneid · 2020-11-05T03:12:32Z

> TBC if we endorse this guideline, we are saying that syntactically "3 houses" and "house 3" are the same construction in UD

Well, there are lots of constructions that UD lumps under the same deprel even though a finer-grained annotation scheme might distinguish them. Sometimes subtypes help with the distinction, as in nmod:poss and acl:relcl.

We can say that the two most important things UD cares about are the types of things being related (e.g. clause vs. nominal), and which is the head. At this level both "3 houses" and "house 3" are similar. But obviously the word order is semantically significant, and morphosyntactically (agreement) they are different. I would like to see some sort of distinction that involves a clear limitation on the scope of nummod but I'm not sure typologically what the criteria should be: limited to quantity-like numeric modifiers (and possibly extension of similar morphosyntax to other uses)?

BTW, I'm not sure SD would distinguish them either. The SD guidelines say:

num: numeric modifier

A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun.

...which would seem to fit both.

nschneid · 2020-11-05T03:20:14Z

Or, we could say that assigning even a temporary label to uniquely identify an instance of an entity makes it like a proper name rather than a "free" use of a number, so flat should apply. Then:

figure 1 = flat(figure, 1)—portrays "figure 1" as the figure's official name (at least in the context)

figure number 1 = compound(figure, number), flat(number, 1)? (Or however "French actor Ulliel" is handled for the relationship between "figure" and "number": #71.) This portrays "number 1" as the figure's official name and the word "figure" as a descriptor.

amir-zeldes · 2020-11-05T14:57:31Z

The problem with flat is cases like this:

Chapters 3 - 5

If you want 3 to govern 5, then 3 can't be flat, since that label can't have children. Because of this, I finally went with dep for the current UD release (already in UD dev for the freeze).
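The constraint can be checked mechanically. The sketch below flags any flat dependent that itself has children, following the restriction as described in this comment; it is a hypothetical helper, not the official UD validator. The attachment of "5" to "3" as nmod with "-" as case is an assumption made only to have a concrete tree.

```python
def flat_with_children(rows):
    """Return the forms of flat dependents that govern other tokens."""
    heads_in_use = {r[6] for r in rows}
    return [
        r[1]
        for r in rows
        if r[7].split(":")[0] == "flat" and r[0] in heads_in_use
    ]


def chapters_3_to_5(label_deprel):
    """Build a hypothetical tree for "Chapters 3 - 5" with the given label relation."""
    return [
        ["1", "Chapters", "chapter", "NOUN", "NNS", "_", "0", "root", "_", "_"],
        ["2", "3", "3", "NUM", "CD", "_", "1", label_deprel, "_", "_"],
        ["3", "-", "-", "SYM", "SYM", "_", "4", "case", "_", "_"],
        # Assumed analysis of the range: "5" depends on "3".
        ["4", "5", "5", "NUM", "CD", "_", "2", "nmod", "_", "_"],
    ]


flagged_flat = flat_with_children(chapters_3_to_5("flat"))
flagged_dep = flat_with_children(chapters_3_to_5("dep"))
print(flagged_flat, flagged_dep)
```

With flat, "3" is flagged because "5" hangs off it; with dep, nothing is flagged.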

As for "house 3", if we use the naive definition of nummod as "any modifier that is a number", we'll end up with a different analysis for "house A", since "A" is not a number (and not tagged NUM).

nschneid · 2020-11-05T15:39:48Z

Good point, it's a "chapter X", and if flat dependents are unable to contain compositional subtrees it's a problem. Same with "house 3" or "house A", where the label may or may not be a numeric form.

If I had to choose a label other than dep, then, I'd probably go with left-headed compound. But it might be worth having a subtype as this is really not the garden variety compound construction.

amir-zeldes · 2020-11-05T18:49:11Z

In principle a new subtype might be optimal, but I hesitate about suggesting subtypes since most parsers target them as separate labels (for example Stanza's pretrained models output distinct subtypes). Introducing very sparsely attested subtypes could really mess with automatic parsers trained on the data, so my preference has been to add things into FEATS (which is admittedly predicted by NLP tools, but separately) or even MISC. For now I can live with 'dep', which has the advantage of being a garbage can category we can sift through in the future and reassign.
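To illustrate the MISC alternative: a rare distinction can be stored as a key in MISC while the deprel keeps only its universal part, so a parser trained on the data never sees the sparse label. This is a sketch under assumptions. The key name `DeprelSubtype` and the subtype `compound:label` are invented for the example and are not part of any UD or GUM convention.

```python
def demote_subtype_to_misc(row, misc_key="DeprelSubtype"):
    """Move a deprel subtype into MISC and return a new 10-column row."""
    row = list(row)  # copy so the input row is left unchanged
    base, _, subtype = row[7].partition(":")
    if subtype:
        row[7] = base
        item = f"{misc_key}={subtype}"
        # MISC is "_" when empty, otherwise a |-separated list of items.
        row[9] = item if row[9] == "_" else f"{row[9]}|{item}"
    return row


# Hypothetical token carrying an invented subtype.
token = ["2", "3", "3", "NUM", "CD", "_", "1", "compound:label", "_", "SpaceAfter=No"]
demoted = demote_subtype_to_misc(token)
print(demoted[7], demoted[9])
```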

amir-zeldes · 2020-11-13T21:32:04Z

Handled consistently as dep in 6.2.0

nschneid mentioned this issue Nov 4, 2020

Syntax for "you guys" #71

Closed

amir-zeldes closed this as completed Nov 13, 2020
