Make numbered figures, chapters etc. flat in dep #69
Going with |
Recall UniversalDependencies/docs/issues/654. What to do with non-quantity modifier numbers also falls under the broader discussion of nominal-nominal relations, I suppose (#71).
Does anyone really like that? I'm mainly seeing you and other people being sympathetic to changing |
To be clear, if we endorse this guideline, we are saying that syntactically "3 houses" and "house 3" are the same construction in UD.
* Totally reviewed entity and coreference information
* Added discourse dependency annotations
* Moved Typo from MISC to FEATS
* Issues addressed:
  * amir-zeldes/gum#71
  * amir-zeldes/gum#69
  * amir-zeldes/gum#66
  * amir-zeldes/gum#65
  * UniversalDependencies/UD_English-EWT#101
  * UniversalDependencies/UD_English-EWT#99
  * #5
  * #4
Well, there are lots of constructions that UD lumps under the same deprel even though a finer-grained annotation scheme might distinguish them. Sometimes subtypes help with the distinction, as in We can say that the two most important things UD cares about are the types of things being related (e.g. clause vs. nominal), and which is the head. At this level both "3 houses" and "house 3" are similar. But obviously the word order is semantically significant, and morphosyntactically (agreement) they are different. I would like to see some sort of distinction that involves a clear limitation on the scope of BTW, I'm not sure SD would distinguish them either. The SD guidelines say:
...which would seem to fit both.
Or, we could say that assigning even a temporary label to uniquely identify an instance of an entity makes it like a proper name rather than a "free" use of a number, so figure 1 = figure number 1 = |
The problem with
If you want 3 to govern 5, then 3 can't be As for "house 3", if we use the naive definition of |
Good point, it's a "chapter X", and if If I had to choose a label other than |
In principle a new subtype might be optimal, but I hesitate about suggesting subtypes since most parsers target them as separate labels (for example Stanza's pretrained models output distinct subtypes). Introducing very sparsely attested subtypes could really mess with automatic parsers trained on the data, so my preference has been to add things into FEATS (which is admittedly predicted by NLP tools, but separately) or even MISC. For now I can live with 'dep', which has the advantage of being a garbage can category we can sift through in the future and reassign.
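To make the sparsity concern concrete, here is a minimal sketch of how one might check how often each relation label (including subtypes) is attested before proposing a new one. The sample sentence, the `min_count` threshold, and the function names are hypothetical, chosen only for illustration; the code assumes standard 10-column CoNLL-U input.

```python
from collections import Counter

# Hypothetical toy CoNLL-U fragment; the labels are illustrative only.
SAMPLE = """\
1\tSee\tsee\tVERB\tVB\t_\t0\troot\t_\t_
2\tmy\tmy\tPRON\tPRP$\t_\t3\tnmod:poss\t_\t_
3\tfigure\tfigure\tNOUN\tNN\t_\t1\tobj\t_\t_
4\t1\t1\tNUM\tCD\t_\t3\tdep\t_\t_
"""


def deprel_counts(conllu_text):
    """Count DEPREL values (column 8) over regular tokens."""
    counts = Counter()
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        counts[cols[7]] += 1
    return counts


def sparse_subtypes(counts, min_count=5):
    """Return subtyped labels (those containing ':') seen fewer than min_count times."""
    return sorted(label for label, n in counts.items()
                  if ":" in label and n < min_count)


counts = deprel_counts(SAMPLE)
print(sparse_subtypes(counts))
```

Run over a whole treebank, a listing like this would show which subtypes are rare enough that a parser predicting full labels might struggle to learn them.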
Handled consistently as |
Currently phrases like "Figure 1" have many analyses:
Choose one for consistency (prob. flat?)
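As a rough way to audit the current inconsistency, the sketch below tallies which relation is used whenever a number directly follows the noun it is attached to (the "Figure 1" pattern, as opposed to "3 houses"). It assumes 10-column CoNLL-U input; the sample data, the `row` helper, and the function names are hypothetical and not taken from the corpus.

```python
from collections import Counter


def row(idx, form, upos, head, deprel):
    """Build one 10-column CoNLL-U token line (hypothetical helper for the demo)."""
    return "\t".join([str(idx), form, form.lower(), upos, "_", "_",
                      str(head), deprel, "_", "_"])


# Hypothetical sample: the same pattern annotated two different ways.
SAMPLE = "\n".join([
    row(1, "See", "VERB", 0, "root"),
    row(2, "Figure", "NOUN", 1, "obj"),
    row(3, "1", "NUM", 2, "nummod"),
    "",
    row(1, "Read", "VERB", 0, "root"),
    row(2, "Chapter", "NOUN", 1, "obj"),
    row(3, "2", "NUM", 2, "dep"),
])


def sentences(conllu_text):
    """Yield each sentence as a list of column lists, regular tokens only."""
    sent = []
    for line in conllu_text.splitlines():
        if not line.strip():
            if sent:
                yield sent
                sent = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) == 10 and cols[0].isdigit():
            sent.append(cols)
    if sent:
        yield sent


def numbered_noun_analyses(conllu_text):
    """Tally (deprel, direction) for a NOUN/PROPN immediately followed by a NUM
    when one of the two tokens is the head of the other."""
    tally = Counter()
    for sent in sentences(conllu_text):
        for noun, num in zip(sent, sent[1:]):
            if noun[3] in {"NOUN", "PROPN"} and num[3] == "NUM":
                if num[6] == noun[0]:
                    tally[(num[7], "noun-headed")] += 1
                elif noun[6] == num[0]:
                    tally[(noun[7], "num-headed")] += 1
    return tally


tally = numbered_noun_analyses(SAMPLE)
print(tally)
```

A tally with more than one key for the same pattern would indicate competing analyses that need to be unified under whichever label is chosen.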
https://corpling.uis.georgetown.edu/annis/#_q=ImVudGl0aWVzIg&_c=R1VN&cl=5&cr=5&s=0&l=10