PROPN or NOUN #91
Comments
The short answer is that the EWT corpus was converted from PTB-style annotation to UD, but the tagging decisions do not always line up perfectly. In the PTB style, for example, adjectives in proper names are tagged as proper nouns, though this is not really ideal per UD guidelines. I think I agree with all of your suggestions, as long as they can be applied consistently in the corpus (and are not too much of a departure from policies of other corpora). The main reason nobody has tried to clean up all the proper name annotations is that it seems like a lot of work, but if you want to volunteer, that would be great! Related issues:
Thanks for your reply. I did not want to reignite the discussion on how to annotate named entities. But these are cases where IMHO NOUN would be a better UPOS tag. I could provide a list here of the PROPN tokens (with their sentences) that I would retag as NOUN; it won't be very long anyway.
If you can submit a pull request for the dev branch we'll take a look!
No problem! (after the summer break :-)
#164 overhauled the use of PROPN. If there are problems that we missed, let us know.
I've just come across some 14 sentences where the (common noun?) president (except in titles like president Bush) is annotated as PROPN (George W. Bush alleged Thursday that John Edwards lacks the experience necessary to be president., weblog-juancole.com_juancole_20040708181175_ENG_20040708_181175-0001, and others). Similar case for governor (e.g. Davis spokesman Steve Maviglio said the governor felt "betrayed" by the actions of Winter., email-enronsent07_01-0031).

Other cases, where the PROPN tag seems correct but the word is spelled in lower case, are (YES, I am west of broad., reviews-351561-0022). If broad is a PROPN, should the lemma column at least be capitalized? Cf. also christmas (christmas cake for christmas day., answers-20111107144339AA0qw5S_ans-0018).

In ...has appeared in all the english Pakistan and India papers., weblog-blogspot.com_dakbangla_20050311135387_ENG_20050311_135387-0126, english is a PROPN although used as an adjective. Are language names always tagged as PROPN?

Are these erroneous annotations, or did I miss something in the guidelines?
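
In case it helps with finding more cases like these, here is a minimal sketch of how one could list PROPN tokens written in lower case. It assumes the standard 10-column, tab-separated CoNLL-U layout (ID, FORM, LEMMA, UPOS, ...) with `# sent_id = ...` comment lines. The names `Hit`, `lowercase_propn` and `_row` and the two toy sentences are made up for illustration; they are not part of the corpus or of any UD tooling. A lowercase PROPN is only a candidate for review, not necessarily an error.

```python
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    sent_id: str
    form: str
    lemma: str


def lowercase_propn(conllu_text: str) -> Iterator[Hit]:
    """Yield PROPN tokens whose FORM starts with a lowercase letter."""
    sent_id = ""
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id = "):
            sent_id = line[len("# sent_id = "):].strip()
            continue
        if not line or line.startswith("#"):
            continue  # blank separator or other comment line
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        tok_id, form, lemma, upos = cols[0], cols[1], cols[2], cols[3]
        if not tok_id.isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        if upos == "PROPN" and form[:1].islower():
            yield Hit(sent_id, form, lemma)


def _row(i: int, form: str, lemma: str, upos: str) -> str:
    # Hypothetical helper: builds a token line, leaving unused columns as "_".
    return "\t".join([str(i), form, lemma, upos] + ["_"] * 6)


# Toy input, invented for this example (not actual corpus annotation).
SAMPLE = "\n".join([
    "# sent_id = toy-0001",
    _row(1, "christmas", "christmas", "PROPN"),
    _row(2, "cake", "cake", "NOUN"),
    "",
    "# sent_id = toy-0002",
    _row(1, "Bush", "Bush", "PROPN"),
    _row(2, "is", "be", "AUX"),
    _row(3, "president", "president", "PROPN"),
])

for hit in lowercase_propn(SAMPLE):
    print(hit.sent_id, hit.form, hit.lemma, sep="\t")
```

On the toy input this reports christmas and president but not Bush. To run it on a real file, one would pass the file's contents (read as UTF-8) instead of `SAMPLE`.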