inconsistency in the superscript #180

Open
arademaker opened this issue Nov 15, 2019 · 4 comments

Comments

@arademaker
Member

arademaker commented Nov 15, 2019

See the superscripts on "civil" versus "signal": one has only "a" and the other has "a2"?!

image

@arademaker
Member Author

It looks like the case with only "a" is caused by the sense key used not being in the list of possible lemmas for the token. But why does that happen in the data?

((kind "wf")
   (form . "ordinary")
   (lemmas "ordinary%1" "ordinary%3")
   (tag . "man")
   (senses "ordinary%5:00:02:common:01")
   (meta
    (pos . "JJ")))
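A minimal sketch of the check described above, assuming the keys follow the standard WordNet sense key layout `lemma%ss_type:lex_filenum:lex_id:head_word:head_id`, where `ss_type` 5 marks a satellite adjective and 3 a plain adjective. The helper names and the dictionary version of the token are illustrative only, not sensetion's actual code.

```python
from typing import NamedTuple


class SenseKey(NamedTuple):
    lemma: str
    ss_type: int
    lex_filenum: int
    lex_id: int
    head_word: str  # empty unless the sense is a satellite adjective
    head_id: str    # empty unless the sense is a satellite adjective


def parse_sense_key(key: str) -> SenseKey:
    """Split a WordNet-style sense key into its fields."""
    lemma, _, rest = key.rpartition("%")
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(":")
    return SenseKey(lemma, int(ss_type), int(lex_filenum), int(lex_id),
                    head_word, head_id)


def lemma_entry(key: str) -> str:
    """Return the 'lemma%ss_type' prefix, the form used in the lemmas list."""
    parsed = parse_sense_key(key)
    return f"{parsed.lemma}%{parsed.ss_type}"


# The token from the sample above, rewritten as a plain dict (illustrative).
token = {
    "form": "ordinary",
    "lemmas": ["ordinary%1", "ordinary%3"],
    "senses": ["ordinary%5:00:02:common:01"],
}

# Senses whose 'lemma%ss_type' prefix is missing from the token's lemmas.
unlisted = [s for s in token["senses"] if lemma_entry(s) not in token["lemmas"]]
print(unlisted)  # ['ordinary%5:00:02:common:01']
```

The annotated sense has the prefix "ordinary%5", while the token lists only "ordinary%1" and "ordinary%3", so a strict comparison flags it. Whether 3 and 5 should count as the same lemma entry is a separate decision that this sketch does not make.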

@arademaker arademaker changed the title inconsistency in the upperscript inconsistency in the superscript Nov 15, 2019
@odanoburu
Contributor

at least on my machine, where I'm using WN data from openWN, the problem is a small incompatibility between the former's sense keys and PWN's sense keys: the fact that openWN has split the adj.all lexicographer file into adj.all and adjs.all. because of this the already annotated sense of ordinary does not exist in the database, and therefore has no number. because we have already decided to undo this split of adj.all, the problem should disappear by itself in the next commits.

I'll make sensetion complain when it finds an unknown sense key, however.
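As a rough illustration of what such a complaint could look like, here is a Python sketch (sensetion itself is an Emacs package, so this is not its implementation). The function name `find_unknown_senses`, the dictionary token, and the toy key set are all assumptions made for the example.

```python
import warnings


def find_unknown_senses(token: dict, known_keys: set) -> list:
    """Warn about, and return, annotated sense keys missing from the database."""
    unknown = [key for key in token.get("senses", []) if key not in known_keys]
    for key in unknown:
        warnings.warn(
            f"unknown sense key {key!r} on token {token.get('form')!r}"
        )
    return unknown


# Toy stand-in for the set of sense keys loaded from the wordnet database.
# The key below is a made-up placeholder, not taken from real data.
known_keys = {"placeholder%3:00:00::"}

token = {"form": "ordinary", "senses": ["ordinary%5:00:02:common:01"]}
missing = find_unknown_senses(token, known_keys)
```

Returning the unknown keys as well as warning lets the caller decide whether to skip the token, show it without a sense number, or stop.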

@arademaker
Member Author

What is openWN? Do you mean https://github.com/own-pt/own-en? BTW, does it mean that we will have to map all previous annotations in the glosstag corpus to the new sense keys? Another possibility would be to keep the glosstag annotations tied to the PWN 3.0 sense keys. Not sure if I understood the implications of your description above.

@odanoburu
Contributor

Do you mean the https://github.com/own-pt/own-en?

yes

BTW, does it mean that we will have to map all previous annotations in the glosstag corpus to the new sense keys? Another possibility would be to keep the glosstag annotations tied to the PWN 3.0 sense keys.

one of those, yes. we should probably discuss this better, since we're not sure how stable our new id scheme is.

Not sure if I understood the implications of your description above.

the split into adj.all and adjs.all created a new lexicographer file; since all sense keys include the lexicographer file number, all satellite adjectives that have been pre-annotated are not recognized by sensetion (but the sense keys are there and are not eliminated; duplicate annotations may happen, however).
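To make the mismatch concrete, here is a hedged sketch of how a PWN 3.0 key could be rewritten for the split scheme. The second field of a sense key is the lexicographer file number, and adj.all is file 00 in PWN 3.0. The number assigned to adjs.all is not given in this thread, so `ADJS_ALL = 45` below is a placeholder, and the sketch assumes the split changes nothing else in the key.

```python
ADJ_ALL = 0       # lexicographer file number of adj.all in PWN 3.0
ADJS_ALL = 45     # HYPOTHETICAL number for the new adjs.all file
SATELLITE = "5"   # ss_type of satellite adjectives


def to_split_scheme(key: str, new_filenum: int = ADJS_ALL) -> str:
    """Rewrite a PWN 3.0 satellite-adjective key to point at adjs.all.

    Keys that are not satellite adjectives in adj.all are returned unchanged.
    """
    lemma, _, rest = key.rpartition("%")
    fields = rest.split(":")
    if fields[0] == SATELLITE and int(fields[1]) == ADJ_ALL:
        fields[1] = f"{new_filenum:02d}"
    return f"{lemma}%{':'.join(fields)}"


print(to_split_scheme("ordinary%5:00:02:common:01"))
# ordinary%5:45:02:common:01
```

Under these assumptions the pre-annotated key and the database key differ only in the file number field, which is enough for an exact string lookup to fail. Undoing the split, as already decided, avoids needing any such mapping.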
