Evaluating lextag tagging performance #40

Open
nelson-liu opened this issue Jun 14, 2019 · 3 comments

@nelson-liu

Hi!

I'd like to build a system to predict each token's lextag---I think the evaluation script for this is streusleval.py?

If so, it doesn't seem to be part of the latest release? Also, is the data the same between 4.1 and the master ref? Not sure what the release cycle looks like for STREUSLE, but it could be nice to have a minor release with all the improvements since last July :)

@nschneid
Contributor

streusleval.py is for supersenses+MWEs (I didn't realize it postdated the last release). I don't think it gives lexcat precision and recall, but when it scores token-level tags there's a version including the lexcat.

I need to do a release soon that cleans up some of the preposition supersenses and updates UD to version 2.4.

@nelson-liu
Author

Ah ok, thanks for clarifying that!

> I don't think it gives lexcat precision and recall, but when it scores token-level tags there's a version including the lexcat.

This sounds like what I want...I'm predicting the token-level tags. How is this different from lexcat precision and recall?

@nschneid
Contributor

The lexcat conceptually applies to the lexical expression, which could be a multiword expression (as opposed to the POS/dependency information, which is truly token-level). The lexcat is encoded in the token-level full tag as a matter of convenience for sequence taggers. To avoid redundancy, I tags (those continuing an MWE) carry no lexcat. If you are working with automatically predicted MWEs, errors in the MWE analysis will affect how lexcats are counted.
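
To make the distinction concrete, here is a rough Python sketch (not the repository's evaluation code). It assumes a full tag is laid out as `<MWE flag>-<lexcat>-<supersense>`, with continuation tags such as `I_` carrying nothing after the flag. The helper names `parse_lextag`, `tag_accuracy`, and `lexcat_count` and the example tags are hypothetical and only for illustration.

```python
from typing import List, NamedTuple, Optional


class LexTag(NamedTuple):
    mwe: str                   # MWE position flag, e.g. "O", "B", "I_"
    lexcat: Optional[str]      # None on tags that continue an MWE
    supersense: Optional[str]  # None when the expression has no supersense


def parse_lextag(tag: str) -> LexTag:
    """Split a full tag, ASSUMING the layout <flag>-<lexcat>-<supersense>."""
    parts = tag.split("-", 2)
    lexcat = parts[1] if len(parts) > 1 else None
    supersense = parts[2] if len(parts) > 2 else None
    return LexTag(parts[0], lexcat, supersense)


def tag_accuracy(gold: List[str], pred: List[str], with_lexcat: bool = True) -> float:
    """Token-level accuracy over full tags, optionally ignoring the lexcat."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")
    if not gold:
        return 0.0

    def key(tag: str):
        parsed = parse_lextag(tag)
        return parsed if with_lexcat else (parsed.mwe, parsed.supersense)

    correct = sum(key(g) == key(p) for g, p in zip(gold, pred))
    return correct / len(gold)


def lexcat_count(tags: List[str]) -> int:
    """Number of tokens whose tag carries a lexcat (one per lexical expression)."""
    return sum(parse_lextag(t).lexcat is not None for t in tags)


# Illustrative tags only: a two-token verbal MWE in the gold analysis.
gold = ["O-PRON", "B-V.VID-v.social", "I_", "O-P-p.Locus"]
# Prediction that misses the MWE and tags both tokens as single words.
pred_no_mwe = ["O-PRON", "O-V-v.social", "O-N-n.ARTIFACT", "O-P-p.Locus"]
# Prediction that finds the MWE but picks a different lexcat.
pred_wrong_lexcat = ["O-PRON", "B-V.VPC.full-v.social", "I_", "O-P-p.Locus"]

print(tag_accuracy(gold, pred_wrong_lexcat, with_lexcat=True))
print(tag_accuracy(gold, pred_wrong_lexcat, with_lexcat=False))
print(lexcat_count(gold), lexcat_count(pred_no_mwe))
```

In the last line the missed MWE changes how many lexcats the prediction contains, which is why lexcat precision and recall depend on the MWE analysis, whereas token-level tag accuracy just compares one tag per token.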

nschneid added a commit that referenced this issue Jun 22, 2019
Scripts to support evaluation of automatic lextag prediction (#40)