n-best tagging results #45

Closed
usptact opened this issue May 18, 2015 · 5 comments

Comments

@usptact

usptact commented May 18, 2015

Is it possible to return the n most likely predictions using CRF? If so, which part of the source code should be modified to get this behavior? I could not find any parameter that provides it.

Thank you.

@kmike
Contributor

kmike commented May 18, 2015

Hey @usptact,

CRFsuite doesn't currently support n-best tagging.

It seems the relevant code is

floatval_t crf1dc_viterbi(crf1d_context_t* ctx, int *labels)

@usptact
Author

usptact commented May 19, 2015

kmike,

Thanks a lot for the pointer! I read elsewhere that for Viterbi-based algorithms one needs to increase the beam size. I am not sure what that means.

@kmike
Contributor

kmike commented May 19, 2015

Currently the function computes a single max_score, stores a single backward link at each position j, and finds a single best label sequence by following these backward links.

If I'm not mistaken, for n-best decoding you need to keep the top-n max_score values and the n best backward links at each position j, and then use them to compute the n best label sequences.
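
To make the idea of keeping top-n scores and backward links concrete, here is a minimal Python sketch. It is not CRFsuite code: the function name `nbest_viterbi` and the input layout (`state[t][y]` and `trans[x][y]` as additive, log-space scores) are assumptions made up for illustration, and the approach is the straightforward one, not an optimized algorithm.

```python
import heapq


def nbest_viterbi(state, trans, n):
    """Return up to n best (score, labels) pairs, best first.

    Hypothetical input layout, chosen for this sketch only:
      state[t][y] : score of label y at position t
      trans[x][y] : score of moving from label x to label y
    Scores are additive (log-space).
    """
    T, L = len(state), len(state[0])

    # hyps[t][y] holds up to n tuples (score, prev_label, prev_rank),
    # best first: the n best partial paths that end in label y at t.
    hyps = [[[(state[0][y], -1, -1)] for y in range(L)]]
    for t in range(1, T):
        row = []
        for y in range(L):
            # Extend every kept hypothesis of every previous label x.
            candidates = (
                (score + trans[x][y] + state[t][y], x, rank)
                for x in range(L)
                for rank, (score, _, _) in enumerate(hyps[t - 1][x])
            )
            row.append(heapq.nlargest(n, candidates))
        hyps.append(row)

    # Pick the n best complete hypotheses over all final labels.
    finals = heapq.nlargest(
        n,
        (
            (score, y, rank)
            for y in range(L)
            for rank, (score, _, _) in enumerate(hyps[T - 1][y])
        ),
    )

    # Follow the (prev_label, prev_rank) links back to position 0.
    results = []
    for score, y, rank in finals:
        labels = []
        for t in range(T - 1, -1, -1):
            labels.append(y)
            _, y, rank = hyps[t][y][rank]
        results.append((score, labels[::-1]))
    return results
```

Calling `nbest_viterbi(state, trans, n=1)` reduces to ordinary Viterbi. For larger n, each position looks at roughly n times as many candidates per label as the single-best version.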

There are also more efficient algorithms for n-best decoding, see e.g. http://www.keerthis.com/P12-1064.pdf for an overview.

@kmike
Contributor

kmike commented May 19, 2015

As a side note, the Wapiti CRF toolkit supports n-best decoding.
The implementation is not optimal, though (see Jekub/Wapiti#2).

@usptact
Author

usptact commented May 19, 2015

Thank you very much, kmike! I am playing with Wapiti right now and trying to assess the n-best results. Until now I have always relied on the top-1 result, which was not the best in every case. I am curious whether a good tagging shows up in the n-best list.

@usptact usptact closed this as completed May 20, 2015