Wrong transition in crf when doing a sequence labeling task #32
Comments
Hi @SefaZeng, sadly the issue is not trivial and I don't know how to fix it. The CRF is initialized with random transition probabilities, i.e. O => I-PER can be as likely as O => B-PER. The CRF knows nothing about the encoding or about which transitions are allowed. During training these transition probabilities are updated, so the CRF learns that O => I-PER is unlikely. However, it converges rather slowly to a probability of 0. This makes sense: how should the CRF distinguish "O => I-PER is not possible at all" from "it is rare, but I haven't seen enough data"? With more epochs, the number of invalid tags usually converges to a low number, or even to zero, in my experiments. As a solution I use a post-processing step: the code checks whether the tags from the CRF form a valid BIO encoding. If it finds an invalid tag, it sets this tag to O.
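A minimal sketch of such a post-processing step, for illustration only. It is not the repository's actual code, and the function name `fix_invalid_bio` is hypothetical. It assumes tags are strings such as `O`, `B-PER` and `I-PER`, and treats an `I-X` tag as valid only directly after `B-X` or `I-X` of the same type:

```python
def fix_invalid_bio(tags):
    """Replace every I-X tag that does not continue a B-X / I-X span with 'O'."""
    fixed = []
    prev = "O"  # the start of the sentence behaves like 'O'
    for tag in tags:
        if tag.startswith("I-"):
            entity = tag[2:]
            # I-X is only valid directly after B-X or I-X of the same type.
            if prev not in ("B-" + entity, "I-" + entity):
                tag = "O"
        fixed.append(tag)
        prev = tag  # compare the next tag against the corrected sequence
    return fixed


print(fix_invalid_bio(["O", "I-PER", "I-PER", "B-LOC", "I-LOC", "I-PER"]))
```

Because each tag is compared against the already corrected predecessor, a whole run of `I-PER` tags that starts without a `B-PER` is reset to `O`. Rewriting the first invalid `I-X` to `B-X` instead would be an alternative design; which one is preferable depends on the evaluation.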
Can I set the initial states to zero to avoid this problem?
@SefaZeng I think that could work; however, you would need to get the mapping right. Especially when the number of tags changes (e.g. you add B-LOC and I-LOC to your tagset), you must ensure that you set the zeros in the right place. Otherwise it can easily happen that B-LOC => I-LOC is initialized with a zero probability. Further, the CRF is bi-directional, i.e. not only the previous label matters but the next label also determines which label is produced. This can make it rather complicated to initialize the CRF correctly.
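One way to reduce the risk of a wrong mapping is to derive the allowed transitions from the tag names rather than from hard-coded indices. The sketch below is an illustration under that assumption; the helper `bio_transition_mask` is hypothetical and not part of ChainCRF.py. How such a mask would be applied to the layer's transition weights depends on the CRF implementation and is not shown here.

```python
def bio_transition_mask(tags):
    """Return mask where mask[i][j] is True if tags[i] => tags[j] is valid BIO."""
    mask = []
    for prev in tags:
        row = []
        for nxt in tags:
            if nxt.startswith("I-"):
                entity = nxt[2:]
                # I-X may only follow B-X or I-X of the same entity type.
                row.append(prev in ("B-" + entity, "I-" + entity))
            else:
                # O and B-X may follow any tag.
                row.append(True)
        mask.append(row)
    return mask


# The mask is rebuilt from the tag list, so adding B-LOC / I-LOC
# cannot shift the forbidden entries to the wrong position.
tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
mask = bio_transition_mask(tags)
print(mask[tags.index("B-LOC")][tags.index("I-LOC")])  # allowed
print(mask[tags.index("O")][tags.index("I-PER")])      # forbidden
```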
@nreimers Hmm, I set the initializers of U, b_start, b_end and the initial state in viterbi_decode to zeros, but it doesn't work. Maybe post-processing is the only way.
I use ChainCRF.py as the CRF layer in my model for a sequence labeling task with OBIE tags, but I ran into a problem: the predictions contain some unexpected transitions, such as E to I.
These transitions do not show up in the training data.
The Keras version is 2.2.2 and TensorFlow is 1.10.0.
The code:
Is there something wrong with the model? Or is there some bad case in the data that I didn't find?
Any help is appreciated! Thanks!
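To rule out bad cases in the data, the training labels could be scanned for such transitions. The following is only a sketch under stated assumptions: the helper names are hypothetical, tags are either untyped (`O`, `B`, `I`, `E`) or typed (`B-PER`, ...), and the only rule checked is that `I` and `E` must directly follow a `B` or `I` of the same type. The exact OBIE rules of a given dataset may differ.

```python
from collections import Counter


def split_tag(tag):
    """Split 'B-PER' into ('B', 'PER'); untyped tags such as 'B' get an empty type."""
    prefix, _, entity = tag.partition("-")
    return prefix, entity


def invalid_transitions(sentences):
    """Count (previous, current) pairs where I/E does not continue a B/I span."""
    counts = Counter()
    for tags in sentences:
        prev = "O"  # treat the sentence start like 'O'
        for tag in tags:
            prefix, entity = split_tag(tag)
            if prefix in ("I", "E"):
                prev_prefix, prev_entity = split_tag(prev)
                if prev_prefix not in ("B", "I") or prev_entity != entity:
                    counts[(prev, tag)] += 1
            prev = tag
    return counts


train_labels = [["B", "I", "E", "I", "O"], ["O", "B", "E"]]
print(invalid_transitions(train_labels))
```

An empty counter on the training labels would suggest that the unexpected transitions come from the model rather than from the data.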