doubt in metric.py #23
Comments
In fact, in the above case
The metric is correct. You used the wrong data format; only the BIO and BMES formats are supported. You need to do more homework before doubting others' code.
@jiesutd Yes, thank you for pointing that out. The example above is not valid in either BIO or BIOES, as the first I-MISC is followed by another I-MISC. I checked, and the dataset I was using has other such sentences. After I converted it to BIOES format, the metric seems to return an F-score other than -1. However, I have no reference to check whether my converted dataset is correct. Do you have such a reference? Does the CoNLL website provide the BIO and BIOES formats as well? (I don't have their dataset directly, as it requires a manual authentication/NDA process.)
You can refer to this paper, https://arxiv.org/pdf/1707.06799.pdf, for the differences between BIO/BIOES/IOB.
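For anyone landing here with the same problem, the conversion discussed above can be sketched roughly as follows. This is a minimal illustration, not code from this repository: the helper name `iob1_to_bio` is hypothetical, and it assumes the input uses IOB-style tags of the form `I-TYPE` / `B-TYPE` / `O`, where an entity may start with `I-`.

```python
def iob1_to_bio(tags):
    """Convert IOB-style tags to BIO, so that every entity starts with B-.

    Hypothetical helper for illustration; not part of metric.py.
    """
    converted = []
    prev_type = None  # entity type of the previous token, None if it was "O"
    for tag in tags:
        if tag == "O":
            converted.append(tag)
            prev_type = None
            continue
        prefix, _, ent_type = tag.partition("-")
        if prefix == "I" and prev_type != ent_type:
            # An I- tag that does not continue an entity of the same type
            # is treated as the start of a new entity.
            prefix = "B"
        converted.append(f"{prefix}-{ent_type}")
        prev_type = ent_type
    return converted


labels = ["I-MISC", "I-MISC", "O", "I-PER", "I-PER", "O",
          "O", "O", "O", "O", "I-ORG", "O"]
print(iob1_to_bio(labels))
# ['B-MISC', 'I-MISC', 'O', 'B-PER', 'I-PER', 'O', 'O', 'O', 'O', 'O', 'B-ORG', 'O']
```

A quick sanity check on a converted file is that no `I-X` tag ever follows `O` or a tag of a different type.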
I think `get_ner_BIO()` in `metric.py` is wrong. Consider the example where
`label_list = [I-MISC, I-MISC, O, I-PER, I-PER, O, O, O, O, O, I-ORG, O]`
According to the current function, the following happens: since no tag starts with B-, `whole_tag` and `tag_index` will always be `[]`, and hence the output of the function is `[]`, which is wrong?
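To illustrate the behaviour described here, below is a simplified stand-in for a strict BIO span extractor. It is not the actual `get_ner_BIO()` from `metric.py`; the function name `extract_spans_bio` and the `(start, end, type)` output format are assumptions made for this sketch. The point is only that an extractor which opens an entity exclusively on `B-` finds nothing when every entity starts with `I-`.

```python
def extract_spans_bio(tags):
    """Return (start, end, type) spans, opening an entity only on a B- tag.

    Simplified illustration; not the repository's get_ner_BIO().
    """
    spans = []
    start, ent_type = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:
                spans.append((start, i - 1, ent_type))  # close the previous entity
            start, ent_type = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == ent_type:
            continue  # still inside the current entity
        else:
            # "O", or an I- tag with no matching open entity: close and reset.
            if start is not None:
                spans.append((start, i - 1, ent_type))
            start, ent_type = None, None
    if start is not None:
        spans.append((start, len(tags) - 1, ent_type))
    return spans


iob_labels = ["I-MISC", "I-MISC", "O", "I-PER", "I-PER", "O",
              "O", "O", "O", "O", "I-ORG", "O"]
bio_labels = ["B-MISC", "I-MISC", "O", "B-PER", "I-PER", "O",
              "O", "O", "O", "O", "B-ORG", "O"]

print(extract_spans_bio(iob_labels))  # []
print(extract_spans_bio(bio_labels))  # [(0, 1, 'MISC'), (3, 4, 'PER'), (10, 10, 'ORG')]
```

With the `I-`-only labels no span is ever opened, so the result is empty; the same sentence tagged with `B-` at each entity start yields the three expected spans.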