Evaluate does not work for custom dataset #56
Comments
try: |
Okay, thanks. I tried that, but now I have another problem: y_true and y_pred are both empty.
|
Could you share your label_map? |
Please find it below, together with the data. I also found that the sentences are truncated when loaded:
These words are not loaded:
Thanks a lot! |
I modified the preprocessing step anyway to replace the whitespace in the labels with underscores. |
@imayachita, |
Oh yes! Thanks! It is running now. |
@kamalkraj I think I know what caused the examples to be truncated. BERT tokenization splits words into subwords, so the labels and the tokens no longer line up. The original data:
Do you have any idea how to fix this problem? Thanks a lot! I really appreciate your help. |
BERT subword tokenization is handled. Labels and tokenized words will have different lengths. |
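To illustrate why the lengths differ, here is a minimal sketch of one common way to keep word-level labels usable after subword tokenization: the label stays on the first subword of each word, and a mask marks which positions count. The `toy_tokenize` function, its vocabulary, the `X` placeholder, and the example labels are all hypothetical stand-ins, and this is not necessarily how the repository implements it.

```python
def toy_tokenize(word):
    """Hypothetical stand-in for a WordPiece tokenizer."""
    pieces = {
        "Retief": ["Re", "##tie", "##f"],
        "Goosen": ["Go", "##ose", "##n"],
    }
    return pieces.get(word, [word])


def align_labels(words, labels):
    """Expand word-level labels to subword level.

    Returns (tokens, token_labels, valid), where valid[i] is 1 only for
    the first subword of each word, so continuation pieces can be
    ignored at evaluation time.
    """
    tokens, token_labels, valid = [], [], []
    for word, label in zip(words, labels):
        for k, piece in enumerate(toy_tokenize(word)):
            tokens.append(piece)
            # "X" is an assumed placeholder for continuation pieces.
            token_labels.append(label if k == 0 else "X")
            valid.append(1 if k == 0 else 0)
    return tokens, token_labels, valid


words = ["Retief", "Goosen", "(", "South"]
labels = ["B-PER", "I-PER", "O", "B-LOC"]  # illustrative labels only
tokens, token_labels, valid = align_labels(words, labels)
print(len(tokens), len(words))    # token count exceeds word count
print(sum(valid) == len(words))   # one valid position per original word
```

With a mask like this, the token sequence can be longer than the label sequence without losing any word: evaluation only looks at positions where `valid` is 1.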
Thanks for the explanation @kamalkraj!
When I printed the tokens and labels: It seems to me that, because the original sentence has 13 words and BERT tokenization produces more than 13 tokens, the iteration stopped at "," (which has index 12). Therefore, the words "Retief", "Goosen", "(", "South" are not fed into the model. Do I understand this correctly? Thanks! |
@imayachita, Line 30 in b27c79c
|
Hi,
I tried using your code on my data. Training finished without problems, but I ran into a problem during evaluation.
I printed the label_ids:
It seems the problem is that the element at j==16 is 0 in the label list, and label 0 is not in the label map.
How did you build the label_ids?
Thanks
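One way a 0 can end up in label_ids is padding: if label ids start at 1 and 0 is reserved for padded positions, looking up 0 in the label map raises a KeyError unless those positions are skipped. The sketch below assumes that scheme; `build_label_map`, `decode`, and the example labels are hypothetical and may not match the repository's code.

```python
def build_label_map(label_list):
    """Map ids to labels starting at 1, leaving 0 free for padding."""
    return {i: label for i, label in enumerate(label_list, 1)}


def decode(label_ids, label_map, pad_id=0):
    """Convert ids back to label strings, skipping padded positions."""
    return [label_map[i] for i in label_ids if i != pad_id]


label_list = ["O", "B-PER", "I-PER"]  # illustrative labels only
label_map = build_label_map(label_list)
label_ids = [2, 3, 1, 0, 0]           # trailing zeros are padding

print(0 in label_map)                 # False: a direct lookup of 0 fails
print(decode(label_ids, label_map))   # ['B-PER', 'I-PER', 'O']
```

If a 0 shows up before the real end of the sentence, that would point to a label that was never mapped during preprocessing rather than to padding.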