mwt training fails in evaluating dev set, but the dev set passes validation #1167
Comments
The dev set might pass validation, but I don't see how, honestly. The dependency is for word 63, whereas the sentence
The original dataset actually has 65 words in that sentence, so clearly we are transcribing something wrong as part of the MWT process. I will figure it out later today.
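For readers who want to find this kind of mismatch themselves, here is a minimal sketch of a check that flags any word whose HEAD points past the end of its sentence in a CoNLL-U file. It is not part of Stanza or the UD tools; the function name `find_out_of_range_heads` is hypothetical, and the only assumption is the standard 10-column CoNLL-U layout (ID in column 1, HEAD in column 7, multiword token ranges such as `1-2`, empty nodes such as `1.1`).

```python
def find_out_of_range_heads(conllu_text):
    """Return (line_no, word_id, head, n_words) for every HEAD beyond the sentence length.

    Hypothetical helper for illustration; assumes well-formed 10-column CoNLL-U lines.
    """
    problems = []
    sentence = []  # (line_no, word_id, head) for the sentence being read

    def check(words):
        n_words = len(words)
        return [(line_no, word_id, head, n_words)
                for line_no, word_id, head in words
                if head is not None and head > n_words]

    for line_no, line in enumerate(conllu_text.splitlines(), start=1):
        if not line.strip():
            # A blank line ends the current sentence.
            problems.extend(check(sentence))
            sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # multiword token range or empty node, not a syntactic word
        head = int(cols[6]) if cols[6].isdigit() else None
        sentence.append((line_no, cols[0], head))

    # Handle a final sentence that is not followed by a blank line.
    problems.extend(check(sentence))
    return problems
```

Running it over the text of a gold file and of the system output (for example `find_out_of_range_heads(open(path, encoding="utf-8").read())`, where `path` is whichever file you want to inspect) gives the file line numbers to look at.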
Hi! Thanks so much for your help. I am a beginner in this line of work. May I ask: how can we check which sentence has errors when the error message does not indicate it? I often see tracebacks that report a particular error without telling me which instance caused it.
If you update to the latest dev branch, it should be fixed: In terms of debugging this particular problem, what I did was change the existing script to output the line number when there was an exception, and that made it pretty clear what happened. The eval script is from a different repo, though, and I'm not sure they'll want that particular edit made permanent.
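The general idea of reporting the line number on failure can be sketched as follows. This is not the actual edit made to the eval script; `process_lines` and `handle_line` are hypothetical names, and `handle_line` stands in for whatever per-line parsing the script does.

```python
def process_lines(lines, handle_line):
    """Call handle_line on each line; on failure, re-raise with the 1-based line number.

    Hypothetical wrapper for illustration, not code from the eval script.
    """
    for line_no, line in enumerate(lines, start=1):
        try:
            handle_line(line)
        except Exception as exc:
            # Chain the original exception so its traceback is still available.
            raise ValueError(f"line {line_no}: {exc}") from exc
```

With a wrapper like this, the traceback ends in a message naming the offending line, which can then be looked up directly in the input file.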
Thanks so much, John!
Hi, I am training Stanza with the Arabic PADT treebank. While training the MWT model, evaluation on the dev set failed and I got the following error.
However, the dev set passes the validation file in the UD GitHub release.
I am also training other treebanks. None of them reported this problem. Is there anything I can do to fix it? Thank you.