NER model for Armenian #1206
Comments
Thank you for doing this! I should point out, though, that the pull request is currently against your own fork, not our dev branch. If you fix that, I can check this out tomorrow and try to diagnose the problem you're seeing.
Actually, if you could give an example of a sentence which causes the inconsistent labels, that would help a lot.
It was pretty easy for me to get the changes you made, so I replicated your pull request locally (with you as the author, of course): https://github.com/stanfordnlp/stanza/pull/1212/commits
Let me know if that looks good to you. I like having a non-bert model as the default, so that the pipeline is less expensive unless people know they want the bert model. I will check that everything works by retraining the model. If you can find an example that triggers the non-deterministic behavior, though, I can try to debug that. Thanks for sending this!
Hi there,
It's been merged, and the new models (retrained locally) are included in 1.5.0. Thanks for the help!
Hello! I have trained an NER model for the Armenian language using the ArmTDP dataset and the xlm-roberta-base model.
After that, I attempted to test the model using stanza.Pipeline:
With the same input data, I observed that the outputs were different each time I loaded the model.
However, the problem does not occur when I test with the internal commands. Whenever I run the following command, I get the same output:
python3 -m stanza.utils.training.run_ner hy_armtdp --score_test
What could be the cause of this problem?
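To help pin down a sentence that triggers the inconsistent labels, here is a minimal sketch, not a definitive diagnosis. It rebuilds a tagger several times and reports the sentences whose entities differ between builds. The helper names `find_unstable_sentences` and `make_stanza_tagger` are hypothetical, and the sketch assumes the trained model can be loaded through the `ner_model_path` argument of `stanza.Pipeline` with an Armenian tokenizer already downloaded; a BERT-based model may need further arguments.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Entity = Tuple[str, str]  # (entity text, entity type)
Tagger = Callable[[str], List[Entity]]


def find_unstable_sentences(
    make_tagger: Callable[[], Tagger],
    sentences: Sequence[str],
    runs: int = 3,
) -> Dict[str, List[List[Entity]]]:
    """Rebuild the tagger `runs` times; return sentences whose entities differ between builds."""
    outputs = []
    for _ in range(runs):
        tag = make_tagger()  # fresh load each time, mirroring the report above
        outputs.append([tag(sentence) for sentence in sentences])

    unstable = {}
    for i, sentence in enumerate(sentences):
        variants = [run[i] for run in outputs]
        if any(variant != variants[0] for variant in variants[1:]):
            unstable[sentence] = variants
    return unstable


def make_stanza_tagger(ner_model_path: str) -> Tagger:
    """Hypothetical wrapper: load a custom NER model into a stanza Pipeline."""
    import stanza  # imported lazily so the helper above works without stanza

    # Assumption: the model file at `ner_model_path` is loadable on its own.
    nlp = stanza.Pipeline(
        lang="hy", processors="tokenize,ner", ner_model_path=ner_model_path
    )

    def tag(text: str) -> List[Entity]:
        doc = nlp(text)
        return [(ent.text, ent.type) for ent in doc.ents]

    return tag
```

It could be called as `find_unstable_sentences(lambda: make_stanza_tagger(path), sentences)`, where `path` points to the saved model file. An empty result means every load produced the same entities for every sentence.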
Additionally, I have added data conversion and BERT code for Armenian in this pull request (the trained model can be downloaded from this drive).
If the problem can be resolved, it would be great to integrate an NER model for Armenian into the main package.
Thanks!