r wrapper / morphodita #10
Comments
BTW, I consider nametag to be very weak currently -- it is not very accurate (it is unchanged since ~2014) and requires a tagger+lemmatizer to work; we use it only for Czech. As for extracting the tagger -- the released UDPipe models actually contain two MorphoDita models -- one is a tagger predicting UPOS, XPOS & Feats, and the other one is a lemmatizer predicting UPOS & Lemmas. I do not think it is possible to extract the models using existing binaries, but it would be trivial to write one, if you want it.
I have some .udpipe models in which the part-of-speech tagger and the lemmatizer were trained as a single morphodita model, so for now I can still use that tagger externally to test NameTag. Some background:
I don't mind using pre-deep-learning machine learning techniques: my laptop is still from 2013, and the users of the models are historians who have no clue about computer programming. Feel free to provide any advice on tooling that would be more suitable. The requirements that I have are
I do not really have any suggestions -- NameTag generally fulfils the "not much required computational performance" requirement. The disadvantages are the required morphological model (but if you already have it, it is not a problem) and lower than state-of-the-art performance (it does not even use a CRF layer -- it uses a MEMM with dynamic decoding only; and the implemented feature templates are not that strong). But I do not have any low-resource alternative (we are still using it for Czech)... When the new UDPipe appears (yes, it is bordering on vaporware at this moment, I am unfortunately aware), we plan NER + NEL modules too; but they will require substantially more computational resources (especially for training)...
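For readers unfamiliar with the term, "a MEMM with dynamic decoding" can be illustrated with a minimal sketch. This is not NameTag's actual implementation: the label set, the toy sentence, and the hand-written `toy_local_prob` table are all assumptions made up for illustration. A MEMM predicts a locally normalized distribution over labels at each position, conditioned on the previous label, and dynamic programming (Viterbi-style) finds the highest-scoring label sequence.

```python
import math

def safe_log(p):
    # log(0) would raise, so map zero probability to -inf
    return math.log(p) if p > 0 else float("-inf")

def memm_viterbi(n_positions, labels, local_prob):
    """Return the most probable label sequence under a MEMM.

    local_prob(i, prev_label, label) must return
    P(label | prev_label, observation at i); prev_label is None at i == 0.
    """
    # best[i][y] = (log-probability of the best path ending in y at i, backpointer)
    best = [{y: (safe_log(local_prob(0, None, y)), None) for y in labels}]
    for i in range(1, n_positions):
        column = {}
        for y in labels:
            column[y] = max(
                (best[i - 1][p][0] + safe_log(local_prob(i, p, y)), p)
                for p in labels
            )
        best.append(column)
    # Follow backpointers from the best final label
    last = max(labels, key=lambda y: best[-1][y][0])
    path = [last]
    for i in range(n_positions - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    path.reverse()
    return path

# Hypothetical toy model: capitalized tokens tend to be person names.
TOKENS = ["John", "Smith", "spoke"]
LABELS = ["O", "B-PER", "I-PER"]

def toy_local_prob(i, prev_label, label):
    inside = prev_label in ("B-PER", "I-PER")
    if TOKENS[i][0].isupper():
        table = ({"I-PER": 0.7, "B-PER": 0.2, "O": 0.1} if inside
                 else {"B-PER": 0.7, "O": 0.3, "I-PER": 0.0})
    else:
        table = ({"O": 0.9, "I-PER": 0.1, "B-PER": 0.0} if inside
                 else {"O": 0.95, "B-PER": 0.05, "I-PER": 0.0})
    return table[label]

print(memm_viterbi(len(TOKENS), LABELS, toy_local_prob))
```

A CRF differs in that it normalizes scores over the whole sequence rather than per position, which is one reason it usually outperforms a MEMM on the same features.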
Thanks for the messages and the advice. Looking forward to the vaporware announcements :)
FYI.
I've built an R wrapper around nametag https://github.com/bnosac/nametagger so that I can easily use it to construct a baseline NER model and compare it to a baseline CRF or other deep-learning approaches which require more computing resources.
While doing this, I was wondering whether there is an easy way to extract a morphodita model from a .udpipe file, so that I can use it with tagger morphodita:model?