Tuning models #9

Open
maciejbiesek opened this issue Feb 3, 2020 · 4 comments

Comments

@maciejbiesek

Is there any option to tune the models you provided (NER, POS) on our own corpora?

@ryszardtuora
Collaborator

You can train the existing models further on your own data using the spaCy CLI train command. Just provide the name of, or a link to, the model as the argument of the --base-model parameter. You will need to convert your data to spaCy's JSON format first, using the convert command.

This should work, with the exception of the POS tagger in the morfeusz-based version, which is not a spaCy component.
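For reference, the two CLI steps described above could be sketched roughly as follows. This is only an illustration and assumes the spaCy v2.x command line (in spaCy v3 the train command is config-based and looks different). The file names, the output directories, the "pl" language code, the converter choice, and the base model name are all hypothetical placeholders, not values taken from this project.

```python
import shlex
import sys


def build_convert_cmd(input_file, output_dir, converter="auto"):
    """Build a `spacy convert` call (spaCy v2.x CLI assumed).

    Converts an annotated corpus (e.g. CoNLL-U or IOB) into the JSON
    training format expected by `spacy train` in v2.x.
    """
    return [
        sys.executable, "-m", "spacy", "convert",
        input_file, output_dir,
        "--converter", converter,
        "--file-type", "json",
    ]


def build_train_cmd(lang, output_dir, train_path, dev_path, base_model, pipeline):
    """Build a `spacy train` call that continues training from `base_model`."""
    return [
        sys.executable, "-m", "spacy", "train",
        lang, output_dir, train_path, dev_path,
        "--base-model", base_model,       # name of, or path/link to, the existing model
        "--pipeline", ",".join(pipeline),  # components to update, e.g. "ner"
    ]


if __name__ == "__main__":
    # All paths and the model name below are hypothetical placeholders.
    convert_cmd = build_convert_cmd("my_corpus_train.iob", "converted/", converter="iob")
    train_cmd = build_train_cmd(
        lang="pl",                          # assumption: Polish models
        output_dir="tuned_model/",
        train_path="converted/my_corpus_train.json",
        dev_path="converted/my_corpus_dev.json",
        base_model="name_or_path_of_base_model",
        pipeline=["ner"],
    )
    # Print the commands instead of running them; pass either list to
    # subprocess.run(cmd, check=True) once the placeholders are replaced.
    print(shlex.join(convert_cmd))
    print(shlex.join(train_cmd))
```

The commands are only printed here so the sketch can be inspected before anything is executed. The development set would need to be converted the same way as the training set.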

@maciejbiesek
Author

So, to sum up: we can tune e.g. the NER model and the POS tagger in the basic version, but we cannot adapt the morfeusz-based models to our specific data?

@ryszardtuora
Collaborator

You can tune NER in the morfeusz version, but you cannot do so for its POS tagger.

If it is the morfeusz tokenization that you're after, I suppose you could retrain the basic tagger and then use it as a component in the morfeusz pipeline.

Adding the ability to retrain the morfeusz-version tagger would require more work, but we will consider it.

@maciejbiesek
Author

Ok, I see, thank you :)
