Tuning models #9

maciejbiesek · 2020-02-03T11:21:52Z

Is there any option to tune the models you provided (NER, POS) on our own corpora?

ryszardtuora · 2020-02-03T11:32:37Z

There is an option to train the existing models further on your own data using the spaCy CLI train command. Just provide the name of the model, or a path to it, as the argument of the --base-model parameter. You will first need to convert your data to spaCy's JSON training format using the convert command.

This should work, with the exception of the POS tagger in the morfeusz-based version, which is not a spaCy component.
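
As a rough illustration of the two steps above, here is a minimal sketch that only assembles the command lines, assuming the spaCy v2.x CLI that was current when this thread was written (positional lang, output, train and dev arguments for train, plus --base-model and --pipeline). The model name pl_model and all file paths are hypothetical placeholders, not names taken from this repository. In spaCy v3 the train command is config-driven instead, so this would not carry over unchanged.

```python
import sys
from typing import List


def build_convert_cmd(input_file: str, output_dir: str,
                      converter: str = "auto") -> List[str]:
    """Assemble a `spacy convert` call (spaCy v2.x style).

    `input_file` is the annotated corpus (e.g. CoNLL-U or IOB) and
    `output_dir` is where the converted JSON file is written.
    """
    return [sys.executable, "-m", "spacy", "convert",
            input_file, output_dir, "--converter", converter]


def build_train_cmd(lang: str, output_dir: str, train_path: str,
                    dev_path: str, base_model: str,
                    pipeline: str = "ner") -> List[str]:
    """Assemble a `spacy train` call that continues from an existing model.

    `base_model` is the name of an installed model or a path to one;
    `pipeline` lists the components to update, comma-separated.
    """
    return [sys.executable, "-m", "spacy", "train",
            lang, output_dir, train_path, dev_path,
            "--base-model", base_model,
            "--pipeline", pipeline]


if __name__ == "__main__":
    # Hypothetical paths and model name, for illustration only.
    convert_cmd = build_convert_cmd("my_corpus.iob", "converted", converter="ner")
    train_cmd = build_train_cmd("pl", "tuned_model",
                                "converted/train.json", "converted/dev.json",
                                base_model="pl_model", pipeline="ner")
    print(" ".join(convert_cmd))
    print(" ".join(train_cmd))
    # To actually run them: subprocess.run(train_cmd, check=True)
```

Building the argument lists separately from running them makes it easy to inspect the exact command before starting a long training job. Held-out development data is still needed, since train expects both a training and a dev file.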

maciejbiesek · 2020-02-03T12:30:08Z

So, to sum up, we can tune e.g. the NER model and the POS tagger in the simplest form, but we cannot adapt the morfeusz-based models to our specific data?

ryszardtuora · 2020-02-03T13:02:30Z

You can tune NER in the morfeusz version, but you cannot do so for its POS tagger.

If it is the morfeusz tokenization that you're after, I suppose you could retrain the basic tagger and then use it as a component in the morfeusz pipeline.

Adding the ability to retrain the morfeusz-version tagger would require more work, but we will consider it.

maciejbiesek · 2020-02-03T13:04:14Z

Ok, I see, thank you :)
