v2: Take existing model and retrain NER #1130

kootenpv · 2017-06-13T14:53:59Z

I've seen https://alpha.spacy.io/docs/usage/training-ner and I really like it.

I wanted to take the existing alpha model (which already contains deps/NER) and train it for a few iterations on a small set of my own data.

As mentioned in the documentation, it is advised to annotate the data with one model and then overwrite some of the annotations, rather than taking a loaded model as the starting point. (I tried that as well, but it does not train that way either.)

So, I made sure tags, heads and deps are correct in reformat_train_data, on a small data set, and tried to train a model from scratch, changing the code to:

nlp = English(pipeline=['tensorizer', 'tagger', 'parser', 'ner'])

I expected that then everything should work automatically.

The error:

<ipython-input-161-08855558d30e> in main(model_dir)
     12         return reformat_train_data(nlp.tokenizer, train_data)
     13 
---> 14     optimizer = nlp.begin_training(get_data)
     15 
     16     for itn in range(100):

~/python/lib/python3.7/site-packages/spacy/language.py in begin_training(self, get_gold_tuples, **cfg)
    336             if hasattr(proc, 'begin_training'):
    337                 context = proc.begin_training(get_gold_tuples(),
--> 338                                               pipeline=self.pipeline)
    339                 contexts.append(context)
    340         learn_rate = util.env_opt('learn_rate', 0.001)

~/python/lib/python3.7/site-packages/spacy/pipeline.pyx in spacy.pipeline.NeuralTagger.begin_training (spacy/pipeline.cpp:13901)()

~/python/lib/python3.7/site-packages/spacy/morphology.pyx in spacy.morphology.Morphology.__init__ (spacy/morphology.cpp:4655)()

~/python/lib/python3.7/site-packages/spacy/morphology.pyx in spacy.morphology.Morphology.add_special_case (spacy/morphology.cpp:5625)()

KeyError: 4062917326063685704

Am I trying something that has no chance to work?

The normal example works fine; it just seems that adding the tagger and parser to the pipeline does not currently work?

twielfaert · 2017-06-15T13:16:09Z

Reminds me of #1052 in v1.x, which is caused by a missing mapping in the tag_map.
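
One way to check for a missing tag_map entry before calling begin_training is to compare every tag in the training data against the tag map. The snippet below is only a rough sketch in plain Python, not spaCy code: `find_missing_tags` is a hypothetical helper, the tag map is a plain dict standing in for whatever tag map the pipeline uses, and the gold data is assumed to have the shape `(text, [((ids, words, tags, heads, deps, ner), brackets)])`, which is not confirmed anywhere in this thread.

```python
def find_missing_tags(gold_tuples, tag_map):
    """Return the set of tags used in the data but absent from tag_map.

    Hypothetical helper: gold_tuples is assumed to be a list of
    (text, sentences) pairs, where each sentence is
    ((ids, words, tags, heads, deps, ner), brackets).
    """
    missing = set()
    for _text, sentences in gold_tuples:
        for (_ids, _words, tags, _heads, _deps, _ner), _brackets in sentences:
            for tag in tags:
                if tag not in tag_map:
                    missing.add(tag)
    return missing


# Made-up tag map and training data, for illustration only.
tag_map = {"NNP": {"pos": "PROPN"}, "VBZ": {"pos": "VERB"}}
train_data = [
    ("London calls",
     [(([0, 1], ["London", "calls"], ["NNP", "VBZ"], [1, 1],
        ["nsubj", "ROOT"], ["U-GPE", "O"]), [])]),
    ("Hi there",
     [(([0, 1], ["Hi", "there"], ["-", "-"], [0, 0],
        ["", ""], ["O", "O"]), [])]),
]

# Any tag reported here has no entry in the tag map.
print(find_missing_tags(train_data, tag_map))
```

If the returned set is not empty, those tags are the first candidates to look at when tracking down a KeyError like the one above.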

honnibal · 2017-06-15T14:06:41Z

I think @twielfaert's analysis sounds correct -- likely something missing from the tag map. By the way, the capability to add new entity labels to a pre-trained model is temporarily not working. We need to be able to resize the output weights, which isn't wired up yet.
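
To illustrate what resizing the output weights means in general: adding a new output class to a trained layer requires extra rows in its weight matrix and bias, while the trained rows stay untouched. The snippet below is a conceptual sketch in plain Python only. It is not how spaCy implements this, and `resize_output_layer` and its parameters are hypothetical names.

```python
import random


def resize_output_layer(weights, bias, n_new_classes, seed=0):
    """Return copies of weights and bias with rows added for new classes.

    weights: list of rows, one row of feature weights per existing class.
    bias: list with one value per existing class.
    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)
    n_features = len(weights[0])
    new_weights = [row[:] for row in weights]  # copy the trained rows unchanged
    new_bias = bias[:]
    for _ in range(n_new_classes):
        # New classes start from small random weights and a zero bias.
        new_weights.append([rng.uniform(-0.01, 0.01) for _ in range(n_features)])
        new_bias.append(0.0)
    return new_weights, new_bias


# Made-up weights for two existing classes over three features.
old_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]
old_b = [0.1, -0.1]

new_w, new_b = resize_output_layer(old_w, old_b, n_new_classes=1)
print(len(new_w), len(new_b))
```

Only the added rows would need to be learned from scratch. The existing rows carry over what the pre-trained model already knows.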

kootenpv · 2017-06-16T11:16:06Z

@abhishekgupta10 Just follow what it says here: https://alpha.spacy.io/docs/usage/training-ner

mikeatm · 2017-07-08T22:43:37Z

@honnibal Is there a specific issue tracking the capability to add new entity labels to pre-trained models for v2?
It's a feature that I'm looking forward to.

ines · 2017-10-27T19:44:30Z

Sorry about the messy training examples and docs! I spent the past few days going over all examples, cleaning them up and adding more documentation.

Here's the new training examples directory:
https://github.com/explosion/spaCy/tree/develop/examples/training

The current state only works with the spaCy version on develop – which will be released as soon as the new models are done training. The new docs are already in the website directory on develop, but not live yet, since we want to push the new version first.

(Unless there are serious bugs or problems, the upcoming alpha version will probably also be the version we'll promote to the release candidate 🎉 )

lock · 2018-05-08T12:27:39Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

honnibal added the usage General spaCy usage label Jun 15, 2017

mikeatm mentioned this issue Jul 14, 2017 in "RFE Doc spaCy2: Append a custom NER trained model to an existing model" #1182 (closed)

ines added the 🌙 nightly Discussion and contributions related to nightly builds label Oct 16, 2017

ines closed this as completed Oct 27, 2017

lock bot locked as resolved and limited conversation to collaborators May 8, 2018
