NER fine tuning keeps failing (sometimes weird results, sometimes malloc errors) #910

tmbo · 2017-03-23T09:53:04Z

Fine tuning a spaCy NER model ("en" in this case) does not behave consistently.

Sometimes the newly trained model annotates every word in the test sentence as an entity; sometimes "I am looking for a restaurant" is recognized as TIME. The minimal failing example below shows the behaviour.

Info about spaCy

Python version: 2.7.12
Platform: Darwin-16.4.0-x86_64-i386-64bit
spaCy version: 1.7.2
Installed models: cache, de, de-1.0.0, en, en-1.1.0, en_glove_cc_300_1m_vectors-1.0.0

Code to reproduce:

import json
import os
import random

import pathlib
import spacy
from spacy.gold import GoldParse
from spacy.pipeline import EntityRecognizer

if __name__ == '__main__':
    nlp = spacy.load("en")
    ner = nlp.entity

    train_data = [["hey",[]],["howdy",[]],["hey there",[]],["hello",[]],["hi",[]],["i'm looking for a place to eat",[]],["i'm looking for a place in the north of town",[[31,36,"location"]]],["show me chinese restaurants",[[8,15,"cuisine"]]],["show me chines restaurants",[[8,14,"cuisine"]]],["yes",[]],["yep",[]],["yeah",[]],["show me a mexican place in the centre",[[31,37,"location"],[10,17,"cuisine"]]],["bye",[]],["goodbye",[]],["good bye",[]],["stop",[]],["end",[]],["i am looking for an indian spot",[[20,26,"cuisine"]]],["search for restaurants",[]],["anywhere in the west",[[16,20,"location"]]],["central indian restaurant",[[0,7,"location"],[8,14,"cuisine"]]],["indeed",[]],["that's right",[]],["ok",[]],["great",[]]]
    additional_entity_types = [u'cuisine', u'location']

    # Fine tune the ner model
    for entity_type in additional_entity_types:
        if entity_type not in ner.cfg['actions']['1']:
            ner.add_label(entity_type)

    for itn in range(5):
        random.shuffle(train_data)
        for raw_text, entity_offsets in train_data:
            doc = nlp.make_doc(unicode(raw_text))
            gold = GoldParse(doc, entities=entity_offsets)
            ner.update(doc, gold)

    # store the fine tuned model
    if not os.path.exists("fine_tuned_ner_model"):
        os.mkdir("fine_tuned_ner_model")
    with open("fine_tuned_ner_model/config.json", 'w') as f:
        json.dump(ner.cfg, f)
    ner.model.dump("fine_tuned_ner_model/model")

    # load the fine tuned model
    ner = None
    ner = EntityRecognizer.load(pathlib.Path("fine_tuned_ner_model"), nlp.vocab)

    # test the model
    s = u"I am looking for a restaurant in Berlin"
    print("Test sentence: '{}'".format(s))
    doc = nlp(s, entity=False)
    ner(doc)
    print("Entities on fine tuned NER:")
    for e in doc.ents:
        print("\t'{}': {}".format(e.text, e.label_))

    print("Entities on plain spacy NER:")
    spacy_doc = nlp(s)
    for e in spacy_doc.ents:
        print("\t'{}': {}".format(e.text, e.label_))

Sometimes the result is a malloc error, sometimes I get

Entities on fine tuned NER:
	'I': TIME
	'am looking': LANGUAGE
	'for': ORG
	'a restaurant': DATE
	'in Berlin': ORG
Entities on plain spacy NER:

sometimes it's

Entities on fine tuned NER:
	'I am looking for a restaurant': TIME
	'in Berlin': ORG
Entities on plain spacy NER:

and when unlucky I run into this

python2.7(64385,0x7fffb1dbd3c0) malloc: *** error for object 0x7ff413a6ef68: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug


honnibal · 2017-03-23T10:41:49Z

I haven't run this code yet, but I think I see the problem. I doubt that the labels are being added to ner.cfg. This means the model's transition system is out of sync with the classes in the model.

The workaround for now would be to re-add the labels after loading. The better fix will be to add them to the cfg during add_label.
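
A minimal sketch of that workaround, assuming the loaded recognizer exposes the same add_label method the original script calls on nlp.entity. The helper name readd_labels and the stub class are hypothetical; the stub only stands in for an EntityRecognizer so the snippet runs without spaCy installed.

```python
def readd_labels(ner, labels):
    """Re-register custom entity labels on a freshly loaded recognizer.

    `ner` is assumed to expose `add_label(label)`, as `nlp.entity` does in
    the script above. `readd_labels` itself is a hypothetical helper.
    """
    for label in labels:
        ner.add_label(label)
    return ner


class _StubRecognizer(object):
    """Stand-in for an EntityRecognizer so the sketch runs without spaCy."""

    def __init__(self):
        self.labels = []

    def add_label(self, label):
        # Ignore labels that are already registered.
        if label not in self.labels:
            self.labels.append(label)


stub = readd_labels(_StubRecognizer(), [u"cuisine", u"location"])
print(stub.labels)
```

In the reproduction script this would be called right after EntityRecognizer.load(...), for example readd_labels(ner, additional_entity_types).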

tmbo · 2017-03-23T12:47:15Z

Might that also be the cause for the malloc error or is that likely to be unrelated?

honnibal · 2017-03-23T13:09:20Z

It would make sense. The model would produce a class that's out-of-bounds for the transition system. I doubt I'm checking for that, so it would cause a memory error.
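
A toy illustration of the mismatch described above, not spaCy's actual internals: the action names and class count are made up. In plain Python an out-of-range index raises an exception, whereas unchecked indexing in compiled code can read or write past the end of an array, which is the kind of failure that would surface as a malloc error.

```python
def checked_action(class_id, actions):
    """Return the action for a predicted class, or raise if it is out of range."""
    if not 0 <= class_id < len(actions):
        raise IndexError(
            "model predicted class {} but the transition system only has {} actions"
            .format(class_id, len(actions)))
    return actions[class_id]


# Hypothetical action table: the reloaded transition system is missing the
# actions for the labels that were added during fine tuning.
actions_after_reload = ["O", "B-ORG", "I-ORG"]
n_model_classes = 5  # the saved weights still cover two extra classes

try:
    checked_action(n_model_classes - 1, actions_after_reload)
    mismatch_detected = False
except IndexError:
    mismatch_detected = True

print(mismatch_detected)
```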

honnibal · 2017-03-23T22:42:14Z

Two further issues here, one with your code, one with the library:

You're missing a call to nlp.tagger(doc) in your training loop. This means you're missing the tagger features during your fine-tuning, so the features won't match.
In the current code, the parser model doesn't respect a learning rate for the perceptron update. This causes the weight updates to be out-of-scale with the existing weights, so the resulting model is quite bad.
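
To make the second point concrete, here is a rough sketch of a plain multiclass perceptron update with and without a learning rate. It is not spaCy's implementation: the function name, the feature name, and the weight values are all invented for illustration.

```python
def perceptron_update(weights, features, gold, guess, learn_rate=1.0):
    """Apply one multiclass perceptron update in place.

    `weights` maps class -> {feature: weight}. On a wrong guess the gold
    class is rewarded and the guessed class penalised by `learn_rate`.
    """
    if gold == guess:
        return weights
    for feat in features:
        weights[gold][feat] = weights[gold].get(feat, 0.0) + learn_rate
        weights[guess][feat] = weights[guess].get(feat, 0.0) - learn_rate
    return weights


# Hypothetical pre-trained weights with small magnitudes.
pretrained = {"TIME": {"w=restaurant": 0.02}, "O": {"w=restaurant": 0.01}}

# Update copies so the pre-trained weights stay untouched.
unscaled = perceptron_update(
    {k: dict(v) for k, v in pretrained.items()}, ["w=restaurant"], "O", "TIME")
scaled = perceptron_update(
    {k: dict(v) for k, v in pretrained.items()}, ["w=restaurant"], "O", "TIME",
    learn_rate=0.001)

print(unscaled["O"]["w=restaurant"], scaled["O"]["w=restaurant"])
```

With a step of 1.0, a single mistake swamps the existing weights; with a small learning rate the update stays on the same scale as what the model already learned.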

tmbo · 2017-03-24T13:32:49Z

Thanks for letting me know. You should probably change your example code then (or remove it if it is not up to date) https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py#L26

honnibal · 2017-03-31T10:03:07Z

Closing this because the specific bug has been fixed. Still need to fix the docs and the save/load process, but that's covered in other issues.

fgadaleta · 2017-03-31T14:56:15Z

I'm seeing exactly the same behaviour, even after upgrading to 1.7.3.
The resumed model is inconsistent.

tmbo · 2017-03-31T14:57:37Z

If you have added new labels during fine tuning, you need to add them again after loading the model from disk (if you are saving the model between training and use).

fgadaleta · 2017-03-31T15:03:05Z

The idea is to save nlp.entity to disk, update it with the code below
(where train_data is a list of raw_text, [startPOS, endPOS, ENT_TYPE])
and save it back. Then reload it another time.

for raw_text, entity_offsets in train_data:
    doc = nlp.make_doc(raw_text.decode())
    nlp.tagger(doc)
    gold = GoldParse(doc, entities=entity_offsets)
    ner.update(doc, gold)
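
Since train_data pairs raw text with [start, end, label] character offsets, one cheap sanity check before training is to slice each span out of its text and look at what it covers. The sketch below is a hypothetical helper (check_offsets is not a spaCy function) and assumes end offsets are exclusive, as in the training data at the top of this issue.

```python
def check_offsets(raw_text, entity_offsets):
    """Return (surface_text, label) pairs, raising on out-of-range spans."""
    spans = []
    for start, end, label in entity_offsets:
        if not 0 <= start < end <= len(raw_text):
            raise ValueError(
                "bad span {}-{} for {!r}".format(start, end, raw_text))
        spans.append((raw_text[start:end], label))
    return spans


# One of the training examples from the original report.
spans = check_offsets(u"show me a mexican place in the centre",
                      [[31, 37, u"location"], [10, 17, u"cuisine"]])
print(spans)
```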

fgadaleta · 2017-03-31T19:00:17Z

Even without adding any new entity type, the model still gets messed up.

lock · 2018-05-09T00:39:01Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

honnibal added the "bug" label (Bugs and behaviour differing from documentation) Mar 23, 2017

honnibal added a commit that referenced this issue Mar 23, 2017

Add test for Issue #910: Resuming entity training

f40fbc3

honnibal added a commit that referenced this issue Mar 25, 2017

Update config when adding label. Re #910

2f63806

honnibal closed this as completed Mar 31, 2017

This was referenced Mar 31, 2017

MemoryError: Error allocating memory for feature #786

Closed

SpaCy NER training example from version 1.5.0 doesn't work in 1.6.0 #773

Closed

honnibal added a commit that referenced this issue Apr 23, 2017

Remove xfail on Test #910

040751a

lock bot locked as resolved and limited conversation to collaborators May 9, 2018
