@OfirArviv, this is an issue we've been talking about a bit. I guess it could be called a DyNet issue but I'll try working around it.
In the mrp branch, whenever I train a model without BERT and then try to load it, I get this error:
```
...
[dynet] 2.1
Loading from 'test_files/models/ucca.enum'... Done (0.000s).
Loading model from 'test_files/models/ucca':   0%|          | 0/13 [00:00<?, ?param/s]
Traceback (most recent call last):
  File "tupa/tupa/model.py", line 234, in load
    self.classifier.load(self.filename)
  File "tupa/tupa/classifiers/classifier.py", line 125, in load
    self.load_model(filename, d)
  File "tupa/tupa/classifiers/nn/neural_network.py", line 474, in load_model
    values = self.load_param_values(filename, d)
  File "tupa/classifiers/nn/neural_network.py", line 503, in load_param_values
    desc="Loading model from '%s'" % filename, unit="param"))
  File "tupa/lib/python3.7/site-packages/tqdm/_tqdm.py", line 1005, in __iter__
    for obj in iterable:
  File "_dynet.pyx", line 450, in load_generator
  File "_dynet.pyx", line 453, in _dynet.load_generator
  File "_dynet.pyx", line 327, in _dynet._load_one
  File "_dynet.pyx", line 1482, in _dynet.ParameterCollection.load_lookup_param
  File "_dynet.pyx", line 1497, in _dynet.ParameterCollection.load_lookup_param
RuntimeError: Could not find key /_1 in the model file

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tupa/tupa/parse.py", line 653, in <module>
    main()
  File "tupa/tupa/parse.py", line 649, in main
    list(main_generator())
  File "tupa/tupa/parse.py", line 631, in main_generator
    yield from train_test(test=args.input, args=args)
  File "tupa/tupa/parse.py", line 557, in train_test
    yield from filter(None, parser.train(train, dev=dev, test=test is not None, iterations=args.iterations))
  File "tupa/tupa/parse.py", line 457, in train
    self.model.load()
  File "tupa/tupa/model.py", line 244, in load
    raise IOError("Failed loading model from '%s'" % self.filename) from e
OSError: Failed loading model from 'test_files/models/ucca'
```
Looking at the model `.data` file, I couldn't find any problem. There is certainly a line starting with `#LookupParameter# /_1` there.
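For context, each object saved with `dy.save` gets a header line in the `.data` file keyed like `/_0`, `/_1`, ..., and `dy.load` (the `load_generator` in the traceback above) reads them back in order. A minimal round trip, as a sketch with hypothetical names:

```python
import dynet as dy

pc = dy.ParameterCollection()
w = pc.add_parameters((4, 4))            # stored under a key like /_0
emb = pc.add_lookup_parameters((10, 4))  # stored under a key like /_1

# Writes toy.data (values, with #Parameter#/#LookupParameter# header lines)
# and toy.meta (object specs).
dy.save("toy", [w, emb])

pc2 = dy.ParameterCollection()
# Raises "Could not find key ..." if a header line can't be matched.
w2, emb2 = dy.load("toy", pc2)
```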
However, looking at the training log, I found that all updates resulted in this error:

```
Error in update(): Magnitude of gradient is bad: -nan
```
I could then reproduce the problem by adding `(loss/0).backward()` before `self.trainer.update()` in `tupa/tupa/classifiers/nn/neural_network.py` (line 422 at commit 20e7b12).
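For reference, the same failure can be shown outside TUPA with a toy parameter collection (a sketch only; `dy.cdiv(loss, dy.scalarInput(0.0))` stands in for `(loss/0)`, since dividing an expression by a literal Python 0 may raise `ZeroDivisionError` before DyNet ever sees it):

```python
import dynet as dy

pc = dy.ParameterCollection()
emb = pc.add_lookup_parameters((10, 4))
trainer = dy.SimpleSGDTrainer(pc)

dy.renew_cg()
loss = dy.squared_norm(dy.lookup(emb, 0))
bad = dy.cdiv(loss, dy.scalarInput(0.0))  # loss/0: inf/nan enters the graph
bad.backward()                            # gradients are now nan
try:
    trainer.update()
except RuntimeError as e:
    print(e)  # e.g. "Magnitude of gradient is bad: -nan"
```

The `Error in update():` prefix in the training log suggests this exception is caught and logged per update, so training runs to completion anyway.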
So now the question is just why all updates result in `-nan` gradients in the normal code.