
Getting RuntimeError for LukeRelationClassification #57

Closed
akshayparakh25 opened this issue Mar 22, 2021 · 13 comments

@akshayparakh25

While trying to replicate the results using the pre-trained model for Relation Classification, I am getting the following error. I looked at the function load_state_dict(); the strict argument is set to False.

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/akshay/re_rc/luke/examples/cli.py", line 132, in <module>
    cli()
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/akshay/re_rc/luke/examples/utils/trainer.py", line 32, in wrapper
    return func(*args, **kwargs)
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/home/akshay/re_rc/luke/examples/relation_classification/main.py", line 110, in run
    model.load_state_dict(torch.load(args.checkpoint_file, map_location="cpu"))
  File "/home/akshay/re_rc/luke/luke/model.py", line 236, in load_state_dict
    super(LukeEntityAwareAttentionModel, self).load_state_dict(new_state_dict, *args, **kwargs)
  File "/home/akshay/pyTorch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LukeForRelationClassification:
	size mismatch for embeddings.word_embeddings.weight: copying a param with shape torch.Size([50266, 1024]) from checkpoint, the shape in current model is torch.Size([50267, 1024]).
	size mismatch for entity_embeddings.entity_embeddings.weight: copying a param with shape torch.Size([2, 256]) from checkpoint, the shape in current model is torch.Size([3, 256]).

I cannot understand the reason behind this. Can somebody please explain?

@ikuyamada
Member

Hi!
The current implementation adds special words and entities (and their embeddings) to the model, which changes the shapes of the embedding matrices. You need to be aware of this when modifying the code.

https://github.com/studio-ousia/luke/blob/master/examples/relation_classification/main.py#L47
https://github.com/studio-ousia/luke/blob/master/examples/relation_classification/main.py#L56
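The mechanism can be sketched in plain Python. The breakdown into "base + special" counts below is an assumption for illustration, not taken from the LUKE code; only the resulting row counts (50267 and 3) come from the error message above:

```python
# Illustrative sketch of why the embedding shapes grow. The split into
# base vocabulary vs. added special rows is hypothetical; only the final
# totals match the "current model" shapes in the traceback.
base_word_vocab = 50265     # RoBERTa vocabulary size (used by LUKE-large)
special_word_tokens = 2     # e.g. entity-marker tokens added by the RC example
base_entity_vocab = 2       # assumed entity vocabulary of the downloaded model
special_entities = 1        # assumed extra special entity added by the RC example

word_rows = base_word_vocab + special_word_tokens
entity_rows = base_entity_vocab + special_entities

# A checkpoint saved without the added rows can no longer be loaded into the
# resized model, which is exactly the size mismatch in the traceback.
print(word_rows, entity_rows)  # 50267 3
```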

@akshayparakh25
Author

akshayparakh25 commented Mar 25, 2021

@ikuyamada Thanks for the heads up. I fine-tuned the model, and the generated checkpoint no longer throws the error, as expected. However, the results I get from fine-tuning and from inference with the generated checkpoint differ vastly.

After fine-tuning:

"test_f1": 0.7204502814258913,
"test_precision": 0.6925638179800222,
"test_recall": 0.7506766917293233

After using the generated checkpoint:

"test_f1": 0.6183343319352906,
"test_precision": 0.6159355416293644,
"test_recall": 0.6207518796992482

Any comments or is there something I am missing?

@ikuyamada
Member

@akshayparakh25 Would you provide commands used to run the fine-tuning and the inference based on the checkpoint?

@akshayparakh25
Author

Command used for fine-tuning:

python -m examples.cli \
    --model-file=luke_large_500k.tar.gz \
    --output-dir=output \
    relation-classification run \
    --data-dir=./../dataset/tacred/tacred \
    --train-batch-size=4 \
    --gradient-accumulation-steps=8 \
    --learning-rate=1e-5 \
    --num-train-epochs=5

For inference based on the checkpoint:

python -m examples.cli \
    --model-file=luke_large_500k.tar.gz \
    --output-dir=output \
    relation-classification run \
    --data-dir=./../dataset/tacred/tacred \
    --checkpoint-file=output/pytorch_model.bin \
    --no-train

@ikuyamada
Member

Thanks for your prompt reply! Can you reproduce the scores based on the publicized checkpoint file by running the same command for inference?

@akshayparakh25
Author

I tried to follow your comment, but I wasn't sure about the special token you mentioned earlier, so I thought pre-training wouldn't create the issue and went ahead with that.

@ikuyamada
Member

Regarding the error mentioned in the first comment, the released checkpoint file of the relation classification task contains a word embedding with shape (50267, 1024) and an entity embedding with shape (3, 256). I think your checkpoint file is different from the publicized checkpoint file.

>>> model_data = torch.load('pytorch_model.bin')
>>> model_data['embeddings.word_embeddings.weight'].shape
torch.Size([50267, 1024])
>>> model_data['entity_embeddings.entity_embeddings.weight'].shape
torch.Size([3, 256])

RuntimeError: Error(s) in loading state_dict for LukeForRelationClassification:
size mismatch for embeddings.word_embeddings.weight: copying a param with shape torch.Size([50266, 1024]) from checkpoint, the shape in current model is torch.Size([50267, 1024]).
size mismatch for entity_embeddings.entity_embeddings.weight: copying a param with shape torch.Size([2, 256]) from checkpoint, the shape in current model is torch.Size([3, 256]).
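One way to diagnose such mismatches before load_state_dict raises is to compare the shapes key by key. A minimal, hypothetical helper follows (shapes written as plain tuples so the sketch runs without PyTorch; with a real checkpoint you would build these dicts from each tensor's .shape after torch.load):

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for every parameter
    whose shape differs between a checkpoint and the current model."""
    return {
        key: (ckpt_shape, model_shapes[key])
        for key, ckpt_shape in checkpoint_shapes.items()
        if key in model_shapes and model_shapes[key] != ckpt_shape
    }

# Shapes taken from the error message in this thread:
ckpt = {
    "embeddings.word_embeddings.weight": (50266, 1024),
    "entity_embeddings.entity_embeddings.weight": (2, 256),
}
model = {
    "embeddings.word_embeddings.weight": (50267, 1024),
    "entity_embeddings.entity_embeddings.weight": (3, 256),
}
print(find_shape_mismatches(ckpt, model))
```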

@akshayparakh25
Author

I think your checkpoint file is different from the publicized checkpoint file.

Do you mean the checkpoint file that I have downloaded is different from the publicized one?

@ikuyamada
Member

Do you mean the checkpoint file that I have downloaded is different from the publicized one?

I do not know why this happens. I have downloaded the checkpoint file to my local computer and confirmed that the shapes are different from those shown in the error message.
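A quick way to rule out a corrupted or partial download is to hash the local file and compare the digest (or at least the file size) against a freshly downloaded copy. A small stdlib-only sketch; the chunked read keeps multi-gigabyte checkpoints out of memory:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: run sha256_of("pytorch_model.bin") on both machines and compare.
```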

@akshayparakh25
Author

Thanks for your prompt reply! Can you reproduce the scores based on the publicized checkpoint file by running the same command for inference?

Thanks for your response. The link shared in this comment is working for me. And the results for the test set:

"test_f1": 0.6442931771410481,
"test_precision": 0.6801470588235294,
"test_recall": 0.6120300751879699

@ikuyamada
Member

ikuyamada commented Mar 25, 2021

I can reproduce the reported results based on the checkpoint file... Did you use poetry to create the environment? This may be related to a mismatch of library versions.
Also, the above link is the same as the link in the README.

@akshayparakh25
Author

I can reproduce the reported results based on the checkpoint file... Did you use poetry to create the environment? This may be related to the mismatch of library versions.
Also, the above link points to the same URL as the link on README.

With reference to your first point: possibly that could be the reason. For the second point, I am not sure why it didn't work the first time.

Thanks for your efforts in resolving the issue.

@lshowway

lshowway commented May 5, 2022

@akshayparakh25
However, the results I get from fine-tuning and from inference with the generated checkpoint differ vastly. After fine-tuning:

"test_f1": 0.7204502814258913,
"test_precision": 0.6925638179800222,
"test_recall": 0.7506766917293233

After using the generated checkpoint:

"test_f1": 0.6183343319352906,
"test_precision": 0.6159355416293644,
"test_recall": 0.6207518796992482

Any comments or is there something I am missing?

I ran into a similar problem: the expected F1 is 72, but I got 64. I have checked the data-loading utils and the evaluation metrics, but I could not solve the problem.
