
Bug in evaluate #4

Closed
Wangpeiyi9979 opened this issue May 19, 2021 · 15 comments

@Wangpeiyi9979

Hi, when I run
python bin/predict_amrs.py \
    --datasets <AMR-ROOT>/data/amrs/split/test/*.txt \
    --gold-path data/tmp/amr2.0/gold.amr.txt \
    --pred-path data/tmp/amr2.0/pred.amr.txt \
    --checkpoint runs/<checkpoint>.pt \
    --beam-size 5 \
    --batch-size 500 \
    --device cuda \
    --penman-linearization --use-pointer-tokens

I get the following error:

RuntimeError: Error(s) in loading state_dict for AMRBartForConditionalGeneration:
size mismatch for final_logits_bias: copying a param with shape torch.Size([1, 53587]) from checkpoint, the shape in current model is torch.Size([1, 53075])

Could you help me?
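(A minimal diagnostic sketch for this kind of mismatch, assuming the checkpoint stores its weights under a 'model' key as in the load call quoted later in this thread; the path below is a placeholder:)

    import torch

    # Illustrative only, not part of the SPRING codebase.
    ckpt_path = "runs/checkpoint.pt"  # placeholder; point this at your checkpoint
    state = torch.load(ckpt_path, map_location="cpu")["model"]

    # final_logits_bias has shape (1, vocab_size), so its last dimension tells
    # you which vocabulary size the checkpoint was saved with.
    print("checkpoint vocab size:", state["final_logits_bias"].shape[-1])
    # If this differs from the size the current code builds (53587 vs 53075 in
    # the error above), the model/tokenizer were instantiated with a different
    # set of special tokens than the ones used when the checkpoint was saved.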

@Wangpeiyi9979
Author

Wangpeiyi9979 commented May 24, 2021

The <checkpoint>.pt is the checkpoint saved during training.

@mbevila
Collaborator

mbevila commented May 24, 2021

So you have trained your own model? Does it work with the pretrained checkpoints we released?

@Wangpeiyi9979
Author

Oh, sorry, it works well after I re-downloaded bart-large. It is strange.
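(For readers hitting the same thing: "re-downloading bart-large" usually just means clearing the cached weights so transformers fetches them again on the next run. The exact cache location depends on the transformers version; the path below is the usual default for the 2.x line pinned in the README, so verify it before deleting anything:)

    # Remove cached model files so they are re-downloaded on the next run.
    rm -rf ~/.cache/torch/transformers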

@eloitanguy

Hello, I believe I am experiencing the same issue.
I have been trying to use predict_amrs_from_plaintext.py to test the pre-trained text-to-AMR checkpoint on my own text files.

I am also hitting Wangpeiyi9979's error:

RuntimeError: Error(s) in loading state_dict for AMRBartForConditionalGeneration:
        size mismatch for final_logits_bias: copying a param with shape torch.Size([1, 53587]) from checkpoint, the shape in current model is torch.Size([1, 53075]).
        size mismatch for model.shared.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).
        size mismatch for model.encoder.embed_tokens.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).
        size mismatch for model.decoder.embed_tokens.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).

Since Wangpeiyi9979 mentioned this issue could come from BART-large, I deleted the entire model cache and tried again, but hit the same error.

Additional information:

  • I am running in a separate conda env (Python 3.7).
  • I have followed the README's installation instructions (notably, I do have transformers 2.11.0, which will be a problem for me later on, since I would like to integrate this AMR pipeline into my research project's pipeline).

I would like to apologise if this error comes from a poor use of your code or an improper installation.

Thank you very much for your work on AMR and thanks in advance for your response!

@mbevila
Collaborator

mbevila commented Jun 29, 2021

Can you try re-downloading the pretrained weights? A few months back (I think) we pruned a few params in the checkpoint that did not play well with the current code.

@eloitanguy

Thank you for your answer! I re-downloaded the 3.0 parsing weights and the same issue arose...

@eloitanguy

Hello again, if you have the time I would greatly appreciate some help; my issue still hasn't been resolved... Thanks in advance!

@mbevila
Collaborator

mbevila commented Jul 5, 2021

Try to use this checkpoint:

https://drive.google.com/file/d/1p7oyQPacWSF-WTXapaA55TPuRP_pJ-Rc/view?usp=sharing

Does it work?

@mbevila mbevila reopened this Jul 5, 2021
@eloitanguy

Thank you for taking the time to send me this checkpoint, I'm sorry to say I still have the same error...

@mbevila
Collaborator

mbevila commented Jul 6, 2021

Clone the repository again and create a new env from scratch. It worked for me! As a last resort, try patching the checkpoint with https://github.com/SapienzaNLP/spring/blob/main/bin/patch_legacy_checkpoint.py, but the checkpoint I sent you was already patched.

@eloitanguy

Hello again, I created a new conda env (with Python 3.7), installed the requirements from requirements.txt, and ran:

python bin/predict_amrs_from_plaintext.py --texts test_text.txt --checkpoint AMR3.amr-lin3.patched.pt

And observed the usual result:

wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
Traceback (most recent call last):
  File "bin/predict_amrs_from_plaintext.py", line 90, in <module>
    model.load_state_dict(torch.load(args.checkpoint, map_location='cpu')['model'])
  File "/home/eloi/miniconda3/envs/spring/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for AMRBartForConditionalGeneration:
	size mismatch for final_logits_bias: copying a param with shape torch.Size([1, 53587]) from checkpoint, the shape in current model is torch.Size([1, 53075]).
	size mismatch for model.shared.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).
	size mismatch for model.encoder.embed_tokens.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).
	size mismatch for model.decoder.embed_tokens.weight: copying a param with shape torch.Size([53587, 1024]) from checkpoint, the shape in current model is torch.Size([53075, 1024]).

I am as puzzled as you are; I really don't see why it would behave differently on our two machines...

@eloitanguy

FYI, I tried the same thing on a different machine and encountered the same error :(

@mbevila
Collaborator

mbevila commented Jul 8, 2021

You are missing two arguments: --penman-linearization --use-pointer-tokens
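For reference, adding those two flags to the command quoted above gives:

    python bin/predict_amrs_from_plaintext.py \
        --texts test_text.txt \
        --checkpoint AMR3.amr-lin3.patched.pt \
        --penman-linearization --use-pointer-tokens

These flags make the script build the same linearization and pointer-token vocabulary the released checkpoint expects; the extra special tokens plausibly account for the 512-entry gap between the two vocabulary sizes in the error (53587 vs 53075).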

@eloitanguy

Thank you very much, this seems to have been my issue all along, embarrassingly enough. The issue is solved. I'll now look into whether it is possible to run this with the latest version of transformers (mentioning it in case you have any insight on that matter), and I'll let you know if it works out! ^^

@mbevila
Collaborator

mbevila commented Jul 8, 2021

Thanks a lot!

In the meantime, I'll close the issue.
