
size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]). #206

Open
Sridhar-Ranganaboina opened this issue Jun 6, 2023 · 6 comments


@Sridhar-Ranganaboina


I am getting this error at line 596 in model.py:
model = super(DonutModel, cls).from_pretrained(pretrained_model_name_or_path, revision="official", *model_args, **kwargs)

@jzsyuan

jzsyuan commented Jun 11, 2023

ignore_mismatched_sizes=True
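
For context, a minimal sketch of that workaround, assuming the model is loaded through DonutModel.from_pretrained as in model.py (the keyword argument is forwarded to transformers' from_pretrained; the checkpoint name below is only an example):

from donut import DonutModel

# Example checkpoint name; substitute the one you are actually loading.
# ignore_mismatched_sizes=True makes transformers skip the mismatched
# downsample.norm / downsample.reduction tensors instead of raising,
# leaving them randomly initialized.
model = DonutModel.from_pretrained(
    "naver-clova-ix/donut-base-finetuned-docvqa",
    ignore_mismatched_sizes=True,
)

Because the skipped weights are re-initialized rather than loaded from the checkpoint, this silences the error but can degrade accuracy; pinning the library versions (see the later comments) is the cleaner fix.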

@gwkrsrch
Collaborator

Hi, thank you for bringing this issue to our attention. The problem is likely related to the environment configuration. We will resolve the issue and update the repository accordingly. In the meantime, please refer to the recently updated Google Colab demos and verify that the versions of the essential libraries match. You can find the necessary information at this link: GitHub Issue Comment

@yaoliUoA

The Google Colab demo has the same size mismatch issue.

I have reproduced the error in "colab-demo-for-donut-base-finetuned-docvqa.ipynb" as well.

/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in _load_pretrained_model(cls, model, state_dict, loaded_keys, resolved_archive_file, pretrained_model_name_or_path, ignore_mismatched_sizes, sharded_metadata, _fast_init, low_cpu_mem_usage, device_map, offload_folder, offload_state_dict, dtype, is_quantized, keep_in_fp32_modules)
3530 "\n\tYou may consider adding ignore_mismatched_sizes=True in the model from_pretrained method."
3531 )
-> 3532 raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
3533
3534 if is_quantized:

RuntimeError: Error(s) in loading state_dict for DonutModel:
size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.

@ariefwijaya

Any update on this?

@crackthedata

Any update on this? I have tried multiple versions of timm and transformers and am still getting the same error.

@david-fm

david-fm commented Dec 4, 2023

Be sure to have the proper versions:

!pip install transformers==4.25.1
!pip install pytorch-lightning==1.6.4
!pip install timm==0.5.4
!pip install gradio
!pip install donut-python

and compare the app code with the corresponding Google Colab notebook.
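
A quick sanity check that the environment actually matches those pins (a minimal sketch; the names are the PyPI distribution names used in the pip commands above):

import importlib.metadata as md

# Print the installed version of each pinned distribution.
for pkg in ("transformers", "pytorch-lightning", "timm", "donut-python"):
    print(pkg, md.version(pkg))

With transformers 4.25.1 and timm 0.5.4, the Swin encoder should be built with the layer dimensions the checkpoint expects, so the downsample size-mismatch error should not occur.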
