About pretrain model #3

Closed
willemeng opened this issue Feb 24, 2023 · 3 comments

Comments

@willemeng

When I train using the training script provided in the README, I get the following messages:

size mismatch for layers.0.blocks.0.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
size mismatch for layers.0.blocks.0.attn.relative_position_index: copying a param with shape torch.Size([144, 144]) from checkpoint, the shape in current model is torch.Size([484, 484]).
size mismatch for layers.0.blocks.1.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
size mismatch for layers.0.blocks.1.attn.relative_position_index: copying a param with shape torch.Size([144, 144]) from checkpoint, the shape in current model is torch.Size([484, 484]).
size mismatch for layers.1.blocks.0.attn.relative_coords_table: copying a param with shape torch.Size([1, 23, 23, 2]) from checkpoint, the shape in current model is torch.Size([1, 43, 43, 2]).
...

The parameter shapes in the pre-trained checkpoint do not match those of the current model.
How can I solve this problem? I ran the script exactly as provided and did not modify any code.
Thanks!
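
For reference, the shapes in the log are consistent with a difference in attention window size rather than a corrupted checkpoint. The sketch below assumes the Swin Transformer V2 buffer layout, where `relative_coords_table` has shape `(1, 2W-1, 2W-1, 2)` and `relative_position_index` has shape `(W*W, W*W)` for a square window of side `W`. That layout is an assumption about this repository, and the helper names are hypothetical.

```python
import math


def window_from_coords_table(shape):
    """Infer window side W from a (1, 2W-1, 2W-1, 2) shape (assumed layout)."""
    _, height, width, _ = shape
    if height != width or height % 2 == 0:
        raise ValueError(f"not a square (2W-1) table: {shape}")
    return (height + 1) // 2


def window_from_position_index(shape):
    """Infer window side W from a (W*W, W*W) shape (assumed layout)."""
    rows, cols = shape
    side = math.isqrt(rows)
    if rows != cols or side * side != rows:
        raise ValueError(f"not a (W*W, W*W) index: {shape}")
    return side


# Shapes copied from the log above.
checkpoint_window = window_from_coords_table((1, 23, 23, 2))
model_window = window_from_coords_table((1, 43, 43, 2))
print(checkpoint_window, model_window)
```

Under that assumption, the checkpoint was built with a window of side 12 and the current model with a window of side 22, and the `relative_position_index` shapes (144 and 484) agree with the same two values.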

@VitorGuizilini-TRI

I am getting a similar problem, and also haven't modified the code at all.

@VitorGuizilini-TRI

Bumping this up. Does anyone know how to fix this?

@Gengzigang
Member

These size mismatches have no impact: the relative_coords_table is computed directly during initialization and does not need to be loaded from the checkpoint.
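
If the messages are distracting, or if a plain PyTorch `load_state_dict` call is used (which raises on size mismatches even with `strict=False`), one option is to drop the mismatched entries before loading. This is a minimal sketch, not the repository's own loading code: `split_by_shape` is a hypothetical helper, and it takes shapes as plain tuples so that it runs without PyTorch. The `qkv.weight` entry and its shape are made up for illustration.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Split checkpoint keys into those safe to load and those to skip.

    A key is kept only if the model has the same key with the same shape.
    """
    keep, skipped = [], []
    for name, shape in checkpoint_shapes.items():
        if name in model_shapes and tuple(shape) == tuple(model_shapes[name]):
            keep.append(name)
        else:
            skipped.append(name)
    return keep, skipped


# Two entries use shapes from the log; the qkv.weight entry is illustrative.
checkpoint_shapes = {
    "layers.0.blocks.0.attn.relative_coords_table": (1, 23, 23, 2),
    "layers.0.blocks.0.attn.relative_position_index": (144, 144),
    "layers.0.blocks.0.attn.qkv.weight": (288, 96),
}
model_shapes = {
    "layers.0.blocks.0.attn.relative_coords_table": (1, 43, 43, 2),
    "layers.0.blocks.0.attn.relative_position_index": (484, 484),
    "layers.0.blocks.0.attn.qkv.weight": (288, 96),
}

keep, skipped = split_by_shape(checkpoint_shapes, model_shapes)
print("load:", keep)
print("skip:", skipped)
```

With real tensors, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, and only the kept keys passed to `model.load_state_dict(filtered, strict=False)`. The skipped buffers then keep the values computed at initialization.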


3 participants