Fix ignore_mismatched_sizes #14085
Conversation
As you can see from the multiple failing tests, this makes the use case where we have a model with a task-specific head fail, so it looks like the fix is more complicated than just swapping the lines.
I believe these failures are due to the added AutoModel (TFAutoModel / FlaxAutoModel) test cases. I'm looking for an appropriate AutoModel output that can test mismatched sizes, but there seem to be some model-specific restrictions or assertions.
Indeed, some of the models are failing because of internal math between the hidden size and other dimensions, whereas others fail for different reasons. Maybe you should adapt your test to only change the `vocab_size`?
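To illustrate why perturbing only the vocabulary dimension is the safe choice, here is a minimal standalone sketch (plain Python, not the actual transformers test): the embedding shape depends on `vocab_size` alone, while inner layers tie the hidden size to other dimensions, so resizing anything else trips model-internal assertions before the mismatch check is even reached. All names and shapes below are made up for the example.

```python
def find_mismatched_keys(checkpoint_shapes, model_shapes):
    """Return keys present in both shape dicts whose shapes differ.

    This mimics, in spirit only, the comparison that
    ignore_mismatched_sizes performs on real tensors.
    """
    return sorted(
        key
        for key, shape in checkpoint_shapes.items()
        if key in model_shapes and model_shapes[key] != shape
    )


# Checkpoint saved with vocab_size=100, hidden_size=32 (hypothetical shapes).
checkpoint = {"embeddings.weight": (100, 32), "dense.weight": (32, 32)}
# Model instantiated with vocab_size=150; hidden_size left untouched,
# so only the embedding matrix changes shape.
model = {"embeddings.weight": (150, 32), "dense.weight": (32, 32)}

print(find_mismatched_keys(checkpoint, model))  # ['embeddings.weight']
```

Because only `embeddings.weight` differs, such a test exercises the mismatch-skipping path without disturbing the hidden-size arithmetic inside the model.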
Force-pushed from 44530c8 to f163c26.
Thanks for fixing the tests! You should also ignore the test for LayoutLmv2 (who wants a

Fixed, thanks for the suggestions.
src/transformers/modeling_utils.py
Outdated
@@ -1513,9 +1513,9 @@ def _load_state_dict_into_model(
    for checkpoint_key in loaded_keys:
        model_key = checkpoint_key
        if remove_prefix and checkpoint_key.startswith(prefix):
@sgugger - do you know why we need the `and checkpoint_key.startswith(prefix)` here? The way I understand it, if `remove_prefix` is True then it's never possible that any loaded key starts with the prefix, no?
`remove_prefix == True` => `has_prefix_module == False` => `checkpoint_key.startswith(prefix)` can never be True, no? What is the case that I'm missing here?
That's correct. The `and` should probably be removed.
-    elif add_prefix:
-        model_key = f"{prefix}.{checkpoint_key}"
+    elif add_prefix:
+        model_key = ".".join(checkpoint_key.split(".")[1:])
This change looks correct to me!
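For context, here is a simplified standalone sketch of the key-remapping logic under discussion. The names mirror the snippet (`prefix`, `remove_prefix`, `add_prefix`); this is an illustration of the intended behavior after the fix, not the actual transformers implementation.

```python
def remap_key(checkpoint_key, prefix, remove_prefix=False, add_prefix=False):
    """Map a checkpoint key to the corresponding model key."""
    if remove_prefix:
        # The model's keys carry the prefix but the checkpoint's do not,
        # so the prefix is prepended. Note that an extra
        # `checkpoint_key.startswith(prefix)` guard would be redundant in
        # this branch: when remove_prefix is True, checkpoint keys never
        # carry the prefix (the point raised in the review above).
        return f"{prefix}.{checkpoint_key}"
    if add_prefix:
        # The checkpoint's keys carry the prefix but the model's do not,
        # so the first dotted component is stripped.
        return ".".join(checkpoint_key.split(".")[1:])
    return checkpoint_key


print(remap_key("embeddings.weight", "bert", remove_prefix=True))
# -> bert.embeddings.weight
print(remap_key("bert.embeddings.weight", "bert", add_prefix=True))
# -> embeddings.weight
```

The two branches are mirror images: one prepends the base-model prefix, the other strips it, which is why swapping their bodies was enough to break `ignore_mismatched_sizes`.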
patrickvonplaten left a comment:
Change looks correct to me!
@sgugger - IMO it would make sense to rename `remove_prefix` to `remove_prefix_from_init_model` to make the code easier to understand. Do you agree? I can open a follow-up PR if that's the case.
I agree with @patrickvonplaten. Maybe we can rename
I agree with @patrickvonplaten's suggestion. Let's merge this PR once the comment is addressed, and then we can do the renaming in a follow-up PR!
You will need to include this commit in your PR branch, as the latest release of PyTorch, 1.10, broke our CI.
What does this PR do?
Fixes #14073
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sgugger