
Fix ignore_mismatched_sizes #14085

Merged
sgugger merged 10 commits into huggingface:master from qqaatw:fix_ignore_mismatched_size_option on Oct 21, 2021

Conversation

qqaatw (Contributor) commented Oct 20, 2021

What does this PR do?

Fixes #14073

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sgugger

sgugger (Collaborator) commented Oct 20, 2021

As you can see from the multiple failing tests, this makes the use case where we have a model with a task-specific head fail, so it looks like the fix is more complicated than just swapping the lines.

qqaatw (Contributor, Author) commented Oct 20, 2021

I believe these failures are due to the newly added AutoModel (TFAutoModel / FlaxAutoModel) test cases. I'm trying to find an appropriate AutoModel output that can be used to test mismatched sizes, but there seem to be some model-specific restrictions or assertions.

sgugger (Collaborator) commented Oct 20, 2021

Indeed, some of the models are failing because of internal math tying the hidden size to other dimensions, while others fail for different reasons. Maybe you should adapt your test to change just the `vocab_size`?
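Conceptually, `ignore_mismatched_sizes` drops checkpoint weights whose shape disagrees with the model's and reports them, instead of raising a size-mismatch error. A stand-alone sketch of that idea (plain tuples instead of tensors, and a hypothetical helper name, not the actual `modeling_utils` code):

```python
# Illustrative sketch of what ignore_mismatched_sizes conceptually does:
# weights whose checkpoint shape differs from the model's shape are removed
# from the state dict (and collected for reporting) rather than raising.
# Shapes are plain tuples here; the real code compares torch tensor shapes.

def filter_mismatched(checkpoint_shapes, model_shapes):
    kept = {}
    mismatched = []
    for key, shape in checkpoint_shapes.items():
        if key in model_shapes and model_shapes[key] != shape:
            # Record (key, checkpoint shape, model shape) and skip loading it.
            mismatched.append((key, shape, model_shapes[key]))
        else:
            kept[key] = shape
    return kept, mismatched

ckpt = {"embeddings.weight": (100, 32), "encoder.weight": (32, 32)}
model = {"embeddings.weight": (10, 32), "encoder.weight": (32, 32)}
kept, mismatched = filter_mismatched(ckpt, model)
assert list(kept) == ["encoder.weight"]
assert mismatched == [("embeddings.weight", (100, 32), (10, 32))]
```

Changing only the `vocab_size` produces exactly one such mismatch (the embedding matrix) without disturbing any dimension that other layers derive from, which is what makes it a convenient test target.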

qqaatw force-pushed the fix_ignore_mismatched_size_option branch from 44530c8 to f163c26 on October 21, 2021 at 10:21
sgugger (Collaborator) commented Oct 21, 2021

Thanks for fixing the tests! You should also skip the test for LayoutLMv2 (which expects a `bbox` input), and there is the encoder-decoder template test to fix; I left a suggestion for that.

qqaatw (Contributor, Author) commented Oct 21, 2021

Fixed, thanks for the suggestions.

```diff
@@ -1513,9 +1513,9 @@ def _load_state_dict_into_model(
 for checkpoint_key in loaded_keys:
     model_key = checkpoint_key
     if remove_prefix and checkpoint_key.startswith(prefix):
```
Contributor commented:

@sgugger - do you know why we need the `and checkpoint_key.startswith(prefix)` here? The way I understand it, if `remove_prefix` is `True` then it's never possible for any loaded key to start with the prefix, no?

`remove_prefix == True` => `has_prefix_module == False` => `checkpoint_key.startswith(prefix)` can never be `True`, no? What is the case that I'm missing here?
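The implication chain above can be checked with a stand-alone sketch (plain strings in place of real state-dict keys; `bert` is an illustrative prefix, and `remove_prefix` is assumed to be derived as `not has_prefix_module and expects_prefix_module`, as the comment implies):

```python
# Illustrative recreation of the prefix bookkeeping under discussion;
# not the actual transformers code.
prefix = "bert"
loaded_keys = ["embeddings.weight", "encoder.layer.0.weight"]  # checkpoint keys, no prefix
expected_keys = [f"{prefix}.{k}" for k in loaded_keys]         # model keys, with prefix

has_prefix_module = any(k.startswith(prefix) for k in loaded_keys)
expects_prefix_module = any(k.startswith(prefix) for k in expected_keys)
remove_prefix = not has_prefix_module and expects_prefix_module

# remove_prefix == True forces has_prefix_module == False, so the extra
# `and checkpoint_key.startswith(prefix)` guard can never hold for any loaded key:
assert remove_prefix
assert not any(k.startswith(prefix) for k in loaded_keys)
```

Which supports the observation that a branch guarded by the extra condition would be dead code whenever `remove_prefix` is `True`.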

Collaborator replied:

That's correct. The `and` should probably be removed.

```diff
-    elif add_prefix:
-        model_key = f"{prefix}.{checkpoint_key}"
+    elif add_prefix:
+        model_key = ".".join(checkpoint_key.split(".")[1:])
```
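Taken on their own, the two branch bodies map checkpoint keys to model keys in opposite directions. A stand-alone sketch (plain strings and a hypothetical helper name; following the conversation, `remove_prefix` here means the checkpoint lacks the prefix the model expects, and `add_prefix` means the checkpoint carries a prefix the model does not):

```python
prefix = "bert"  # illustrative base-model prefix

def to_model_key(checkpoint_key, remove_prefix, add_prefix):
    if remove_prefix:
        # Model keys carry the prefix but checkpoint keys don't,
        # so prepend it to look the weight up in the model.
        return f"{prefix}.{checkpoint_key}"
    elif add_prefix:
        # Checkpoint keys carry the prefix but model keys don't,
        # so strip the leading component.
        return ".".join(checkpoint_key.split(".")[1:])
    return checkpoint_key

assert to_model_key("embeddings.weight", True, False) == "bert.embeddings.weight"
assert to_model_key("bert.embeddings.weight", False, True) == "embeddings.weight"
```

The inverted naming (`remove_prefix` prepending, `add_prefix` stripping) is exactly what motivates the rename proposed later in this thread.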
Contributor commented:

This change looks correct to me!

patrickvonplaten approved these changes Oct 21, 2021

patrickvonplaten (Contributor) left a comment:

Change looks correct to me!

@sgugger - IMO it would make sense to rename `remove_prefix` to `remove_prefix_from_init_model` to make the code easier to understand. Do you agree? I can open a follow-up PR if that's the case.

qqaatw (Contributor, Author) commented Oct 21, 2021

I agree with @patrickvonplaten. Maybe we can rename `add_prefix` to `add_prefix_to_init_model` as well to make it clearer.

sgugger (Collaborator) commented Oct 21, 2021

I agree with @patrickvonplaten's suggestion. Let's merge this PR once the comment is addressed, and then we can do the renaming in a follow-up PR!

sgugger (Collaborator) commented Oct 21, 2021

You will need to include this commit in your PR branch, as the latest PyTorch release (1.10) broke our CI.

@sgugger sgugger merged commit 234cfef into huggingface:master Oct 21, 2021

Successfully merging this pull request may close these issues.

ignore_mismatched_sizes does not work properly
