Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix weight loading issue #14016

Merged

Conversation

ydshieh
Copy link
Collaborator

@ydshieh ydshieh commented Oct 15, 2021

What does this PR do?

Fix #14002 + add 2 more tests

(I uploaded the converted TF checkpoint to ydshieh). Might be better for @patrickvonplaten to upload to patrickvonplaten

Who can review?

@Rocketknight1 @patrickvonplaten @LysandreJik

@ydshieh
Copy link
Collaborator Author

ydshieh commented Oct 15, 2021

The issue in #14002 comes from the fact, when using from_pt=True, the block shown at the end doesn't use load_weight_prefix.

However, it is required to extend the variable scope for the 2 components (encoder & decoder) of a TF composite models - to make the subsequent save_pretrained -> from_pretrained work after a creation from from_encoder_decoder_pretrained.

I feel it would be better to modify load_pytorch_weights_in_tf2_model to address this situation, but I tried to avoid modify this Hugging Face's TF core method.

if from_pt:
from .modeling_tf_pytorch_utils import load_pytorch_checkpoint_in_tf2_model
# Load from a PyTorch checkpoint
return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)

@patrickvonplaten
Copy link
Contributor

Looks great to me!

@ydshieh
Copy link
Collaborator Author

ydshieh commented Oct 28, 2021

Hi, I just realized that TFEncoderDecoderModel is released with v4.12.0 without this fix being merged.

ydshieh added a commit to ydshieh/transformers that referenced this pull request Nov 10, 2021
Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging after offline approval from @Rocketknight1

@patrickvonplaten patrickvonplaten merged commit a67d47b into huggingface:master Nov 15, 2021
patrickvonplaten added a commit that referenced this pull request Nov 15, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 1, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 1, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 2, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 7, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 11, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 19, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 22, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Dec 26, 2021
ydshieh added a commit to ydshieh/transformers that referenced this pull request Jan 6, 2022
sgugger added a commit that referenced this pull request Jan 10, 2022
* Start the work on TFVisionEncoderDecoderModel

* Expose TFVisionEncoderDecoderModel

* fix import

* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()

* reorder

* Apply the fix for checkpoint loading as in #14016

* remove attention_mask + fix VISION_DUMMY_INPUTS

* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting

* fix wrong condition: shape_list(input_ids) == 2

* add tests

* use personal TFViTModel checkpoint (for now)

* Add equivalence tests + projection layer

* style

* make sure projection layer can run

* Add examples

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean comments (need to work on TODOs for PyTorch models)

* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel

* fixes

* Revert changes in PT code.

* Update tests/test_modeling_tf_vision_encoder_decoder.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add test_inference_coco_en for TF test

* fix quality

* fix name

* build doc

* add main_input_name

* Fix ckpt name in test

* fix diff between master and this PR

* fix doc

* fix style and quality

* fix missing doc

* fix labels handling

* Delete auto.rst

* Add the changes done in #14016

* fix prefix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Start the work on TFVisionEncoderDecoderModel

* Expose TFVisionEncoderDecoderModel

* fix import

* Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()

* reorder

* Apply the fix for checkpoint loading as in huggingface#14016

* remove attention_mask + fix VISION_DUMMY_INPUTS

* A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting

* fix wrong condition: shape_list(input_ids) == 2

* add tests

* use personal TFViTModel checkpoint (for now)

* Add equivalence tests + projection layer

* style

* make sure projection layer can run

* Add examples

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean comments (need to work on TODOs for PyTorch models)

* Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel

* fixes

* Revert changes in PT code.

* Update tests/test_modeling_tf_vision_encoder_decoder.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add test_inference_coco_en for TF test

* fix quality

* fix name

* build doc

* add main_input_name

* Fix ckpt name in test

* fix diff between master and this PR

* fix doc

* fix style and quality

* fix missing doc

* fix labels handling

* Delete auto.rst

* Add the changes done in huggingface#14016

* fix prefix

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make style

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@ydshieh ydshieh deleted the fix_tf_enc_dec_weight_loading branch May 5, 2022 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TFEncoderDecoderModel loading TF weights issue
2 participants