
Added support for Tapas Model #520

Conversation

@JuheonChu (Contributor) commented Nov 27, 2022

What does this PR do?

This PR adds BetterTransformer API support for the Tapas model.

Fixes #20372

Before submitting

  • This PR adds support for the BetterTransformer integration for the Tapas model.
  • This PR adds documentation indicating that BetterTransformer integration for Tapas has been added.

Questions

  • Could you advise me on how I can test the BetterTransformer feature for the Tapas model?

To: @younesbelkada, @sgugger

@JuheonChu (Contributor, Author) commented Nov 27, 2022

May I ask whether I am on the right track?
Any comments or feedback would be greatly appreciated.

@michaelbenayoun (Member) left a comment


It looks to me like you did everything well!
I left a few nits and comments.
@younesbelkada will be back on Wednesday, so we should wait for his approval, but great work!

[Review thread on tests/bettertransformer/test_bettertransformer_encoder.py (outdated, resolved)]
[Review thread on optimum/bettertransformer/models/encoder_models.py (outdated, resolved)]
@@ -23,6 +23,8 @@
    ViTLayerBetterTransformer,
    Wav2Vec2EncoderLayerBetterTransformer,
    WhisperEncoderLayerBetterTransformer,
    TapasLayerBetterTransformer,

[A Member left a suggested change on the diff above; the suggestion content was not captured.]
@HuggingFaceDocBuilderDev commented Nov 28, 2022

The documentation is not available anymore as the PR was closed or merged.

Juheon Chu and others added 4 commits November 28, 2022 08:50
@JuheonChu (Contributor, Author)

Thanks for your suggestions and feedback! I reformatted with black. Could we run the CircleCI tests one more time to check whether my reformatted file passes?

@sgugger (Contributor) left a comment


Thanks for your contribution! @younesbelkada good to merge if you are satisfied.

[Review thread on tests/bettertransformer/test_bettertransformer_encoder.py (outdated, resolved)]
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@fxmarty (Collaborator) left a comment


Waiting for the tests to run but it looks good to me!

@fxmarty (Collaborator) commented Nov 28, 2022

Apparently a test is not passing: https://github.com/huggingface/optimum/actions/runs/3566116329/jobs/5992236636

You can run them locally with pytest tests/bettertransformer/test_XXX.py -k "name_of_test" --exitfirst -s to see if a specific test passes.

Can you also run make style for code quality?
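
For example, a hypothetical invocation against this PR's test file (the "tapas" selector is illustrative and must match a real test name):

# hypothetical test selector; adjust "tapas" to the actual test name
pytest tests/bettertransformer/test_bettertransformer_encoder.py -k "tapas" --exitfirst -s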

@JuheonChu (Contributor, Author)

@fxmarty I applied the make style command and committed the result!
Thanks for your suggestion; I will test it out accordingly.
I sincerely appreciate your guidance.

@JuheonChu (Contributor, Author) commented Nov 28, 2022

It seems to me that all of the module imports in tests/bettertransformer/test_bettertransformer_encoder.py are failing. However, I would not like to change the file-system structure based on the suggestion from here. May I ask for any suggestions?

Thank you for your time.

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
JuheonChu and others added 2 commits November 28, 2022 16:39
Call super() in the __init__() to inherit from BetterTransformerBaseLayer

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
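
A minimal sketch of what this commit describes; the exact signature of BetterTransformerBaseLayer.__init__ is an assumption drawn from this conversation, not taken from the optimum source:

class TapasLayerBetterTransformer(BetterTransformerBaseLayer):
    def __init__(self, tapas_layer, config):
        # Call the parent constructor so the base layer's shared setup runs
        # (config handling, validation, etc.); signature assumed.
        super().__init__(config)
        # ... then fuse the Tapas layer's weights into BetterTransformer parameters ...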
@younesbelkada (Contributor) left a comment


Thanks so much for your efforts @JuheonChu !
I left a few comments. I think the conversion is currently failing because the keys you are trying to retrieve do not match: e.g., tapas_layer.attention.query.weight should be tapas_layer.attention.self.query.weight instead (check the detailed modeling file here).
One way to make sure that your conversion worked is to repeatedly run this snippet and debug further 💪 :

from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

model_id = "hf-internal-testing/tiny-random-TapasModel"
model = AutoModel.from_pretrained(model_id)
bt_model = BetterTransformer.transform(model)

Thanks, and let us know if you run into any issues!
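
If the transform succeeds, a quick (hypothetical) follow-up check is to confirm the encoder layers were actually swapped; the attribute path assumes Tapas's BERT-like module layout:

# after the transform, the encoder layers should no longer be plain TapasLayer modules
print(type(bt_model.encoder.layer[0]).__name__)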

@@ -49,6 +49,7 @@
ALL_ENCODER_DECODER_MODELS_TO_TEST = [
    "hf-internal-testing/tiny-random-FSMTModel",
    "hf-internal-testing/tiny-random-BartModel",
    "hf-internal-testing/tiny-random-TapasModel",
Comment (Contributor):

Could you move this to ALL_ENCODER_MODELS_TO_TEST instead? Tapas is an encoder-only model ;)

Comment (Collaborator):

Reading https://huggingface.co/docs/transformers/v4.24.0/en/model_doc/tapas#transformers.TapasModel

The model can behave as an encoder (with only self-attention) as well as a decoder, in which case a layer of cross-attention is added between the self-attention layers, following the architecture described in Attention is all you need by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin.

Comment (Contributor, Author):

Oh I see! Thanks for the insightful explanation! I just moved it! :)

[Review thread on tests/bettertransformer/test_bettertransformer_encoder.py (outdated, resolved)]
self.in_proj_weight = nn.Parameter(
    torch.cat(
        [
            tapas_layer.attention.query.weight,
Comment (Collaborator):

Looking at the Tapas implementation (https://github.com/huggingface/transformers/blob/28247e78819ab9756b81f8df39611c333d099400/src/transformers/models/tapas/modeling_tapas.py#L442), I think we need tapas_layer.attention.self.query.weight here and in the rest; that is why the test is failing!
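
A hedged sketch of the corrected concatenation; the key and value lines extrapolate the same .self fix across the full query/key/value pattern shown in the diff above:

self.in_proj_weight = nn.Parameter(
    torch.cat(
        [
            tapas_layer.attention.self.query.weight,  # note the extra `.self`
            tapas_layer.attention.self.key.weight,
            tapas_layer.attention.self.value.weight,
        ]
    )
)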

Comment (Contributor, Author):

Oh... thank you for providing me with your insights!

@fxmarty (Collaborator) commented Nov 29, 2022

Actually, looking at https://github.com/huggingface/transformers/blob/28247e78819ab9756b81f8df39611c333d099400/src/transformers/models/tapas/modeling_tapas.py#L539

I think the TapasLayerBetterTransformer is not needed. I think we simply need to use the mapping "TapasLayer": BertLayerBetterTransformer,

What do you think @younesbelkada ?
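
A minimal sketch of the suggested mapping change; the registry name and surrounding entries are assumptions for illustration, not the actual optimum source:

# hypothetical converter registry: maps a transformers layer class name
# to the BetterTransformer wrapper that handles it
BETTER_TRANSFORMER_LAYER_MAPPING = {
    # ... existing entries ...
    # Tapas layers share BERT's layer structure, so the BERT converter can be reused:
    "TapasLayer": BertLayerBetterTransformer,
}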

JuheonChu and others added 2 commits November 29, 2022 04:52
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
@younesbelkada (Contributor)

Yes, probably; thanks for the heads-up @fxmarty!
@JuheonChu, you can also modify the mapping with "TapasLayer": BertLayerBetterTransformer and let us know if the conversion works fine!

@JuheonChu (Contributor, Author) commented Nov 29, 2022

Just made the changes accordingly, @younesbelkada. Thank you so much! I will try the conversion and let you know!
Would you mind checking whether the commit above is the one you intended?

@@ -19,6 +19,7 @@
    BertLayerBetterTransformer,
    DistilBertLayerBetterTransformer,
    FSMTEncoderLayerBetterTransformer,
    TapasLayerBetterTransformer,
Comment (Collaborator):

To be removed

@fxmarty (Collaborator) commented Nov 29, 2022

Seems all good now @JuheonChu! Thank you for this PR! So in the end, using BertLayerBetterTransformer was enough to work with Tapas. If you can just address the points above, we are ready to merge :)

@JuheonChu (Contributor, Author) commented Nov 29, 2022

Thank you so much, everyone!! @fxmarty @younesbelkada @sgugger @michaelbenayoun
I just made the changes accordingly, @fxmarty.
Would you mind double-checking that my commits are as you intended?

@@ -17,7 +17,6 @@
from .base import BetterTransformerBaseLayer


class TapasLayerBetterTransformer(BetterTransformerBaseLayer):
    def __init__(self, tapas_layer, config):
Comment (Collaborator):

I meant to remove the class TapasLayerBetterTransformer in full 😅 It is not needed in the end since BertLayerBetterTransformer can handle Tapas.

Comment (Contributor, Author):

I think I removed the TapasLayerBetterTransformer class. Is there anything to be adjusted in this file?

Comment (Collaborator):

On your branch, with optimum dev installed (pip install -e . in the optimum top folder), can you try to run the script @younesbelkada provided and see what happens?

from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

model_id = "hf-internal-testing/tiny-random-TapasModel"
model = AutoModel.from_pretrained(model_id)
bt_model = BetterTransformer.transform(model)

@younesbelkada (Contributor)

Hey @JuheonChu !
I have pushed directly on your branch ;) The tests should now pass 💪

@younesbelkada (Contributor) left a comment


Thanks a lot @JuheonChu for adding Tapas support for BetterTransformer! Great job!
Looking forward to your next contributions 💪

@michaelbenayoun (Member) left a comment


LGTM!

@michaelbenayoun michaelbenayoun merged commit d33801d into huggingface:main Nov 30, 2022
@JuheonChu (Contributor, Author)

@younesbelkada May I ask where I can check the merged BetterTransformer feature for the Tapas model?
