Add cross attentions to TFGPT2Model #14038
Conversation
from transformers import AutoTokenizer

tokenizer_in = AutoTokenizer.from_pretrained("bert-base-cased")
tokenizer_out = AutoTokenizer.from_pretrained("gpt2")

"""Not working, because the PT checkpoint has `encoder.encoder.layer...` while the TF model has `encoder.bert.layer...`."""
Add another data point to the "We need to think about how to do cross-framework model loading better" chart
This looks very similar to the previous PR adding cross-attention to other models. Tests look good and are passing, so I'm happy to approve it.
Just correct the line about
For tf here, it has
Force-pushed from 4b93999 to 403600d
This looks good to me, but asking @patrickvonplaten for a review as well.
Sorry, I tried fixing some merge conflicts, but I think this introduced a new error. @ydshieh could you maybe quickly go into the PR again and fix those last tests? :-) The PR looks good to me otherwise!
It's OK now :)
What does this PR do?
Add cross attention to TFGPT2.
This was previously done in #13222, but we decided to move this to a new PR.
I also added `TFGPT2EncoderDecoderModelTest` with `test_bert2gpt2_summarization`.
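For illustration, a rough usage sketch (not taken from the PR itself) of the cross-attention path this adds on the TF side; the token ids and the encoder tensor are made up:

import tensorflow as tf
from transformers import GPT2Config, TFGPT2Model

# Enable the cross-attention layers in the TF GPT-2 model.
config = GPT2Config(add_cross_attention=True)
decoder = TFGPT2Model(config)

input_ids = tf.constant([[464, 3290, 318, 13779]])               # decoder input tokens (illustrative)
encoder_hidden_states = tf.random.normal((1, 8, config.n_embd))  # e.g. hidden states from a BERT encoder

outputs = decoder(
    input_ids,
    encoder_hidden_states=encoder_hidden_states,
    output_attentions=True,
)
print(outputs.cross_attentions[0].shape)  # (batch, num_heads, target_len, source_len)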