Add T5 Encoder for Feature Extraction #8717

Merged · 38 commits · Nov 30, 2020

Conversation

@agemagician (Contributor) commented Nov 22, 2020

What does this PR do?

While using T5 for feature extraction, I found that the T5 encoder provides better features than the T5 decoder. Hence, it makes sense to expose an encoder-only T5 model, which should roughly halve memory use and inference time when feature extraction is needed rather than conditional generation.
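For context, a minimal usage sketch of the kind of feature extraction this PR enables. The class name `T5EncoderModel` is an assumption here; the commit messages below also refer to it as `T5ModelEncoder`:

```python
# Minimal sketch (not part of the PR diff): feature extraction with the
# encoder-only model; the class name T5EncoderModel is assumed.
import torch
from transformers import T5Tokenizer, T5EncoderModel

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Only the encoder is loaded and run, so memory and compute are roughly half
# of the full seq2seq model; the output is the encoder's last hidden state.
features = outputs.last_hidden_state  # (batch_size, sequence_length, d_model)
```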

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

T5: @patrickvonplaten
TensorFlow: @jplu

@patrickvonplaten (Contributor) commented:
I like it!

@patrickvonplaten (Contributor) left a comment:

I like this addition a lot!

We should also include this new model in the documentation here:
https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/t5.rst#t5model
and add it to the tests here:

all_model_classes = (T5Model, T5ForConditionalGeneration) if is_torch_available() else ()

=> There might be some issues with the tests after adding the model... I'm happy to go into your PR and fix them accordingly :-)

And we can then also add the model to mT5 :-)
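A sketch of the test registration suggested above; the encoder-only class name `T5EncoderModel` is assumed rather than taken from this PR:

```python
# tests/test_modeling_t5.py (sketch): run the encoder-only class through the
# common model tests alongside the existing T5 classes.
all_model_classes = (
    (T5Model, T5ForConditionalGeneration, T5EncoderModel)
    if is_torch_available()
    else ()
)
```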

@jplu (Contributor) left a comment:

Thank you very much for this addition!!

@patrickvonplaten is the T5 master here, so I'll let him review from the modeling point of view. On the TF side, I think we should wait for the PR about the new inputs to be merged. Also, one tiny comment below.

Comment on lines 1453 to 1465
self,
inputs,
attention_mask=None,
encoder_outputs=None,
past_key_values=None,
head_mask=None,
inputs_embeds=None,
use_cache=None,
output_attentions=None,
output_hidden_states=None,
return_dict=None,
training=False,
**kwargs,

I think we can remove the encoder_outputs, past_key_values and use_cache parameters from the list to avoid confusion. Wdyt @patrickvonplaten?
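For illustration, a sketch of what the trimmed TF `call` signature would look like once those decoder-only arguments are dropped (argument names taken from the quoted diff):

```python
# Sketch of the encoder-only TF call signature after removing
# encoder_outputs, past_key_values and use_cache.
def call(
    self,
    inputs,
    attention_mask=None,
    head_mask=None,
    inputs_embeds=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    training=False,
    **kwargs,
):
    ...
```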

agemagician and others added 7 commits on November 24, 2020:
  • change output from Seq2SeqModelOutput to BaseModelOutput (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
  • remove encoder_outputs from the function call, remove the encoder_outputs if statement, remove isinstance from return_dict
  • remove past_key_values and use_cache
  • remove use_cache from the forward method
  • add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
@agemagician (Contributor, Author) commented Nov 24, 2020:

Great, I am glad that you liked it.

Thanks @patrickvonplaten and @jplu for your feedback.

@patrickvonplaten :
I have addressed all your code review comments and also added it to the T5 documentation.
The only missing part is the tests; it would be great if you could add them.

@jplu :
I have removed the unnecessary parameters from the TF model.

Is there anything else needed from my side to merge the pull request?

@patrickvonplaten (Contributor) commented:

> Is there anything else needed from my side to merge the pull request?

I think that's great! I'll fix the tests and merge :-)

@@ -554,7 +554,7 @@ class RagDPRT5Test(RagTestMixin, unittest.TestCase):
def config_and_inputs(self):
question_encoder_tester = DPRModelTester(self)
dpr_config_and_inputs = question_encoder_tester.prepare_config_and_inputs()
generator_tester = T5ModelTester(self, vocab_size=1100, n_positions=30)

n_positions does not exist anymore => it was useless, so it was removed from the tests as well.
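A sketch of the resulting tester construction once the obsolete argument is dropped:

```python
# tests/test_modeling_rag.py (sketch): n_positions is no longer accepted by
# the T5 tester, so it is simply omitted.
generator_tester = T5ModelTester(self, vocab_size=1100)
```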

@patrickvonplaten (Contributor) commented:

Update:

The PR is ready for review, IMO. It would be great if @LysandreJik, @jplu, and @sgugger could take a look :-)

@agemagician (Contributor, Author) commented:

> The PR is ready for review, IMO.

Thanks a lot @patrickvonplaten ^_^

@LysandreJik (Member) left a comment:

Very clean implementation, thanks a lot @agemagician!

Resolved inline review threads:
  • src/transformers/models/mt5/modeling_mt5.py
  • src/transformers/models/t5/modeling_t5.py
  • src/transformers/models/t5/modeling_tf_t5.py (outdated)
patrickvonplaten and others added 4 commits on November 29, 2020, each co-authored by Lysandre Debut <lysandre@huggingface.co>.
@jplu (Contributor) left a comment:

LGTM! Great work!!

config_and_inputs = self.model_tester.prepare_config_and_inputs()
self.model_tester.create_and_check_model(*config_and_inputs)

# is not able to be part of a pipeline

What does it mean, "not able to be part of a pipeline"? Is it because the test fails?


Since the T5 encoder just outputs the encoded hidden states, it cannot really be used in combination with pipeline().
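In other words, a short sketch (reusing the assumed `T5EncoderModel` from the earlier example): the model returns only encoder hidden states and has no task head, so there is nothing for a pipeline task to wrap.

```python
# Sketch: the encoder-only model exposes only hidden states.
outputs = model(**inputs)               # BaseModelOutput, not a Seq2SeqLMOutput
features = outputs.last_hidden_state    # the only tensor callers get back
# No lm_head and no generate() path, hence no pipeline() task can use it directly.
```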

@agemagician (Contributor, Author) commented Nov 29, 2020:

> Very clean implementation, thanks a lot @agemagician!

You are welcome.
I am glad that I could help make the library better, even with a small contribution.
I have to say that without @patrickvonplaten's help, I could not have made it ^_^

@patrickvonplaten merged commit 40ecaf0 into huggingface:master on Nov 30, 2020.
stas00 pushed a commit to stas00/transformers that referenced this pull request on Dec 5, 2020, with the following squashed commit message:
* Add T5 Encoder class for feature extraction

* fix T5 encoder add_start_docstrings indent

* update init with T5 encoder

* update init with TFT5ModelEncoder

* remove TFT5ModelEncoder

* change T5ModelEncoder order in init

* add T5ModelEncoder to transformers init

* clean T5ModelEncoder

* update init with TFT5ModelEncoder

* add TFModelEncoder for TensorFlow

* update init with TFT5ModelEncoder

* Update src/transformers/models/t5/modeling_t5.py

change output from Seq2SeqModelOutput to BaseModelOutput

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove encoder_outputs

1. remove encoder_outputs from the function call.
2. remove the encoder_outputs if statement.
3. remove isinstance from return_dict.

* Authorize missing decoder keys

* remove unnecessary input parameters

remove past_key_values and use_cache

* remove use_cache

remove use_cache from the forward method

* add docstring for T5 encoder

add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING

* change return_dict to dot access

* add T5_ENCODER_INPUTS_DOCSTRING for TF T5

* change TFT5Encoder output type to BaseModelOutput

* remove unnecessary parameters for TFT5Encoder

* remove unnecessary if statement

* add import BaseModelOutput

* fix BaseModelOutput typo to TFBaseModelOutput

* update T5 doc with T5ModelEncoder

* add T5ModelEncoder to tests

* finish pytorch

* finish docs and mt5

* add mt5 to init

* fix init

* remove n_positions

* finish PR

* Update src/transformers/models/mt5/modeling_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/models/mt5/modeling_tf_mt5.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>