
T5 Model : What is maximum sequence length that can be used with pretrained T5 (3b model) checkpoint? #5204

Closed
shamanez opened this issue Jun 23, 2020 · 12 comments

@shamanez
Contributor

As described in the paper, T5 uses a relative attention mechanism, and the answer to this issue says that T5 can use any sequence length, the only constraint being memory.

According to this, can I use T5 to summarize inputs that have more than 512 tokens in a sequence?

@patrickvonplaten
Contributor

Yes you can, but you should be aware that memory requirements quadruple when you double the input sequence length for "normal" self-attention (as in T5): the attention scores form an n × n matrix per head, so doubling n quadruples its size.

So you will quickly run out of memory.

Here is a snippet that shows that you can run input ids longer than config.max_position_embeddings:

import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-base")
model.config.max_position_embeddings  # 512
input_ids = torch.tensor([600 * [0]])  # shape (1, 600)
model(input_ids, decoder_input_ids=input_ids)  # => no error

For more memory-efficient models, you should take a look at Reformer and Longformer.
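For long-input summarization specifically, the Longformer encoder-decoder variant (LED) was later added to transformers; a minimal sketch, assuming the allenai/led-base-16384 checkpoint:

import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

long_document = "..."  # placeholder for several thousand tokens of input text
inputs = tokenizer(long_document, max_length=16384, truncation=True, return_tensors="pt")

# LED uses sparse local attention; the first token is typically given global attention
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(inputs["input_ids"],
                             global_attention_mask=global_attention_mask,
                             num_beams=4,
                             max_length=256)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))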

@patrickvonplaten
Contributor

I hope we will soon have these models ready for summarization

@shamanez
Contributor Author

shamanez commented Jun 23, 2020

Thanks for the quick help.

So basically, the T5 model in Hugging Face can handle arbitrary input sequence lengths, right?
And the second line (model.config.max_position_embeddings) basically shows the default maximum input sequence length, right?

What do you think of the following code (here I simply increase the tokenizer's max_length)?

from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained('t5-small')
tokenizer = T5Tokenizer.from_pretrained('t5-small')

# some_preprocess_text is the raw article text to be summarized
t5_prepared_Text = "summarize: " + some_preprocess_text
tokenized_text = tokenizer.encode(t5_prepared_Text, max_length=1024, truncation=True, return_tensors="pt")

summary_ids = model.generate(tokenized_text,
                             num_beams=4,
                             no_repeat_ngram_size=2,
                             min_length=30,
                             max_length=100,
                             early_stopping=True)

summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)


@shamanez
Contributor Author

Hi, I checked two summary outputs of T5 after using 1024 and 512 as the input sequence length. I do not see any difference in the generated summaries. Any idea why this happens?

@mars997

mars997 commented Feb 15, 2021

Hi, I checked two summary outputs of T5 after using 1024 and 512 as the input sequence length. I do not see any difference in the generated summaries. Any idea why this happens?

Hi, I have the same question. Did you happen to figure out why?

@shamanez
Contributor Author

shamanez commented Feb 15, 2021 via email

Hi, those days I haven't had much of an idea about the Hugging Face models. Since we can pass any length as input, the main parameter should be the minimum generation length. Try changing it.

@mars997

mars997 commented Feb 15, 2021

Hi, those days I haven't had much of an idea about the Hugging Face models. Since we can pass any length as input, the main parameter should be the minimum generation length. Try changing it.

I am still very new to Hugging Face. I have a pretty long text of about 1,500 words. The issue I was having is that when I set max_length=512 or 1024, they return more or less the same summary. Do you know why?

@shamanez
Contributor Author

shamanez commented Feb 15, 2021 via email

I think it is because the minimum length is unchanged. Regardless of the input, the algorithm tries to generate text until it produces the EOS (end-of-sequence) token, so it is common to get the same summary even if you add a few more sentences to the original input.
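To make that concrete, a minimal sketch (reusing model, tokenizer, and tokenized_text from the snippet earlier in this thread; the values are only illustrative) showing that the generation-side length arguments, not the input-side max_length, govern how long the summary is:

# the input-side max_length only caps the source text; the length of the
# generated summary is governed by min_length / max_length passed to generate()
summary_ids = model.generate(tokenized_text,
                             num_beams=4,
                             no_repeat_ngram_size=2,
                             min_length=100,   # force a noticeably longer summary
                             max_length=300,
                             early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))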

@PastelBelem8

Hi, do we have to fine-tune the model when changing the model.config.max_position_embeddings?

@shamanez
Contributor Author

shamanez commented Feb 7, 2022

Not really, because T5 uses relative positional embeddings.
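As a quick check, a minimal sketch (t5-small is just an example checkpoint): T5 stores no absolute position-embedding table, only bucketed relative attention biases, so longer inputs need no new weights:

import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")
print(model.config.relative_attention_num_buckets)  # 32 by default

long_input = torch.zeros((1, 1024), dtype=torch.long)  # dummy ids, length > 512
out = model(input_ids=long_input, decoder_input_ids=long_input[:, :10])
print(out.logits.shape)  # runs without resizing or retraining any weights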

@RenzeLou

RenzeLou commented Jan 3, 2023

I think it is because the minimum length is unchanged. Regardless of the input, the algorithm tries to generate text until it produces the EOS (end-of-sequence) token, so it is common to get the same summary even if you add a few more sentences to the original input.


Personally, I think there is another reason:

First, if you use the off-the-shelf T5-base model to summarize directly (i.e., without fine-tuning), a longer input tends to produce the same output as the 512-token input, because the T5-base model was pre-trained with max_source_length == 512, so the tokens beyond 512 may not be attended to effectively by the T5Attention layers.

But after fine-tuning the T5-base model with a longer max_source_length, an input with a longer max_source_length may give you a different output than with 512.
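For anyone who wants to try that, a minimal sketch of the data-preparation side of such fine-tuning (the field names "document" and "summary" and the value 1024 are only assumptions for illustration):

from transformers import T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
max_source_length = 1024   # longer than the 512 used during pre-training
max_target_length = 128

def preprocess(example):
    # example is assumed to be a dict with raw "document" and "summary" strings
    model_inputs = tokenizer("summarize: " + example["document"],
                             max_length=max_source_length,
                             truncation=True)
    labels = tokenizer(example["summary"],
                       max_length=max_target_length,
                       truncation=True)
    model_inputs["labels"] = labels["input_ids"]  # targets for the decoder
    return model_inputs

# e.g. tokenized = dataset.map(preprocess) for a datasets.Dataset of long articles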

@shanto-Rahman

What is the maximum sequence length for the T5-large?
