
Can PRIMERA accept 16k input? #17

Closed

GabrielLin opened this issue Jul 27, 2022 · 4 comments

Comments

@GabrielLin

Could you please tell me whether the models on HF (https://huggingface.co/allenai/PRIMERA, https://huggingface.co/allenai/PRIMERA-arxiv) can accept 16k input? Can I just set max_length to 16384 so that they accept such a long document? Thanks.

@Wendy-Xiao
Contributor

Hi there,

Yes, it can accept 16k input. However, the models on HF were only pretrained with max_length=4096, so they do not have trained position embeddings for tokens beyond that point. If you would like to use a larger max_length, you can follow the same method used in Longformer-Encoder-Decoder, i.e., simply copy the position embeddings four times and fine-tune the model with the new position embeddings.
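For reference, here is a minimal sketch of the embedding-copying trick described above. It assumes the Hugging Face `transformers` LED implementation of PRIMERA; the attribute path `model.led.encoder.embed_positions` and the config field `max_encoder_position_embeddings` reflect the current LED classes and may vary across library versions.

```python
# Sketch: extend PRIMERA's encoder position embeddings from 4096 to 16384
# by tiling the pretrained embeddings four times, then fine-tuning.
# Assumes the Hugging Face `transformers` LED implementation; attribute
# paths may differ across versions.
import torch
from transformers import LEDForConditionalGeneration

MAX_LEN = 16384  # 4x the pretrained max_length of 4096

model = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")

# Pretrained learned position embeddings: shape (4096, hidden_size).
old_emb = model.led.encoder.embed_positions.weight.data
old_len = old_emb.size(0)

# Tile the pretrained embeddings to cover all 16384 positions and swap
# the enlarged matrix into the existing embedding module.
new_emb = old_emb.repeat(MAX_LEN // old_len, 1)
model.led.encoder.embed_positions.weight = torch.nn.Parameter(new_emb)

# Keep the config in sync so the model accepts the new input length.
model.config.max_encoder_position_embeddings = MAX_LEN
```

After this, you can tokenize with max_length=16384, but note the copied embeddings are only a warm start: the model should be fine-tuned on long inputs so positions beyond 4096 learn useful representations.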

@GabrielLin
Author

Dear @Wendy-Xiao, thank you for your answer. That resolves my concern.

@attekei

attekei commented Nov 8, 2022

@GabrielLin Curious whether you have built a model with max_length=16384. I'm summarising lots of documents at once, and frequently have ~10,000 tokens in total 🙂

I mean, if you have trained such a model, it would be really cool if you could publish it on Hugging Face.

@GabrielLin
Author

Hi @attekei. Thank you for your interest. I am just doing research comparing different models.
