Can PRIMERA accept 16k input? #17
@GabrielLin: Could you please tell me whether the models on HF (https://huggingface.co/allenai/PRIMERA, https://huggingface.co/allenai/PRIMERA-arxiv) can accept 16k input. Can I just set max_length to 16384 to let them accept such a long document? Thanks.

@Wendy-Xiao: Hi there, yes, PRIMERA can accept 16k input. However, the models on HF were only pretrained with max_length=4096, so they have no trained position embeddings for tokens beyond that point. If you would like to use a larger max_length, you can follow the method used in Longformer-Encoder-Decoder: simply copy the position embeddings four times, then fine-tune the model with the new position embeddings.

@GabrielLin: Dear @Wendy-Xiao, thank you for your answer. It resolves my concern.

@attekei: @GabrielLin, curious whether you have built a model with max_length=16384. I'm summarising lots of documents at once, and frequently have ~10,000 tokens in total 🙂 If you have trained such a model, it would be really cool if you could publish it on Hugging Face.

@GabrielLin: Hi @attekei, thank you for your interest. I am just doing research to compare different models.
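The embedding-copying trick described in the answer can be sketched as below. This is an untested sketch, not an official recipe: the helper function is hypothetical, and the commented attribute path (`model.led.encoder.embed_positions`) and class name (`LEDForConditionalGeneration`) follow the Hugging Face LED implementation that PRIMERA builds on, so verify them against your installed transformers version.

```python
import torch

def extend_position_embeddings(embed: torch.nn.Embedding, factor: int) -> torch.nn.Embedding:
    """Return a new embedding whose weight tiles the trained positional
    embeddings `factor` times along the position axis (the same trick LED
    used to stretch BART's positions)."""
    old = embed.weight.data  # shape: (max_positions, hidden_size)
    new = torch.nn.Embedding(old.size(0) * factor, old.size(1))
    new.weight.data = old.repeat(factor, 1)  # copy trained weights factor times
    return new

# Hypothetical application to PRIMERA (untested; attribute path per HF's LED
# code, and the real embed_positions class overrides forward(), so in practice
# you may prefer to tile only its .weight.data in place):
#
#   model = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")
#   pos = model.led.encoder.embed_positions
#   pos.weight.data = pos.weight.data.repeat(4, 1)          # 4096 -> 16384
#   pos.num_embeddings = pos.weight.size(0)
#   model.config.max_encoder_position_embeddings = pos.weight.size(0)
```

After extending, the copied embeddings carry no information about their new absolute positions, which is why the answer above stresses fine-tuning with the new position embeddings before use.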