Why is RoBERTa's max_position_embeddings size 512 + 2 = 514? #1363
Comments
What's your precise question?
The meaning of self.padding_idx in modeling_roberta.py.
It's the index of the padding vector in the embedding table. It's not unique to RoBERTa but far more general, applying to embeddings in general. Take a look at the PyTorch documentation for nn.Embedding.
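To make that concrete, here is a minimal plain-PyTorch sketch (not RoBERTa-specific) of what padding_idx does: the row at that index is all zeros and is excluded from gradient updates.

```python
import torch
import torch.nn as nn

# A small embedding table where index 1 is reserved for padding,
# matching RoBERTa's convention of <pad> = 1.
emb = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=1)

tokens = torch.tensor([[0, 5, 7, 1, 1]])  # trailing 1s are padding tokens
out = emb(tokens)

# Lookups at padding_idx return an all-zero vector, and that row of
# emb.weight is never updated during training.
print(out[0, 3])  # all zeros
print(out[0, 4])  # all zeros
```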
I know that, but I'm confused about why the padding index is 1 while <s> is 0. Is the padding position ignored? And why is the max_position_embeddings size 512 + 2 = 514?
Because those are their indices in the vocab. The max_position_embeddings size is indeed 514; I'm not sure why. The tokenizer seems to handle text correctly with a maximum of 512. Perhaps one of the developers can help with that. I would advise you to change the title of your topic. (See transformers/tokenization_roberta.py, lines 84 to 85 at ae50ad9.)
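A quick way to confirm those vocabulary indices, assuming a recent version of the transformers library is installed (attribute names such as model_max_length may differ in older releases):

```python
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# The special tokens sit at the very start of the vocabulary.
print(tokenizer.bos_token, tokenizer.bos_token_id)  # <s> 0
print(tokenizer.pad_token, tokenizer.pad_token_id)  # <pad> 1
print(tokenizer.eos_token, tokenizer.eos_token_id)  # </s> 2

# The tokenizer works with sequences of up to 512 tokens,
# even though the position embedding table has 514 rows.
print(tokenizer.model_max_length)  # 512
```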
@LysandreJik can chime in if I'm wrong, but afaik the two extra positions come from how the original fairseq implementation assigns position ids.
Answer here in case anyone from the future is curious: facebookresearch/fairseq#1187
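In short, fairseq (and the RoBERTa weights ported from it) offsets position ids by padding_idx + 1: real tokens occupy positions 2 through 513, while padding tokens keep position padding_idx = 1, so the table needs 512 + 2 = 514 rows. A rough sketch of that position-id logic, mirroring what the create_position_ids_from_input_ids helper in modeling_roberta.py does (the function name below is just for illustration):

```python
import torch

def roberta_style_position_ids(input_ids: torch.Tensor, padding_idx: int = 1) -> torch.Tensor:
    """Positions start at padding_idx + 1; padding tokens stay at padding_idx."""
    mask = input_ids.ne(padding_idx).int()
    # cumsum yields 1, 2, 3, ... over non-padding tokens; multiplying by the
    # mask zeroes out padding positions before the offset is added back.
    incremental = torch.cumsum(mask, dim=1) * mask
    return incremental.long() + padding_idx

ids = torch.tensor([[0, 31414, 232, 2, 1, 1]])  # <s> ... </s> <pad> <pad>
print(roberta_style_position_ids(ids))
# tensor([[2, 3, 4, 5, 1, 1]]) -> a full 512-token input ends at position 513,
# so the position embedding needs 514 rows (indices 0..513).
```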
@morganmcg1 Thanks for this, I was getting all kinds of CUDA errors because I had set max_position_embeddings to 512.
❓ Questions & Help
When reading the RoBERTa code, I came across padding_idx = 1 and I don't understand it very well. The comment in the code is still confusing to me.