
RuntimeError: The size of tensor a (1024) must match the size of tensor b (1060) at non-singleton dimension 3 #5

Open
mominabbass opened this issue Mar 20, 2023 · 0 comments

Comments


mominabbass commented Mar 20, 2023

For DBpedia 8-shot on GPT-2, I get the warning "Token indices sequence length is longer than the specified maximum sequence length" followed by the error "RuntimeError: The size of tensor a (1024) must match the size of tensor b (1060) at non-singleton dimension 3" on line 81 of utils.py, where gpt2_model.generate() is called.

One possible solution: the current version of the code does not check whether an encoded sequence exceeds the maximum sequence length GPT-2 can handle (1024 tokens). Passing such a sequence to the model crashes it, since it cannot process a sequence that long.

You can truncate the sequence manually, for example:

    if len(input_ids['input_ids'][0]) > 1023:
        input_ids['input_ids'] = input_ids['input_ids'][:, :1023]
        input_ids['attention_mask'] = input_ids['attention_mask'][:, :1023]

or pass the tokenizer's max_length parameter so that it handles the truncation on its own.
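
For reference, a minimal sketch of the tokenizer-side approach, assuming the Hugging Face transformers GPT2Tokenizer/GPT2LMHeadModel classes and a single-sequence prompt (the actual model/tokenizer setup in utils.py may differ):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Standalone sketch; utils.py may construct the prompt and load the model differently.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "<the 8-shot DBpedia prompt goes here>"  # placeholder

# Reserve room inside GPT-2's 1024-token context window for the generated tokens.
max_new_tokens = 1
encoded = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=1024 - max_new_tokens,
)

output = model.generate(
    input_ids=encoded["input_ids"],
    attention_mask=encoded["attention_mask"],
    max_new_tokens=max_new_tokens,
    pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad-token warning
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0, encoded["input_ids"].shape[1]:]))
```

Note that both the manual slice above and truncation=True drop tokens from the end of the prompt, which for a few-shot prompt can cut off the test example itself; reducing the number of in-context examples may be a safer fix.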

I'm not sure this is an optimal way to circumvent the issue. I'd appreciate it if you could help. Thank you.
