Thanks for the question. Yes, you are right! I think it's fine for training to set the position ids to [0, seq_len), but you might want to be consistent at decoding time; otherwise it might conflict with caching.
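For illustration, here is a minimal sketch (using the stock Hugging Face GPT-2 API, not this repo's code; model and tokenizer names are just illustrative) of why the convention has to stay consistent once the cache is involved:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
seq_len = input_ids.shape[-1]

# Training-style choice: position ids start at 0, ignoring any prefix offset.
position_ids = torch.arange(0, seq_len).unsqueeze(0)
out = model(input_ids, position_ids=position_ids, use_cache=True)

# At the next decoding step the cache already holds `seq_len` past states, so the
# new token's position id must continue the same numbering (here: seq_len), not
# restart at 0 -- otherwise the cached and fresh positions disagree.
next_token = out.logits[:, -1:].argmax(-1)
next_position = torch.tensor([[seq_len]])
out2 = model(next_token, position_ids=next_position,
             past_key_values=out.past_key_values, use_cache=True)
```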
Hi, I found that it causes an error:
position_ids = torch.arange(past_length, input_shape[-1] + past_length, dtype=torch.long, device=device)
For example: past_length == 10, input_shape[-1] == 1024.
I guess the error is caused by position_ids becoming [10, 1034) while tokenizer.model_max_length is 1024, and I didn't notice any other operations on the input_ids length, model_max_length, or position_ids. Could you help me with this error?
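A minimal sketch of one possible guard (a hypothetical helper, not code from this repo): GPT-2's learned position embedding table (wpe) only has n_positions = 1024 rows, so positions 10..1033 index past its end. Truncating the new tokens so that past_length + seq_len never exceeds n_positions avoids the out-of-range lookup:

```python
def clip_to_context(input_ids, past_length, n_positions=1024):
    # Keep only as many new tokens as still fit after the prefix/past offset,
    # so the largest position id stays below n_positions.
    max_new = n_positions - past_length
    return input_ids[:, :max_new]
```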
I found that the position ids are in [prefix_len, prefix_len + seq_len) in modeling_gpt2.py:
position_ids = torch.arange(past_length, input_shape[-1] + past_length, dtype=torch.long, device=device)
PrefixTuning/transformers/src/transformers/modeling_gpt2.py, line 579 (commit 6519d30)
Is it OK to just make the position ids [0, seq_len)? I have not found any use of position embeddings for the prefix matrix.
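For concreteness, a small sketch (a hypothetical helper, not the repo's actual patch) contrasting the current offset behavior with the proposed zero-based one; the prefix itself is injected as key/value pairs and never passes through wpe:

```python
import torch

def build_position_ids(past_length, seq_len, device, start_at_zero=False):
    # Current behavior in modeling_gpt2.py: positions are offset by past_length,
    # i.e. [past_length, past_length + seq_len).
    # Proposed: start at 0, i.e. [0, seq_len), ignoring the prefix offset.
    start = 0 if start_at_zero else past_length
    return torch.arange(start, start + seq_len,
                        dtype=torch.long, device=device).unsqueeze(0)
```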