
Error: IndexError: piece id is out of range. #1

Open
nuochenpku opened this issue Aug 18, 2023 · 8 comments

Comments

@nuochenpku

Hi, when I set the batch size to more than 1, this error occurs: piece id is out of range.
Could you help me fix it?

@Haskely
Owner

Haskely commented Aug 18, 2023

Could you provide the complete error output, please? A screenshot would also be acceptable.

@nuochenpku
Author

[screenshot of the error traceback]

@nuochenpku
Author

I am using transformers 4.32.0.

@Haskely
Owner

Haskely commented Aug 19, 2023

All I can tell from this is that a token_id output by the model is out of range, i.e. its value exceeds the vocabulary size of the tokenizer. There may be something wrong with the tokenizer, but it is unclear why this is happening.

Have you made any modifications to the script? If so, please provide the complete script. If not, that's really confusing...🤯
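As a quick diagnostic before decoding, you could check the generated ids against the vocabulary size yourself. A minimal sketch (the helper name is mine; in real code `vocab_size` would come from `len(tokenizer)`, and 32000 is just the usual LLaMA vocabulary size):

```python
def out_of_range_ids(token_ids, vocab_size):
    """Return the token ids the tokenizer cannot decode.

    SentencePiece raises "IndexError: piece id is out of range"
    exactly when an id falls outside [0, vocab_size).
    """
    return [t for t in token_ids if not (0 <= t < vocab_size)]

# Example with a LLaMA-style vocabulary of 32000 pieces:
print(out_of_range_ids([1, 29871, 32001], 32000))  # [32001]
```

If the list is non-empty, the problem is in generation (or the model's embedding size), not in the decode step itself.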

@nuochenpku
Author

I only modified the batch size. Or can you tell me the tokenizer version?

@Haskely
Owner

Haskely commented Aug 21, 2023

> I just modify the batchsize. or can you tell me the tokenizer version?

The tokenizer and model both come from model_path="OFA-Sys/gsm8k-rft-llama7b-u13b", i.e. https://huggingface.co/OFA-Sys/gsm8k-rft-llama7b-u13b/tree/main , which has only one version, so it can't be a version issue.

Additionally, I noticed you said the issue only arises when batch_size > 1, meaning it runs normally when batch_size = 1, right? If so, the problem is likely with the special tokens. Try setting the pad id manually with tokenizer.pad_token_id = 0 and see if it helps.
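The reason batch_size > 1 is special: batching forces the tokenizer to pad shorter prompts to a common length, so an unset or out-of-range pad_token_id injects invalid ids into the batch. A minimal pure-Python sketch of that padding step (the function name is mine; pad id 0 mirrors the suggested tokenizer.pad_token_id = 0 fix):

```python
def pad_batch(sequences, pad_token_id=0):
    """Left-pad token-id sequences to equal length, as a causal-LM
    tokenizer does when it encodes a batch of prompts."""
    max_len = max(len(s) for s in sequences)
    return [[pad_token_id] * (max_len - len(s)) + s for s in sequences]

# Two prompts of different lengths; the shorter one gets pad ids.
batch = pad_batch([[1, 306, 4966], [1, 3889]])
# Every padded id must stay inside the vocabulary, otherwise
# decoding fails with "piece id is out of range".
```

With batch_size = 1 no padding ever happens, which is why that case works even with a broken pad id.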

@nuochenpku
Author

Sorry for the late response. In fact, tokenizer.pad_token_id = 0 is already set in LlamaTokenizer, so the error still exists.

@nuochenpku
Author

Update: I found that this error happens when I use OFA-Sys/gsm8k-rft-llama7b2-u13b. There is no error with the OFA-Sys/gsm8k-rft-llama7b-u13b checkpoint.
