AssertionError: Non-consecutive added token '<obj>' found. Should have index 50272 but has index 50265 in saved vocabulary. #1
Hi, there seems to be a bug with the transformers version used for training REBEL (4.4.0) regarding the added tokens. I will check whether I can update it in the requirements file without breaking anything, but if you just want to load the model and tokenizer as in the demo.py file, update to a newer transformers version, e.g. 4.12.4, and the issue should be gone.
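The workaround above can be sketched as a small version guard run before loading the tokenizer. This is a minimal sketch: the 4.12.4 threshold comes from the comment above, and the helper names are illustrative, not part of transformers itself.

```python
# Minimal sketch: check that the installed transformers release postdates
# the added-token bug before loading the REBEL tokenizer.
# The 4.12.4 threshold is taken from the maintainer's comment above;
# parse_version/transformers_needs_upgrade are hypothetical helpers.

def parse_version(v: str) -> tuple:
    """Turn a plain 'X.Y.Z' version string into (X, Y, Z) for comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

def transformers_needs_upgrade(installed: str, minimum: str = "4.12.4") -> bool:
    """Return True if the installed version predates the fix."""
    return parse_version(installed) < parse_version(minimum)
```

In demo.py one could call `transformers_needs_upgrade(transformers.__version__)` before `AutoTokenizer.from_pretrained("Babelscape/rebel-large")` and prompt the user to run `pip install -U transformers` if it returns True.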
Hi @LittlePea13, thanks for your great work. So far I have used the newest version of transformers and this bug is still there. Could you reopen the thread and help us solve the bug? Thanks!
Hi @dxlong2000, could you give some more context on how the error happened? Thanks.
For some reason I can run it now. Thanks very much for your reply; we can close it now.
How did you solve it? (same issue)
Hi @zozni, did you update the transformers library?
After reinstalling with the latest version, it worked successfully. Thank you!
Traceback:

```
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/script_runner.py", line 338, in _run_script
    exec(code, module.__dict__)
  File "/home/rahulpal/Documents/rebel-main/demo.py", line 57, in <module>
    tokenizer, model, dataset = load_models()
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/caching.py", line 573, in wrapped_func
    return get_or_create_cached_value()
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/caching.py", line 555, in get_or_create_cached_value
    return_value = func(*args, **kwargs)
  File "/home/rahulpal/Documents/rebel-main/demo.py", line 18, in load_models
    tokenizer = AutoTokenizer.from_pretrained("Babelscape/rebel-large")
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 416, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1705, in from_pretrained
    resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
  File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1811, in _from_pretrained
    f"Non-consecutive added token '{token}' found. "
```