Today I encountered the same error as issue #7 when testing a model prompt-tuned on SST-2 directly on the IMDB movie-review dataset, by replacing the dev.tsv in /original with the IMDB dataset, as mentioned in issue #14.
What I did:
1. Prompt-tune a model checkpoint on SST-2 and save the model.
2. Replace data/original/SST-2/dev.tsv with my own IMDB dataset, formatted correctly.
3. Run tools/generate_k_shot.py again, so that data/k-shot/SST-2/test.tsv now contains the IMDB data.
4. Load the model from step 1 and pass --no_train, --do_predict, --overwrite_cache, and the other necessary flags to run zero-shot on the IMDB dataset. I also cleared the cache before running.
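For concreteness, the step-4 invocation looks roughly like the following. Only the three flags quoted above come from this post; the checkpoint path and the remaining arguments (task name, data directory, and whatever template/mapping arguments were used at training time) are placeholders you would fill in from your own setup:

```shell
# Hypothetical sketch of the zero-shot evaluation command; adapt paths/flags
# to your own checkpoint and the arguments you trained with.
python run.py \
  --model_name_or_path /path/to/sst2-prompt-tuned-ckpt \
  --task_name SST-2 \
  --data_dir data/k-shot/SST-2 \
  --no_train \
  --do_predict \
  --overwrite_cache
```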
The following error occurs:
Traceback (most recent call last):
File "run.py", line 628, in <module>
main()
File "run.py", line 466, in main
if training_args.do_predict
File "/home/yb1025/Research/ML_2/robustness/LM-BFF/src/dataset.py", line 465, in __init__
verbose=True if _ == 0 else False,
File "/home/yb1025/Research/ML_2/robustness/LM-BFF/src/dataset.py", line 585, in convert_fn
other_sent_limit=self.args.other_sent_limit,
File "/home/yb1025/Research/ML_2/robustness/LM-BFF/src/dataset.py", line 243, in tokenize_multipart_input
mask_pos = [input_ids.index(tokenizer.mask_token_id)]
ValueError: 50264 is not in list
This "50264" is the same error as in issue #7 ("Index not in list error when evaluating models zero-shot").
Sorry for the inconvenience, but do you happen to know what might have gone wrong?
Many thanks.
This is caused by truncating the mask token. For most of the templates, the mask token is placed at the end of the sentence, so if the input is too long and exceeds the maximum length, the mask token gets cut off. This can be solved by increasing the model's maximum sequence length (or otherwise ensuring truncation preserves the template and its mask token).
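A minimal, self-contained sketch of the failure mode, using toy token ids except for 50264, which is RoBERTa's `<mask>` id and matches the id in the error message. The short IMDB-length input keeps the template's mask; the long one loses it to right-side truncation, which is exactly what makes `input_ids.index(tokenizer.mask_token_id)` raise `ValueError`:

```python
# Why right-truncation drops the mask token (toy ids; 50264 = RoBERTa <mask>).
MASK_ID = 50264

def build_input(sent_ids, template_tail, max_len):
    # The template appends the mask at the end, e.g. "<sent> It was <mask> ."
    ids = sent_ids + template_tail
    return ids[:max_len]  # naive truncation cuts from the right

tail = [713, 21, MASK_ID, 4]            # stand-in for " It was <mask> ."
short_input = build_input(list(range(10)), tail, max_len=128)
long_input = build_input(list(range(200)), tail, max_len=128)

print(MASK_ID in short_input)  # True  -> .index() succeeds
print(MASK_ID in long_input)   # False -> .index() raises "50264 is not in list"
```

Since IMDB reviews are much longer than SST-2 sentences, almost every example overflows the default length, which is why the error appears only after swapping in the IMDB data.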
Taken from: https://github.com/shi-kejian