
Reproduce the +test Unlimiformer setup #17

Closed
Leonard907 opened this issue May 28, 2023 · 7 comments

@Leonard907

Hi, I want to reproduce the results of the +test Unlimiformer setup from the paper. Based on my understanding, this setup does not require training, so is it possible to load an available checkpoint (like this) and convert it to Unlimiformer as demonstrated in inference-example.py? Are there any settings that I have omitted here? Thanks!
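Concretely, the pattern I have in mind, following inference-example.py, looks roughly like the sketch below. I am assuming Unlimiformer.convert_model can be called with mostly default settings; the exact arguments should be checked against the script itself.

    # A minimal sketch of the inference-example.py pattern: load a
    # fine-tuned BART checkpoint and convert it to Unlimiformer at test
    # time, with no additional training.
    # Assumptions: src/ (with unlimiformer.py) is on the Python path, and
    # convert_model works with default settings; inference-example.py
    # passes more arguments, so check the script for the exact call.
    from transformers import AutoTokenizer, BartForConditionalGeneration
    from unlimiformer import Unlimiformer

    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("abertsch/bart-base-govreport")

    # Inject the Unlimiformer hooks into the standard model.
    model = Unlimiformer.convert_model(model, tokenizer=tokenizer)
    model.eval()

    long_document = "..."  # an input far longer than BART's 1024-token window
    inputs = tokenizer(long_document, truncation=False, return_tensors="pt")
    summary_ids = model.generate(**inputs, max_length=512)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))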

@urialon
Collaborator

urialon commented May 30, 2023

Hi @Leonard907 ,
Thank you for your interest in our work!

Yes, to reproduce these experiments, follow this section of the README.

Specifically, you should take the main command line:

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --model_name_or_path facebook/bart-base \
    --max_source_length 1024 \
    --eval_max_source_length 1024 --do_eval=True \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore

and add --test_unlimiformer --eval_max_source_length 999999 --model_name_or_path abertsch/bart-base-govreport.
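Put together, that gives a command along these lines (a sketch: the last occurrence of a repeated flag wins, so the earlier --eval_max_source_length and --model_name_or_path values are dropped here, and --do_train=False keeps the run to evaluation only):

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --max_source_length 1024 \
    --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore \
    --test_unlimiformer --eval_max_source_length 999999 \
    --model_name_or_path abertsch/bart-base-govreport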
Let us know if you have any issues or questions!

Best,
Uri

@Leonard907
Author

Thank you very much!

@lzp870

lzp870 commented Jul 21, 2023


Hello, I ran the main command line you listed, but I get a "srcIndex < srcSelectDimSize" error. When I delete "--eval_max_source_length 999999", the error goes away. What should I do to run with this flag?

@lzp870

lzp870 commented Jul 21, 2023


Also, is it necessary to set --use_datastore=True?

@urialon
Collaborator

urialon commented Jul 21, 2023

Hi @lzp870 ,
It works for me. The only thing that was missing was adding --tokenizer_name facebook/bart-base, but we will add the tokenizer to the model so it won't be needed in the future.

Setting --use_datastore is useful with extremely long inputs, but it should work either way.

Can you try to (1) git pull the latest version, and (2) run the exact following command line (test only, no training)?

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --model_name_or_path facebook/bart-base \
    --max_source_length 1024 \
    --eval_max_source_length 999999 --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore --test_unlimiformer \
    --model_name_or_path abertsch/bart-base-govreport \
    --tokenizer_name facebook/bart-base

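For extremely long inputs, the datastore mentioned above can be enabled on top of that same command. A sketch, assuming --use_datastore takes a boolean the same way as --do_eval:

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --max_source_length 1024 \
    --eval_max_source_length 999999 --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore --test_unlimiformer \
    --model_name_or_path abertsch/bart-base-govreport \
    --tokenizer_name facebook/bart-base \
    --use_datastore=True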

@abertsch72
Owner

abertsch72 commented Jul 21, 2023

Following up on the above: the tokenizer has now been added to the model! It should now run without explicitly setting --tokenizer_name.

@urialon
Collaborator

urialon commented Aug 17, 2023

Closing due to inactivity. Feel free to re-open or create a new issue if you have any questions or problems.

urialon closed this as completed Aug 17, 2023