
Reproduce the +test Unlimiformer setup #17

Closed
Leonard907 opened this issue May 28, 2023 · 7 comments

@Leonard907

Hi, I want to reproduce the results of the +test Unlimiformer setup from the paper. Based on my understanding, this setup does not require training, so is it possible to load an available checkpoint (like this) and convert it to Unlimiformer as demonstrated in inference-example.py? Are there any settings that I have omitted here? Thanks!
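Concretely, the pattern I have in mind, following inference-example.py, looks roughly like the sketch below. I am assuming Unlimiformer.convert_model can be called with mostly default settings; the exact arguments should be checked against the script itself.

    # A minimal sketch of the inference-example.py pattern: load a
    # fine-tuned BART checkpoint and convert it to Unlimiformer at test
    # time, with no additional training.
    # Assumptions: src/ (with unlimiformer.py) is on the Python path, and
    # convert_model works with default settings; inference-example.py
    # passes more arguments, so check the script for the exact call.
    from transformers import AutoTokenizer, BartForConditionalGeneration
    from unlimiformer import Unlimiformer

    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("abertsch/bart-base-govreport")

    # Inject the Unlimiformer hooks into the standard model.
    model = Unlimiformer.convert_model(model, tokenizer=tokenizer)
    model.eval()

    long_document = "..."  # an input far longer than BART's 1024-token window
    inputs = tokenizer(long_document, truncation=False, return_tensors="pt")
    summary_ids = model.generate(**inputs, max_length=512)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))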

@urialon
Collaborator

urialon commented May 30, 2023

Hi @Leonard907 ,
Thank you for your interest in our work!

Yes, to reproduce these experiments, follow this section of the README.

Specifically, you should take the main command line:

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --model_name_or_path facebook/bart-base \
    --max_source_length 1024 \
    --eval_max_source_length 1024 --do_eval=True \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore

and add --test_unlimiformer --eval_max_source_length 999999 --model_name_or_path abertsch/bart-base-govreport.
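Put together, that gives a command along these lines (a sketch: the last occurrence of a repeated flag wins, so the earlier --eval_max_source_length and --model_name_or_path values are dropped here, and --do_train=False keeps the run to evaluation only):

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --max_source_length 1024 \
    --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore \
    --test_unlimiformer --eval_max_source_length 999999 \
    --model_name_or_path abertsch/bart-base-govreport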
Let us know if you have any issues or questions!

Best,
Uri

@Leonard907
Author

Thank you very much!

@lzp870

lzp870 commented Jul 21, 2023


Hello, I ran the main command line you listed, but I get a "srcIndex < srcSelectDimSize" error. When I delete "--eval_max_source_length 999999", the error goes away. What should I do to run with this flag?

@lzp870

lzp870 commented Jul 21, 2023


Also, is it necessary to set --use_datastore=True?

@urialon
Collaborator

urialon commented Jul 21, 2023

Hi @lzp870 ,
It works for me. The only thing that was missing was adding --tokenizer_name facebook/bart-base, but we will add the tokenizer to the model so it won't be needed in the future.

Setting --use_datastore is useful with extremely long inputs, but it should work either way.

Can you try to (1) git pull the latest version, and (2) run the exact following command line (test only, no training)?

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --model_name_or_path facebook/bart-base \
    --max_source_length 1024 \
    --eval_max_source_length 999999 --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore --test_unlimiformer \
    --model_name_or_path abertsch/bart-base-govreport \
    --tokenizer_name facebook/bart-base

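For extremely long inputs, the datastore mentioned above can be enabled on top of that same command. A sketch, assuming --use_datastore takes a boolean the same way as --do_eval:

python src/run.py \
    src/configs/model/bart_base_sled.json \
    src/configs/training/base_training_args.json \
    src/configs/data/gov_report.json \
    --output_dir output_train_bart_base_local/ \
    --learning_rate 1e-5 \
    --max_source_length 1024 \
    --eval_max_source_length 999999 --do_eval=True --do_train=False \
    --eval_steps 1000 --save_steps 1000 \
    --per_device_eval_batch_size 1 --per_device_train_batch_size 2 \
    --extra_metrics bertscore --test_unlimiformer \
    --model_name_or_path abertsch/bart-base-govreport \
    --tokenizer_name facebook/bart-base \
    --use_datastore=True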

@abertsch72
Owner

abertsch72 commented Jul 21, 2023

Following up on the above: the tokenizer has now been added to the model! It should now run without explicitly setting --tokenizer_name.

@urialon
Collaborator

urialon commented Aug 17, 2023

Closing due to inactivity. Feel free to re-open or create a new issue if you have any questions or problems.

urialon closed this as completed Aug 17, 2023