
Checkpoints for CALM model #3

Closed
metodj opened this issue Apr 4, 2024 · 6 comments


metodj commented Apr 4, 2024

Hi!

Thanks for the great repo! I was wondering whether it would be possible to release checkpoints for the CALM model as well, to enable experimentation with this baseline without having to fine-tune it myself.

Best,

Metod

raymin0223 (Owner) commented:

Hi @metodj,

Thanks for your interest in our paper! You can download the checkpoints here.
Note that the "weighted_ce" checkpoints are the CALM models; please let me know if you run into any problems.

@raymin0223 raymin0223 added the request Request something label Apr 4, 2024

metodj commented Apr 15, 2024

Hi @raymin0223,

Thanks for your answer!

I downloaded the checkpoints from the link you provided; however, it seems a tokenizer is missing. Concretely, downloading the CNN/DM weighted_ce pretrained model to a folder ./save/cnndm_t5_large_weighted_ce and then running run_sum_cnndm.sh:

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.run --nproc_per_node=1 \
    run_summarization.py \
    --model_name_or_path ./save/cnndm_t5_large_weighted_ce/ \
    --do_eval \
    --dataset_name cnn_dailymail \
    --dataset_config_name "3.0.0" \
    --output_dir ./save/cnndm_t5_large_weighted_ce/ \
    --per_device_eval_batch_size 1 \
    --deploy_scenario True \
    --use_synchronize True \
    --overwrite_output_dir \
    --predict_with_generate \
    --source_prefix "summarize: " \
    --use_early_exit True \
    --exit_conf_type softmax \
    --exit_conf_threshold 0.9 \
    --exit_min_layer 4

results in the following error:

Traceback (most recent call last):                                                                                                                                               
  File "run_summarization.py", line 585, in <module>                                                                                                                             
    main(model_args, data_args, training_args, additional_args, model_cls, trainer_cls)                                                                                          
  File "run_summarization.py", line 222, in main                                                                                                                                 
    tokenizer = AutoTokenizer.from_pretrained(                                                                                                                                   
  File "/home/mjazbec/miniconda3/envs/calm/lib/python3.8/site-packages/transformers/models/auto/tokenization_auto.py", line 720, in from_pretrained                              
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)                                                                                
  File "/home/mjazbec/miniconda3/envs/calm/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1795, in from_pretrained                                   
    raise EnvironmentError(                                                                                                                                                      
OSError: Can't load tokenizer for './save/cnndm_t5_large_weighted_ce/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure './save/cnndm_t5_large_weighted_ce/' is the correct path to a directory containing all relevant files for a T5TokenizerFast tokenizer.

Could you upload the tokenisers as well?

Thanks!
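For context, the --exit_conf_type softmax, --exit_conf_threshold 0.9, and --exit_min_layer 4 flags in the script above describe a softmax-confidence early-exit rule: decoding stops at an intermediate layer once the top softmax probability clears the threshold. A minimal sketch of that decision (hypothetical helper names, not the repo's actual implementation):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_exit(logits, layer, conf_threshold=0.9, min_layer=4):
    """Exit early iff we are past min_layer and the model's top
    softmax probability at this layer exceeds conf_threshold."""
    if layer < min_layer:
        return False
    return max(softmax(logits)) >= conf_threshold
```

With a sharply peaked distribution (e.g. one logit far above the rest) the rule fires at any layer past the minimum, while a flat distribution keeps the model computing deeper layers.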


metodj commented Apr 15, 2024

Aha, I resolved the issue by downloading tokenizer.json from here. It might be worth adding this step to the README.md in case someone else wants to use your checkpoints in the future and runs into a similar issue.
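For anyone hitting the same OSError, a quick pre-flight check for tokenizer files in the checkpoint directory can save a failed run. The file names below are the common Hugging Face defaults and are an assumption; per this thread, copying tokenizer.json from the base t5-large repo was enough:

```python
from pathlib import Path

# Common Hugging Face tokenizer file names (assumed defaults, not
# an exhaustive or repo-specific list).
TOKENIZER_FILES = ("tokenizer.json", "tokenizer_config.json",
                   "special_tokens_map.json")

def missing_tokenizer_files(ckpt_dir):
    """Return the tokenizer files absent from a local checkpoint dir,
    so they can be fetched before AutoTokenizer.from_pretrained runs."""
    ckpt = Path(ckpt_dir)
    return [name for name in TOKENIZER_FILES if not (ckpt / name).exists()]
```

Running this on ./save/cnndm_t5_large_weighted_ce before run_summarization.py would have flagged the missing tokenizer.json up front.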

raymin0223 (Owner) commented:

Hi @metodj,

Thanks so much for finding and fixing the errors! I've updated the README.md as you advised.
If you have any more questions, please feel free to reopen the issue. Thanks again!


metodj commented Apr 16, 2024

In run_qa_squad.sh, run_summarization.py is called, which I suppose should be run_question_answering.py instead, right? And the same for run_tr_iwlst.sh?
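Until the scripts are fixed upstream, the swap can be applied locally. A sketch assuming GNU sed and that the script sits in the repo root (run_tr_iwlst.sh would need the analogous swap to its own translation entry point, whose name isn't confirmed in this thread):

```shell
# Replace the wrong entry point in place (GNU sed's -i flag assumed).
sed -i 's/run_summarization\.py/run_question_answering.py/' run_qa_squad.sh
```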

raymin0223 (Owner) commented:

Oh no, such silly mistakes 😂 You are right, and thanks so much for reporting these bugs!!
