-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Base-size pre-trained models #1651
Comments
|
@XinnuoXu Hi, Have you evaluated the bart.large.cnn model? Did you get the same R-2 score on CNN/DM datase as published? I used pre-trained model to fine-tune CNN/DM training. But the ROUGE-2 is 19.19 (R-2 in published paper is 21.28). |
@YizhuLiu you need to use the right max-len, min-len, Len-penalty and beam size values. |
@yinhanliu Thank you for your reply. We set these values as shown in "Evaluating the bart.large.cnn model": beam=4, lenpen=2.0, max_len_b=140, min_len=55. With this setting, the R-2 score is 20.03. Are they right? If not, how can I get the same R-2 score on CNN/DM as published? |
Will the Bart base-size(6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models be released? I would like to play with them and it is hard for me to fine-tune the large model. |
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment! |
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you! |
❓ Questions and Help
What is your question?
Does Bart offer base-size(6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models? Since in the summarization task, the baseline BERTSUMABS is trained on bert-base(12-layer encoder, 6-layer decoder, both hidden size 768), have you ever compared base-size Bart with it?
Could you please offer a README file for XSum (similar with the CNN one)?
How much time does the XSum fine-tuning take with smaller GPUs (like 4 11GB GPUs)?
@myleott @yinhanliu @ngoyal2707
The text was updated successfully, but these errors were encountered: