
Base-size pre-trained models #1651

Closed

XinnuoXu opened this issue Jan 27, 2020 · 7 comments

Comments


XinnuoXu commented Jan 27, 2020

❓ Questions and Help

What is your question?

  1. Does BART offer base-size (6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models? In the summarization task, the baseline BERTSUMABS is trained on bert-base (12-layer encoder, 6-layer decoder, both with hidden size 768); have you ever compared a base-size BART against it?

  2. Could you please offer a README file for XSum (similar to the CNN one)?

  3. How much time does the XSum fine-tuning take with smaller GPUs (e.g., 4 11GB GPUs)?

@myleott @yinhanliu @ngoyal2707

@XinnuoXu XinnuoXu changed the title Smaller pre-trained models Bert-base size pre-trained models Jan 28, 2020
@XinnuoXu XinnuoXu changed the title Bert-base size pre-trained models Base-size pre-trained models Jan 28, 2020
@yinhanliu

  1. Our base model is trained on Wikipedia/BookCorpus only.
  2. Will do.
  3. We use 16 32GB GPUs for 1 hour (30K steps), so in your case it would be about 8 hours.


YizhuLiu commented Feb 9, 2020

@XinnuoXu Hi, have you evaluated the bart.large.cnn model? Did you get the same R-2 score on the CNN/DM dataset as published? I used the pre-trained model to fine-tune on CNN/DM, but the ROUGE-2 is 19.19 (the R-2 in the published paper is 21.28).
Thank you very much!

@yinhanliu

@YizhuLiu you need to use the right max-len, min-len, length-penalty, and beam-size values.

@YizhuLiu

@yinhanliu Thank you for your reply. We set these values as shown in "Evaluating the bart.large.cnn model": beam=4, lenpen=2.0, max_len_b=140, min_len=55. With this setting, the R-2 score is 20.03. Are these values correct? If not, how can I get the same R-2 score on CNN/DM as published?
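For reference, a minimal evaluation sketch in the style of fairseq's CNN/DM summarization README, assuming `test.source` holds one article per line; note the `no_repeat_ngram_size=3` setting from that README, which is easy to miss and can noticeably affect ROUGE:

```python
# Minimal sketch: generate CNN/DM summaries with the published settings.
# Assumes fairseq is installed and test.source contains one article per line.
import torch

bart = torch.hub.load('pytorch/fairseq', 'bart.large.cnn')
bart.cuda()
bart.eval()
bart.half()

batch_size = 32
with open('test.source') as fin, open('test.hypo', 'w') as fout:
    articles = [line.strip() for line in fin]
    for i in range(0, len(articles), batch_size):
        batch = articles[i:i + batch_size]
        with torch.no_grad():
            # no_repeat_ngram_size=3 follows the fairseq CNN/DM README;
            # leaving it out is a common source of lower ROUGE scores.
            hypotheses = bart.sample(
                batch,
                beam=4, lenpen=2.0, max_len_b=140, min_len=55,
                no_repeat_ngram_size=3,
            )
        for hypo in hypotheses:
            fout.write(hypo + '\n')
```

ROUGE can then be computed against the reference file (e.g., with files2rouge, as in the fairseq README).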

@ricardorei

Will the BART base-size (6-layer encoder, 6-layer decoder, hidden size 768) pre-trained models be released? I would like to play with them, and it is hard for me to fine-tune the large model.
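If such a checkpoint is published, loading it should mirror the existing hub interface; a hypothetical sketch, where the `bart.base` entry-point name is an assumption rather than a confirmed release:

```python
# Hypothetical sketch: load a base-size BART (6-layer encoder/decoder,
# hidden size 768) the same way as the large checkpoints.
# The 'bart.base' hub name is an assumption, not a confirmed release.
import torch

bart = torch.hub.load('pytorch/fairseq', 'bart.base')
bart.eval()

# Quick sanity check with the denoising objective: fill in a masked span.
print(bart.fill_mask(['The cat <mask> on the mat.'], topk=3))
```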


stale bot commented Apr 17, 2022

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!

@stale stale bot added the stale label Apr 17, 2022

stale bot commented Apr 28, 2022

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

@stale stale bot closed this as completed Apr 28, 2022