
Works for T5/BART? #13

Closed
danyaljj opened this issue Dec 4, 2020 · 4 comments

@danyaljj

danyaljj commented Dec 4, 2020

Very cool work!

Does this work for T5/BART models as well?

@danyaljj
Author

danyaljj commented Dec 4, 2020

Side note: it'd be good to update the transformers dependency to the latest (v4.0.0).

@lvwerra
Member

lvwerra commented Dec 17, 2020

You are right; when I have time I'll upgrade it to v4.0.0. I haven't tested it, but I suspect that any model with a text-generation head should work. Note that you need to add a value head to your model architecture (see here).
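
For illustration, here is a minimal sketch of what such a value head could look like on top of a plain transformers T5 model. The wrapper class and names are just illustrative assumptions, not this repo's API:

```python
# Minimal sketch: attach a scalar value head to a seq2seq model such as T5.
# Illustrative only; T5WithValueHead and its fields are assumptions, not trl's API.
import torch.nn as nn
from transformers import T5ForConditionalGeneration, T5Tokenizer

class T5WithValueHead(nn.Module):
    def __init__(self, model_name="t5-small"):
        super().__init__()
        self.model = T5ForConditionalGeneration.from_pretrained(model_name)
        # One value estimate per decoder position, computed from the
        # decoder's final hidden states.
        self.value_head = nn.Linear(self.model.config.d_model, 1)

    def forward(self, input_ids, attention_mask, decoder_input_ids):
        outputs = self.model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_input_ids=decoder_input_ids,
            output_hidden_states=True,
        )
        hidden = outputs.decoder_hidden_states[-1]    # (batch, tgt_len, d_model)
        values = self.value_head(hidden).squeeze(-1)  # (batch, tgt_len)
        return outputs.logits, values

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5WithValueHead("t5-small")
enc = tokenizer(["translate English to German: Hello"], return_tensors="pt")
dec = tokenizer(["Hallo"], return_tensors="pt")
logits, values = model(enc.input_ids, enc.attention_mask, dec.input_ids)
```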

@danyaljj
Author

I can try it. Other than checking that it runs without errors, how else can I test that the code is working correctly? Is there a benchmark or a quantitative way of verifying it?

@lvwerra
Member

lvwerra commented Jan 17, 2021

Monitoring the rewards on the IMDb dataset would be a good start. For GPT-2 it takes only 1-2h to train.
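
As a rough sketch of that kind of reward monitoring, you can score generations on IMDb prompts with an off-the-shelf sentiment classifier and watch the mean reward rise over training. The model names and sample sizes below are placeholders, not the exact setup used here:

```python
# Sanity check: generate continuations for IMDb prompts and use a sentiment
# classifier's P(positive) as the reward. Model names are placeholders.
import numpy as np
from datasets import load_dataset
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # policy being tuned (placeholder)
sentiment = pipeline("sentiment-analysis")             # reward model (default classifier)

# Short IMDb prompts; generate continuations and score their sentiment.
prompts = [x["text"][:64] for x in load_dataset("imdb", split="test[:16]")]
generations = generator(prompts, max_length=80, do_sample=True)
continuations = [g[0]["generated_text"] for g in generations]

# A mean reward that increases over training epochs is a quick quantitative
# signal that the optimization is doing something sensible.
scores = sentiment(continuations)
rewards = [s["score"] if s["label"] == "POSITIVE" else 1.0 - s["score"] for s in scores]
print(f"mean reward: {np.mean(rewards):.3f}")
```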
