adding decoder trainer #53
I think we can deduplicate some code between T5 and GPT trainers, since a lot of these are shared. Can we write a common parent class that we can specialize into these two?
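A rough sketch of the kind of split I have in mind. All names here (`BaseTrainer`, `format_example`, the dictionary keys) are hypothetical and not taken from this PR. The shared steps sit in the parent, and each subclass only overrides how an example is laid out for its architecture:

```python
from abc import ABC, abstractmethod


class BaseTrainer(ABC):
    """Hypothetical parent class holding the logic both trainers share."""

    def __init__(self, model_name: str, max_length: int = 128):
        self.model_name = model_name
        self.max_length = max_length

    def preprocess(self, source: str, target: str) -> dict:
        # Shared step: normalize whitespace, then let the subclass
        # decide how the example is laid out for its architecture.
        return self.format_example(source.strip(), target.strip())

    @abstractmethod
    def format_example(self, source: str, target: str) -> dict:
        """Return the model-specific layout of one training example."""


class T5Trainer(BaseTrainer):
    def format_example(self, source: str, target: str) -> dict:
        # Encoder-decoder: input and label stay separate sequences.
        return {"input_text": source, "label_text": target}


class GPTTrainer(BaseTrainer):
    def format_example(self, source: str, target: str) -> dict:
        # Decoder-only: prompt and continuation are one sequence,
        # and the labels are that same sequence.
        text = f"{source} {target}"
        return {"input_text": text, "label_text": text}
```

With this shape, anything that does not depend on the architecture (argument handling, logging, checkpointing) would live once in the parent, and the tests for the shared parts would only need to be written once.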
Things are looking good! My suggested changes were all concerning:
- breaking up a test case
- renaming a class
- fixing comments/docstrings.
Otherwise should be able to merge soon!
LGTM
Description
All the Seq2Seq models are appropriate for our project, thus I added a decoder-only (text-generation) model to our Trainer, with new unit tests.
References
Seq2SeqModel #52
Blocked by