Hi,
Nice library, thanks for your work :)
As far as I understand the code, it natively supports AutoModelForCausalLM (decoder-only models) but currently does not handle AutoModelForSeq2SeqLM (encoder-decoder models), right?
Conceptually, they shouldn't be that different to implement from AutoModelForCausalLM, and supporting them would be great for my use case. Are they on the roadmap, or could you give me some hints on which pitfalls to avoid when trying to patch it in myself? For example, how to keep the gradients for the encoder.
Thanks!