It would be nice to have support for Hugging Face's gradient checkpointing capabilities.
With models like BERT, you can simply do:
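A minimal sketch of what this can look like, assuming a recent `transformers` release in which `gradient_checkpointing_enable()` is available on `PreTrainedModel`. The tiny `BertConfig` values and the dummy loss are assumptions, chosen only so the example runs without downloading weights:

```python
import torch
from transformers import BertConfig, BertModel

# Hypothetical small config so the sketch runs quickly and offline.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)

# Trade compute for memory: activations are recomputed during the backward pass.
model.gradient_checkpointing_enable()
model.train()  # checkpointing only takes effect in training mode

# Dummy batch: 2 sequences of 8 random token ids.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
outputs = model(input_ids=input_ids)

# Placeholder loss, only to show that backward works with checkpointing enabled.
loss = outputs.last_hidden_state.mean()
loss.backward()
```

With a pretrained checkpoint, the same call would follow `BertModel.from_pretrained(...)`.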
Here's the docs on the subject.
I think this could be a good option for models that GradCache doesn't support (or the two could perhaps be combined to save even more memory, though I'm not sure they are compatible).