Your code helped a lot to understand the chunking process. When I'm trying to fine-tune with a token length of 4000+, the model breaks with an out-of-memory exception. I have tried a batch size of 2 on a larger 48 GB GPU as well. I can see we are continuously pushing tensors onto the GPU, which causes memory exhaustion. Is there a way to better manage memory for samples represented by 4000+ tokens?
Hi, we made some major changes in this repo. One added feature is the parameter maximal_text_length, which allows truncation before the chunking process. As you mentioned, processing longer texts requires a lot of GPU memory. Setting the parameter to something like 4096 or 2048 could be a reasonable compromise between memory constraints and using the longer context.
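For illustration, a minimal sketch of that idea, assuming a Hugging Face tokenizer and a fixed per-chunk window; the helper function, tokenizer name, and chunk size below are assumptions for the example, only maximal_text_length comes from the comment above:

```python
# Sketch (not the repo's exact API): truncate tokenized input to
# maximal_text_length before chunking, so very long samples do not
# exhaust GPU memory during fine-tuning.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed tokenizer

MAXIMAL_TEXT_LENGTH = 4096  # cap applied before chunking (e.g. 4096 or 2048)
CHUNK_SIZE = 512            # model's per-chunk context window (assumed)


def tokenize_and_chunk(text: str):
    # Tokenize with truncation so no sample exceeds maximal_text_length tokens.
    encoded = tokenizer(
        text,
        truncation=True,
        max_length=MAXIMAL_TEXT_LENGTH,
        return_tensors="pt",
    )
    input_ids = encoded["input_ids"][0]
    attention_mask = encoded["attention_mask"][0]

    # Split the (already truncated) sequence into fixed-size chunks; each
    # chunk can then be moved to the GPU only when it is actually processed,
    # instead of pushing the full 4000+ token sample at once.
    return [
        (input_ids[i : i + CHUNK_SIZE], attention_mask[i : i + CHUNK_SIZE])
        for i in range(0, input_ids.size(0), CHUNK_SIZE)
    ]
```

With this cap, a 4000+ token sample is cut to at most MAXIMAL_TEXT_LENGTH tokens before chunking, so the number of chunks per sample (and the peak GPU memory) is bounded.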