I would like to resume training from the last checkpoint and last batch ID to handle training interruptions.
Resuming training/fine-tuning from a batch_id isn't currently possible because that functionality was removed from the upstream ColBERT repository at some point. I opened stanford-futuredata/ColBERT#307 but it hasn't gotten any traction so far, so I'm raising it here to discuss how we can bring this feature back. Any ideas?
Hello all,
I would like to resume training from the last checkpoint and last batch ID to handle training interruptions. I see some remnants of a possible implementation here, but they're commented out.
Also, #43 mentions that resume_optimizer is implemented; however, there is no other reference to the parsed argument:
grep -r "resume_optimizer" .
./colbert/utils/parser.py: # NOTE: Providing a checkpoint is one thing, --resume is another, --resume_optimizer is yet another.
./colbert/utils/parser.py: self.add_argument('--resume_optimizer', dest='resume_optimizer', default=False, action='store_true')
Could you help me work out how to implement resume and resume_optimizer again? That way I can handle training interruptions in my pipeline and also contribute back to the repository with examples.
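To make the request concrete, here is a minimal, framework-agnostic sketch of the behaviour being asked for: save the model state, the optimizer state, and the last completed batch index together, then skip the batches already consumed when restarting. This is not ColBERT's or RAGatouille's API. The names `save_checkpoint`, `load_checkpoint`, `train`, `step_fn`, and `save_every` are hypothetical, and the sketch assumes the batches are replayed in the same order on every run. In a PyTorch trainer the two state objects would typically be `model.state_dict()` and `optimizer.state_dict()`, written with `torch.save` instead of `pickle`.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, batch_idx, model_state, optimizer_state):
    """Write the training state atomically so an interruption cannot leave a partial file."""
    payload = {
        "batch_idx": batch_idx,  # index of the last fully processed batch
        "model_state": model_state,
        "optimizer_state": optimizer_state,
    }
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(payload, f)
    os.replace(tmp_path, path)  # atomic rename within the same filesystem


def load_checkpoint(path):
    """Return the saved payload, or None when there is nothing to resume from."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def train(batches, path, step_fn, model_state, optimizer_state,
          save_every=2, resume=True):
    """Run step_fn over batches, checkpointing every `save_every` batches.

    With resume=True and an existing checkpoint, both states are restored
    (the role of a resume_optimizer-style flag) and batches up to and
    including the saved batch_idx are skipped.
    """
    ckpt = load_checkpoint(path) if resume else None
    start = 0
    if ckpt is not None:
        model_state = ckpt["model_state"]
        optimizer_state = ckpt["optimizer_state"]
        start = ckpt["batch_idx"] + 1

    for batch_idx, batch in enumerate(batches):
        if batch_idx < start:
            continue  # already consumed before the interruption
        model_state, optimizer_state = step_fn(batch, model_state, optimizer_state)
        if (batch_idx + 1) % save_every == 0:
            save_checkpoint(path, batch_idx, model_state, optimizer_state)
    return model_state, optimizer_state
```

Batches processed after the last checkpoint but before the interruption are simply redone on resume, which is why the batch index and both states have to be saved in the same file.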
Thank you for raising this... I'll be closing the issue here since this isn't directly RAGatouille related and I'm managing bug fixes via issues here.
I'm not familiar with the OG training runs of ColBERT and the reasons why (if any, other than time) resuming isn't supported right now. Sorry about that!