Closed
Labels
stale (No recent activity)
Description
My model was training successfully and generated checkpoint files:
ckpt-1.data-00000-of-00001
ckpt-1.index
ckpt-2.data-00000-of-00001
ckpt-2.index
ckpt-3.data-00000-of-00001
ckpt-3.index
......
Since I use Google Colab, I cannot finish the training in a single session. So before interrupting, I downloaded the latest checkpoint files, say,
ckpt-8.data-00000-of-00001
ckpt-8.index
Then I uploaded them at the start of the next session. However, training did not resume from the checkpoint; it started from 0 again (and generated checkpoints 1, 2, ... again). I have already edited the value of fine_tune_checkpoint in pipeline.config. Some issues on the Internet say that it actually does train from the checkpoint even though the step counter starts from 0. But this raises another question: if it starts from 0 every time, the training will be endless. Does anyone know the standard method of continuing the training?
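One thing worth checking: TensorFlow finds the "latest" checkpoint through a small text file named `checkpoint` in the model directory, not through the `ckpt-8.data-*`/`ckpt-8.index` files alone. If only the data and index files are uploaded, that state file is missing and `tf.train.latest_checkpoint()` returns nothing, so training restarts from scratch. A minimal sketch of recreating it (the `training` directory name and `ckpt-8` are assumptions based on the files listed above):

```python
import os

# Hypothetical model directory holding ckpt-8.data-00000-of-00001
# and ckpt-8.index (adjust to your own path).
model_dir = "training"
os.makedirs(model_dir, exist_ok=True)

# Recreate the checkpoint state file that TensorFlow reads to decide
# which checkpoint is the most recent one.
state_path = os.path.join(model_dir, "checkpoint")
with open(state_path, "w") as f:
    f.write('model_checkpoint_path: "ckpt-8"\n')
    f.write('all_model_checkpoint_paths: "ckpt-8"\n')

print(open(state_path).read())
```

With this file in place, the training script should pick up `ckpt-8` from the model directory on the next run. (Alternatively, downloading and re-uploading the `checkpoint` file together with the data/index files avoids the issue entirely.)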