Where to download the pretrained model? #31
Comments
Checkpoint files are saved to the
@rafaelvalle
This is probably related to your CPU not being able to load the data.
@rafaelvalle
@OptimusPrimeCao The model uses approximately 300 MB per sample. Try reducing your batch size to 16.
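For a rough sense of scale, the 300 MB per sample figure above can be turned into a back-of-the-envelope estimate. This is only a minimal sketch: it assumes memory grows linearly with batch size and ignores fixed overhead such as weights, optimizer state, and framework buffers. The function names and the 8000 MB budget are made up for illustration and are not part of the repository.

```python
def estimated_memory_mb(batch_size: int, per_sample_mb: float = 300.0) -> float:
    """Estimate the per-batch footprint, assuming linear growth per sample."""
    if batch_size < 0 or per_sample_mb <= 0:
        raise ValueError("batch_size must be >= 0 and per_sample_mb must be > 0")
    return batch_size * per_sample_mb


def max_batch_size(memory_budget_mb: float, per_sample_mb: float = 300.0) -> int:
    """Largest batch size whose estimated footprint fits in the budget."""
    if memory_budget_mb < 0 or per_sample_mb <= 0:
        raise ValueError("memory_budget_mb must be >= 0 and per_sample_mb must be > 0")
    return int(memory_budget_mb // per_sample_mb)


# Batch size 16 at roughly 300 MB per sample.
print(estimated_memory_mb(16))  # 4800.0

# Hypothetical budget of 8000 MB left for activations (an assumption).
print(max_batch_size(8000))  # 26
```

In practice the real limit will be lower than this estimate, so treat the result as an upper bound and step down from there if you still run out of memory.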
@MXGray We can share the hparams.
Please share the hparams.
Is there a way to continue training my model from a particular point? To be specific, my training crashed at checkpoint_32000 because of a memory issue, which I have since fixed. Can I resume training from this point, or do I have to start again from the beginning? If I can resume, how do I do it? I wasn't sure whether this warranted a new issue, so I'm posting my comment here. Any help is appreciated. Thanks!
Never mind, figured it out:
python train.py --output_directory=outdir --log_directory=logdir --checkpoint_path='outdir/checkpoint_32500'
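If you are not sure which checkpoint is the most recent, a small helper can pick it and assemble the same command. This is a minimal sketch, assuming checkpoints are plain files named checkpoint_<iteration> inside the output directory, as in the command above. `latest_checkpoint` and `resume_command` are hypothetical helpers written for this comment, not part of train.py.

```python
import re
import shlex
from pathlib import Path
from typing import List, Optional

# Assumed naming scheme: checkpoint_<iteration>, e.g. checkpoint_32500.
CHECKPOINT_RE = re.compile(r"^checkpoint_(\d+)$")


def latest_checkpoint(output_directory: str) -> Optional[Path]:
    """Return the checkpoint file with the highest iteration number, if any."""
    directory = Path(output_directory)
    if not directory.is_dir():
        return None
    best: Optional[Path] = None
    best_iteration = -1
    for path in directory.iterdir():
        match = CHECKPOINT_RE.match(path.name)
        if match is None or not path.is_file():
            continue
        # Compare numerically: as strings, "checkpoint_9000" sorts after "checkpoint_32500".
        iteration = int(match.group(1))
        if iteration > best_iteration:
            best_iteration = iteration
            best = path
    return best


def resume_command(output_directory: str, log_directory: str) -> List[str]:
    """Build the train.py argument list, adding --checkpoint_path when one exists."""
    args = [
        "python",
        "train.py",
        f"--output_directory={output_directory}",
        f"--log_directory={log_directory}",
    ]
    checkpoint = latest_checkpoint(output_directory)
    if checkpoint is not None:
        args.append(f"--checkpoint_path={checkpoint}")
    return args


# Print the command for the directories used in the comment above.
print(shlex.join(resume_command("outdir", "logdir")))
```

When no checkpoint is found, the command is built without `--checkpoint_path`, which simply starts training from scratch.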
A pre-trained model has been made available on our README page.
@vijaysumaravi Mine also stopped after 32.5 epochs. Did you figure out the reason?
Reducing my batch size helped. I was training it on a single GPU.
Tutorial: Training on GPU with Colab, Inference with CPU on Server here.
Is there a way to use checkpoint_15500 in the inference file?