Unable to load the trained checkpoints #18

Open
kanika02 opened this issue Nov 5, 2021 · 11 comments
Comments

@kanika02

kanika02 commented Nov 5, 2021

Hey, can you help me with the trained checkpoints? I am not able to load checkpoint.pth.
What is the command to resume from a checkpoint? Please help me solve this issue.

@JingyeChen
Member

Please see Readme.md:
CUDA_VISIBLE_DEVICES=GPU_NUM python main.py --batch_size=16 --STN --exp_name EXP_NAME --text_focus [--resume YOUR_MODEL] --test --test_data_dir ./dataset/mydata/test

Or you can fill in the absolute path of the pre-trained model in the argparser of main.py.

Hope this advice will help you out.
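
For readers following along, the two options above (passing `--resume` on the command line, or hard-coding the path as the argument's default) can be sketched as follows. This is only an illustration: the flag names are copied from the command above, but the types, defaults, and the helper `build_parser` are assumptions and may not match how main.py actually defines its arguments. The checkpoint path is a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names come from the command in this thread; the types and
    # defaults below are assumptions, not the repository's definitions.
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--STN", action="store_true")
    parser.add_argument("--text_focus", action="store_true")
    parser.add_argument("--exp_name", type=str, default="EXP_NAME")
    parser.add_argument("--test", action="store_true")
    parser.add_argument("--test_data_dir", type=str, default="./dataset/mydata/test")
    # Option 1: pass --resume on the command line.
    # Option 2: replace the empty default with an absolute checkpoint path.
    parser.add_argument("--resume", type=str, default="")
    return parser


# Hypothetical path, used only to show that the flag reaches the script.
args = build_parser().parse_args(
    ["--STN", "--text_focus", "--exp_name", "f2", "--resume", "/abs/path/checkpoint.pth"]
)
print(args.resume)  # the path the script would try to load
```

Printing `args.resume` right after parsing is a quick way to confirm that the path actually reaches the script before any loading code runs.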

@kanika02
Author

kanika02 commented Nov 8, 2021

My training run got interrupted, and I want to resume it from the last completed epoch so that I don't have to start again from the first epoch (that would take a lot of time). Please help me resume it. I tried, but it always clears the old logs and starts from the first epoch.

@JingyeChen
Member

Hello, if you want to resume from the checkpoint, you could modify the name of the experiment to solve this problem.

@kanika02
Author

!python main.py --batch_size=16 --STN --exp_name f2 --text_focus --resume /content/drive/MyDrive/scene-text-telescope/checkpoint/f1/checkpoint.pth

I used this command (with the experiment name changed), but training still starts from the first epoch.
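
As general background for anyone debugging this: loading a checkpoint only continues from the interrupted epoch if the checkpoint stores the epoch counter and the training loop reads it back. Whether main.py does this is not confirmed anywhere in this thread. The sketch below is framework-agnostic and uses `pickle` as a stand-in (in PyTorch, `torch.save` and `torch.load` would play that role); the functions `save_checkpoint`, `load_checkpoint`, and `train` are hypothetical and are not taken from the repository.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, epoch, state):
    # Store the last completed epoch next to the training state.
    with open(path, "wb") as f:
        pickle.dump({"epoch": epoch, "state": state}, f)


def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)


def train(total_epochs, ckpt_path, resume=None, stop_after=None):
    start_epoch, state = 0, {"steps": 0}
    if resume and os.path.isfile(resume):
        ckpt = load_checkpoint(resume)
        state = ckpt["state"]
        # Without this line the loop would restart at epoch 0
        # even though the saved state was loaded.
        start_epoch = ckpt["epoch"] + 1
    ran = []
    for epoch in range(start_epoch, total_epochs):
        state["steps"] += 1  # stand-in for one epoch of training
        ran.append(epoch)
        save_checkpoint(ckpt_path, epoch, state)
        if stop_after is not None and epoch == stop_after:
            break  # simulate an interruption
    return ran


ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
first_run = train(5, ckpt, stop_after=2)  # interrupted after epoch 2
second_run = train(5, ckpt, resume=ckpt)  # continues at epoch 3
print(first_run, second_run)
```

If a checkpoint file contains only model weights and no epoch entry, a loop like this has nothing to resume from, so inspecting what the saved file actually holds is a reasonable first check.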

@JingyeChen
Member

It is quite weird... So have you tried to directly modify the resume path in main.py?

@kanika02
Author

Yes, I have modified main.py, but it is still not working. Training always starts from the first epoch, and I have tried every approach. Please help, because I am not able to train the model for a large number of epochs.

@JingyeChen
Member

In fact, it works in my environment...
So have you modified the code or made some big changes?

@kanika02
Author

image

My training got interrupted here. I then tried to resume from this point, but it starts from the beginning, like this:
image

I tried everything you suggested, but it is still not working.
Could you share your environment so that I can try it there?

@kanika02
Author

> In fact, it works in my environment ... so have you modified the code or made some big changes?

No, I have not made any changes.

@Uxasxie

Uxasxie commented Sep 30, 2022

I have the same problem. Did you solve it? I'm so frustrated.

@cptbtptp125

Hello, could you tell me how to solve this problem? I have run into the same situation. Resume doesn't work at all.
