Model training stops after validation after 4000 iterations #27
Comments
There isn't enough error information here for me to tell what went wrong. Could you post the entire log?
Thank you very much for the prompt reply. I am sharing the entire error log as you asked:
I ran into this problem too and solved it by installing mmcv=1.3.9, as specified in the README. The problem occurred with mmcv 1.3.12. I couldn't pin down the exact cause, but the lower version of mmcv works for me.
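For anyone who wants to confirm which version is installed before retraining, here is a minimal sketch. It assumes the package is installed under the distribution name `mmcv-full` or `mmcv`; the helper names are hypothetical and not part of SoftTeacher or mmcv.

```python
from importlib import metadata

# Version reported to work in this thread (1.3.12 was reported to fail).
EXPECTED_MMCV = "1.3.9"


def installed_mmcv_version():
    """Return the installed mmcv version string, or None if it is not installed.

    Assumption: the package is registered as either "mmcv-full" or "mmcv".
    """
    for dist in ("mmcv-full", "mmcv"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


def matches_expected(installed, expected=EXPECTED_MMCV):
    """Return True only if the installed version string equals the expected one."""
    return installed == expected


version = installed_mmcv_version()
if version is None:
    print("mmcv is not installed")
elif not matches_expected(version):
    print(f"mmcv {version} found; this thread reports {EXPECTED_MMCV} working")
```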
By default, the weights are stored in |
@MendelXu thank you very much for the prompt reply. In the |
No. The weights are saved every 4000 iterations; see SoftTeacher/configs/soft_teacher/base.py, line 265 in 863d90a.
You can change the interval to a smaller number, such as 50, to check whether a checkpoint still fails to be saved.
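As a rough illustration of that suggestion: the referenced line of base.py is not quoted in this thread, so the sketch below assumes an MMCV-style `checkpoint_config` dict with `by_epoch` and `interval` keys. The helper `checkpoint_iterations` is hypothetical and only shows when checkpoints would be expected.

```python
# Assumed MMCV-style checkpoint setting. The actual line in base.py is not
# quoted in this thread, so these key names are an assumption.
checkpoint_config = dict(by_epoch=False, interval=50)


def checkpoint_iterations(max_iters, interval):
    """Return the iterations at which a checkpoint would be written."""
    return list(range(interval, max_iters + 1, interval))


# With interval=50, checkpoints should appear long before iteration 4000,
# which is where the reported crash occurs.
print(checkpoint_iterations(200, checkpoint_config["interval"]))
# prints [50, 100, 150, 200]
```

If files show up at these iterations, checkpoint saving itself works and the failure at iteration 4000 is more likely tied to the validation step.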
@MendelXu Thank you very much. The issue is solved.
I am encountering another issue. Whenever I am trying to train using full labelled data setting using |
@purbayankar you can download it from http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip and put the extracted json file in |
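A minimal sketch of the download step, using only the Python standard library. The target directory is cut off in the comment above, so `dest_dir` is left as a required argument rather than guessed; both function names are hypothetical.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL quoted in the comment above.
ANNOTATION_URL = "http://images.cocodataset.org/annotations/image_info_unlabeled2017.zip"


def extract_json_files(zip_path, dest_dir):
    """Extract every .json member of the archive into dest_dir.

    The archive's internal folder structure is preserved; the returned list
    holds the paths of the extracted files.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.endswith(".json"):
                extracted.append(Path(zf.extract(name, dest)))
    return extracted


def download_and_extract(dest_dir, url=ANNOTATION_URL):
    """Download the archive to a temporary file, then extract its JSON files.

    dest_dir must be supplied by the caller: the directory the project
    expects is not visible in this thread.
    """
    zip_path, _ = urllib.request.urlretrieve(url)
    return extract_json_files(zip_path, dest_dir)
```

Depending on where the config expects the file, the extracted JSON may need to be moved out of the archive's own subfolder.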
If I train model on the full labeled data setting, is it necessary for me to execute |
@duany049 It is still necessary.
If I added more data, but don't execute |
"I still train model with old data which is mall ?" Do you want to say 'small'? For different portions of data, we provide different config files. So if you are training with the related config file (like https://github.com/microsoft/SoftTeacher/blob/main/configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py), it should raise errors like @psvnlsaikumar. However, this is all about the COCO dataset. If you want to add external data, you have to edit the config file and change the dataset settings. |
Thank you for your reply.
All the script does is 1) prepare the data split for the partial setting on COCO 2) Convert |
Thank you for answering my question.
After 4000 training iterations, validation runs, and then training stops with the following error:
I am training with 2 GPUs. Do you have any insight into why this error is being thrown?