Got high loss after restoring parameters from ckpt #2952
Comments
Also happened to me. Please help with this issue. It makes it impossible to work with the SSD meta-model if I can't use transfer learning properly.
@drpngx @derekjchow. Are you looking into this? Thank you.
I also had the same issue when training an SSD model.
@derekjchow knows best. What optimizer are you using?
I tried Adam and regular momentum.
It seems the variables are restored from the pre-trained model, but the program is not using them.
Also happened to me. I took the pretrained model ssd_mobilenet_v2_coco and tried to continue training on the COCO dataset. The first few losses are 300+; I expected less than 10. I did not change anything in ssd_mobilenet_v2_coco_2018_03_29.config.
Same here with a Mask R-CNN Inception V2 model pretrained on a custom data set.
+1 with Faster R-CNN Inception ResNet v2 atrous COCO on a custom data set.
+1 with SSD Inception v2 COCO on a custom data set.
Same as dcarnino. Happened on multiple occasions.
+1 with SphereFace. Update: there is a possible solution for people who have this problem: make sure your labels are the same on every run, especially if you use array indices as labels. This usually happens when you use…
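The comment above can be illustrated with a minimal sketch (the `build_label_map` helper is hypothetical, not part of any of the libraries discussed here): if class indices are derived from an unsorted listing, the index assigned to a class can differ between runs, so a restored checkpoint's output layer no longer lines up with the labels and the loss suddenly jumps.

```python
def build_label_map(class_names):
    """Assign class indices deterministically by sorting the names first.

    If indices instead follow e.g. os.listdir() order, which is not
    guaranteed to be stable across runs or machines, the same class can
    get a different index next time, and the restored weights no longer
    match the labels the loss is computed against.
    """
    return {name: idx for idx, name in enumerate(sorted(class_names))}

# The mapping is identical no matter how the names are enumerated:
print(build_label_map(["dog", "cat", "bird"]))  # {'bird': 0, 'cat': 1, 'dog': 2}
print(build_label_map(["bird", "dog", "cat"]))  # {'bird': 0, 'cat': 1, 'dog': 2}
```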
Closing as this is resolved |
Please go to Stack Overflow for help and support:
http://stackoverflow.com/questions/tagged/tensorflow
Also, please understand that many of the models included in this repository are experimental and research-style code. If you open a GitHub issue, here is our policy:
Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.
System information
You can collect some of this information using our environment capture script:
https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh
You can obtain the TensorFlow version with
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
Describe the problem
Hi everyone:
I ran into an issue when training (fine-tuning) a TensorFlow Object Detection API model.
I restored parameters from fine-tuning-ckpt (saving new models to another directory, e.g. TRAIN_DIR), and after fine-tuning with a large learning rate the loss decreased as I expected. Once it got fairly low (e.g. 2.0), it was time to use a smaller learning rate, so I moved the saved parameters from TRAIN_DIR to fine-tuning-ckpt and removed all files under TRAIN_DIR. When I re-ran the program I expected the loss to still be low (about 2.0), but got 300+. I tried several times and got the same result every time. My guess is that the program did not use the saved parameters (or did not restore from TRAIN_DIR), so I printed variables_to_restore, but all the variable names were listed there. Does anybody have any idea, or the same problem?
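One way to check whether the restore actually takes effect is to compare the variable names in the graph against the names stored in the checkpoint: a silent name mismatch means nothing is restored and training effectively starts from random weights. A minimal sketch, assuming a hypothetical `unrestored_variables` helper (the commented-out calls are the real TF 1.x APIs you would use to obtain the two name lists):

```python
def unrestored_variables(graph_var_names, ckpt_var_names):
    """Return graph variable names with no counterpart in the checkpoint."""
    return sorted(set(graph_var_names) - set(ckpt_var_names))

# In a real run the two lists would come from, e.g.:
#   ckpt_var_names  = [name for name, _ in tf.train.list_variables(ckpt_path)]
#   graph_var_names = [v.op.name for v in tf.global_variables()]
graph_vars = ["FeatureExtractor/Conv/weights", "BoxPredictor/Conv/weights"]
ckpt_vars = ["FeatureExtractor/Conv/weights"]
print(unrestored_variables(graph_vars, ckpt_vars))  # ['BoxPredictor/Conv/weights']
```

If this set is non-empty for variables you expected to restore, the checkpoint and graph use different name scopes, which would explain the high initial loss.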
Best Regards
Source code / logs