train on youtube-vos #2

Closed
colorblank opened this issue Nov 20, 2021 · 3 comments

Comments

@colorblank

colorblank commented Nov 20, 2021

Great work! Thanks for sharing your code!

I used the default ytvos training configuration to train the network,
but the best performance I got was J&F = 65.5 at epoch 490.
Is the default configuration expected to give this result?
How can I reach the performance of the checkpoint you provided?
I'd appreciate it if you could point out what I did wrong.
image

@arnike
Collaborator

arnike commented Nov 20, 2021

Hi @colorblank,
thanks! Are you referring to "0.655" in the snapshot name, or did you actually use the infer_vos.sh script to evaluate on the whole validation set? Please note that during training we select the snapshot to save based on only 5 sequences from DAVIS-2017, so the accuracy on the complete validation set will be different and tends to be higher.
Best,
Nikita

@colorblank
Author

My bad. I ran the test code and got the following results:

J&F-Mean   J-Mean  J-Recall  J-Decay   F-Mean  F-Recall  F-Decay
 0.680324 0.658446  0.789883 0.182041 0.702202  0.835232 0.258544

Thanks for your clarification.
Besides, there is a small bug in this line:

model = nn.DataParallel(model).cuda()

You should change it to single-GPU mode, e.g. "model = model.cuda()".

@arnike
Collaborator

arnike commented Nov 22, 2021

Thanks for the pointer about the inference code.

There may be some deviation in accuracy between training runs, although your results are well below what I typically observed. You can try simply re-running the training and/or decreasing the interval between validation runs (check LOG.ITER_VAL in the YAML config).

I've now also experimented with decaying the learning rate for training on YouTube-VOS to make it more reproducible. My only change to the default setup was in config/ytvos.yaml:

MODEL.LR_SCHEDULER: "step"
MODEL.LR_GAMMA: 0.5
MODEL.LR_STEP: 100
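
To illustrate what these three settings imply, here is a minimal sketch of a step schedule. It assumes "step" behaves like a standard step decay (as in PyTorch's StepLR), that LR_STEP is counted in epochs, and it uses a made-up base learning rate; none of this is confirmed by the repository code, and the function name is hypothetical.

```python
def step_decay_lr(base_lr: float, epoch: int, step: int = 100, gamma: float = 0.5) -> float:
    """Learning rate at `epoch` under step decay.

    Assumption: the rate is multiplied by `gamma` once every `step` epochs,
    mirroring MODEL.LR_GAMMA = 0.5 and MODEL.LR_STEP = 100 above.
    """
    return base_lr * gamma ** (epoch // step)


# Hypothetical base learning rate, chosen only for illustration.
base_lr = 1e-3
for epoch in (0, 99, 100, 250, 330, 490):
    print(f"epoch {epoch:3d}: lr = {step_decay_lr(base_lr, epoch):.6g}")
```

Under these assumptions the learning rate is halved at epochs 100, 200, 300 and so on, so by epoch 330 it would be one eighth of the starting value.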

This is the result I got on the complete DAVIS-2017 val:

--------------------------- Global results for val ---------------------------
 J&F-Mean    J-Mean  J-Recall   J-Decay    F-Mean  F-Recall   F-Decay
 0.694455  0.672305  0.818764  0.164835  0.716605  0.843103  0.239955

(For this snapshot, the maximum on the validation subset we use during training was 64.5% J&F, reached at epoch 330.)
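
For reference, the J&F-Mean column is the average of J-Mean and F-Mean, which makes the two full-validation results in this thread easy to compare. The short sketch below only re-checks the numbers posted above; the run labels and the helper name are mine, not something the evaluation script prints.

```python
def jf_mean(j_mean: float, f_mean: float) -> float:
    """J&F-Mean as the average of the region (J) and boundary (F) means."""
    return (j_mean + f_mean) / 2.0


# (J-Mean, F-Mean, reported J&F-Mean) copied from the two result tables in this thread.
# The dictionary keys are illustrative labels only.
runs = {
    "default config": (0.658446, 0.702202, 0.680324),
    "step LR decay": (0.672305, 0.716605, 0.694455),
}

for name, (j, f, reported) in runs.items():
    computed = jf_mean(j, f)
    print(f"{name}: computed {computed:.6f}, reported {reported:.6f}")

# Difference in J&F-Mean between the two runs.
gain = runs["step LR decay"][2] - runs["default config"][2]
print(f"difference: {gain:.6f}")
```

The computed averages agree with the reported J&F-Mean values, and the run with learning-rate decay is about 1.4 points higher on the complete validation set.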
Let me know if that works for you.
Best regards,
Nikita

@arnike arnike closed this as completed in 88e5e35 Dec 1, 2021