train on youtube-vos #2

colorblank · 2021-11-20T11:53:47Z

Great work! Thanks for sharing your code!

I used the default ytvos training configuration to train the network,
but the best performance I got was J&F=65.5 at epoch 490.
Is the default configuration supposed to give this performance?
How can I reach the performance of the checkpoint you provided?
I'd appreciate it if you could point out what I did wrong.

arnike · 2021-11-20T13:08:10Z

Hi @colorblank,
thanks! Are you referring to "0.655" in the snapshot name, or did you actually use the infer_vos.sh script to evaluate on the whole validation set? Please note that during training we save the snapshot based on only 5 sequences from DAVIS-2017, so the accuracy on the complete validation set will be different and tends to be higher.
Best,
Nikita

colorblank · 2021-11-20T14:31:46Z

My bad. I ran the test code and got the following results:

 J&F-Mean    J-Mean  J-Recall   J-Decay    F-Mean  F-Recall   F-Decay
 0.680324  0.658446  0.789883  0.182041  0.702202  0.835232  0.258544

Thanks for your clarification.
Besides, there is a small bug in dense-ulearn-vos/infer_vos.py (line 327 in 9e39d35):

model = nn.DataParallel(model).cuda()

You should change it to single-GPU mode, e.g. "model = model.cuda()".

arnike · 2021-11-22T10:18:16Z

Thanks for the pointer about the inference code.

Accuracy may deviate somewhat between training runs, although your results are well below what I typically observed. You can try simply re-running the training and/or decreasing the interval between validation runs (check LOG.ITER_VAL in the YAML config).

I've now also experimented with decaying the learning rate for training on YouTube-VOS to make it more reproducible. My only change to the default setup was in config/ytvos.yaml:

MODEL.LR_SCHEDULER: "step"
MODEL.LR_GAMMA: 0.5
MODEL.LR_STEP: 100
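
For readers unfamiliar with step decay, here is a minimal sketch of what these settings would do, assuming the "step" scheduler follows the usual rule lr = base_lr * gamma^(epoch // step), as in PyTorch's StepLR. The thread does not say whether LR_STEP counts epochs or iterations, nor what the base learning rate is, so the function name `step_lr` and the value of `base_lr` below are purely illustrative.

```python
def step_lr(base_lr: float, epoch: int, step: int = 100, gamma: float = 0.5) -> float:
    """Learning rate at `epoch` under step decay.

    Assumes the rule lr = base_lr * gamma ** (epoch // step); this is the
    common convention, not something confirmed for this repository.
    """
    return base_lr * gamma ** (epoch // step)


if __name__ == "__main__":
    base_lr = 1e-3  # hypothetical value, not taken from config/ytvos.yaml
    for epoch in (0, 99, 100, 250, 330, 490):
        print(f"epoch {epoch:3d}: lr = {step_lr(base_lr, epoch):.6g}")
```

Under this assumption the learning rate is halved every 100 epochs, so by epoch 330 it would be one eighth of its initial value.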

This is the result I got on the complete DAVIS-2017 val:

--------------------------- Global results for val ---------------------------
 J&F-Mean    J-Mean  J-Recall   J-Decay    F-Mean  F-Recall   F-Decay
 0.694455  0.672305  0.818764  0.164835  0.716605  0.843103  0.239955
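
As a quick sanity check when comparing such tables: in the DAVIS evaluation, J&F-Mean is the average of J-Mean and F-Mean. The short sketch below recomputes it for the two results reported in this thread; the helper name `jf_mean` and the run labels are illustrative only.

```python
def jf_mean(j_mean: float, f_mean: float) -> float:
    """J&F-Mean as the average of region similarity (J) and contour accuracy (F)."""
    return (j_mean + f_mean) / 2.0


# (J-Mean, F-Mean, reported J&F-Mean), copied from the two tables in this thread.
runs = {
    "default config": (0.658446, 0.702202, 0.680324),
    "step LR decay": (0.672305, 0.716605, 0.694455),
}

if __name__ == "__main__":
    for name, (j, f, reported) in runs.items():
        computed = jf_mean(j, f)
        # The reported value should match the recomputed one up to rounding.
        assert abs(computed - reported) < 1e-6
        print(f"{name}: J&F-Mean = {computed:.6f}")
```

The difference between the two runs is about 1.4 points of J&F-Mean.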

(The maximum on the validation subset we use in training for this snapshot was 64.5% J&F reached at epoch 330.)
Let me know if that works for you.
Best regards,
Nikita

arnike closed this as completed in 88e5e35 Dec 1, 2021
