The model is very sensitive to the batch size #15

Open
nebuladream opened this issue Apr 21, 2020 · 2 comments

@nebuladream

We trained the model on 4 GPUs with several batch sizes, and the validation results change substantially with the batch size: with batch=128 we get all_recall=286; with batch=256, all_recall=268; with batch=1280, all_recall=240.
We also trained the model on 1 GPU: with batch=128, all_recall=295; with batch=256, all_recall=285.
We tried different learning rates, but that seems to have no effect on the degradation.
Do you observe similar results?

@danieljf24
Owner

danieljf24 commented Apr 23, 2020

Sorry, I have only trained the model on 1 GPU. The results you posted are interesting. I think the sensitivity may be caused by the triplet loss with hard example mining.
Additionally, I am wondering why the all_recall you posted is so high; I only obtained an all_recall of about 150.
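
For context, here is a minimal sketch of a max-margin triplet loss with in-batch hard-negative mining, in the VSE++ style that retrieval models like this commonly use (the function and variable names are illustrative, not the repository's actual code). Because the hardest negative is mined from within the batch, a larger batch gives a larger candidate pool and therefore systematically harder negatives:

```python
import torch

def hard_negative_triplet_loss(sim, margin=0.2):
    """Max-margin triplet loss with in-batch hard negatives.

    sim: (B, B) similarity matrix between B videos and B captions,
         where sim[i, i] is the score of the matching pair.
    Illustrative sketch, not the repository's exact implementation.
    """
    B = sim.size(0)
    pos = sim.diag().view(B, 1)

    # Margin violations for both retrieval directions.
    cost_c = (margin + sim - pos).clamp(min=0)       # caption retrieval
    cost_v = (margin + sim - pos.t()).clamp(min=0)   # video retrieval

    # Mask out the positive pairs on the diagonal.
    mask = torch.eye(B, dtype=torch.bool, device=sim.device)
    cost_c = cost_c.masked_fill(mask, 0)
    cost_v = cost_v.masked_fill(mask, 0)

    # Hard-negative mining: keep only the single hardest negative
    # per anchor. With a larger batch, the max runs over more
    # candidates, so the mined negatives get harder.
    return cost_c.max(dim=1)[0].sum() + cost_v.max(dim=0)[0].sum()
```

Under this loss, batch=1280 draws each hard negative from ten times as many candidates as batch=128, which is one plausible mechanism for validation recall drifting with batch size.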

@nebuladream
Author

> Sorry, I have only trained the model on 1 GPU. The results you posted are interesting. I think the sensitivity may be caused by the triplet loss with hard example mining.
> Additionally, I am wondering why the all_recall you posted is so high; I only obtained an all_recall of about 150.

It may be because we report all_recall summed over both retrieval directions; more details below:
Text to video:
r_1_5_10: [20.433, 47.042, 57.455]
medr, meanr: [7.0, 37.884]
Video to text:
r_1_5_10: [32.998, 62.777, 74.245]
medr, meanr: [3.0, 18.048]
best sum recall: 294.9496981891348
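
That is, all_recall here is R@1 + R@5 + R@10 summed over both directions: 20.433 + 47.042 + 57.455 + 32.998 + 62.777 + 74.245 ≈ 294.95, matching the "best sum recall" above. A minimal sketch of that metric, assuming a similarity matrix with the matching pairs on the diagonal (names are illustrative):

```python
import numpy as np

def recall_at_k(sim, ks=(1, 5, 10)):
    """R@k for retrieving item i's match from sim (N, N),
    with ground-truth pairs on the diagonal."""
    # Rank of the correct match for each query (0 = best).
    order = np.argsort(-sim, axis=1)
    ranks = np.array([np.where(order[i] == i)[0][0] for i in range(len(sim))])
    return [100.0 * np.mean(ranks < k) for k in ks]

def sum_recall(sim):
    """Sum of R@1/5/10 over text->video and video->text."""
    t2v = recall_at_k(sim)       # rows as text queries
    v2t = recall_at_k(sim.T)     # rows as video queries
    return sum(t2v) + sum(v2t)
```

With this convention the maximum is 600; if the ~150 above counts only one direction, the gap from ~295 would be largely a matter of reporting convention rather than model quality.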
