I am wondering what mini-batch size you are using in practice for your pre-trained model. Your paper says 10, but your code sets it to 1, while both use the same number of iterations. The two settings should give different results.
The reason I ask is that your paper mentions training takes just 7 hours. I also use a Tesla K40c, but training takes 1 minute for 20 iterations (with mini-batch size 1). At this speed, it would need 4 days to finish 10,000 iterations.
It is still running.
Could you help me figure it out?
We updated our code base to a newer version of Caffe and use mini-batch size 1 with full-resolution images; this way we are able to boost performance to 0.790 ODS. If you resize the images to 400x400 as in the paper and set the mini-batch size to 10, training will be much faster. From my experiment log, 7 hours is enough for 10,000 iterations in the previous setting.
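To make the difference between the two settings concrete, here is a quick back-of-the-envelope sketch (not part of the original code; the per-iteration relationship is just batch size times iterations):

```python
# Compare the workload of the two HED training settings discussed above.
# Each iteration processes one mini-batch, so the number of image
# presentations is batch_size * iterations.

def total_images_seen(batch_size, iterations):
    return batch_size * iterations

# Paper setting: images resized to 400x400, mini-batch size 10
paper_setting = total_images_seen(batch_size=10, iterations=10_000)

# Released-code setting: full-resolution images, mini-batch size 1
code_setting = total_images_seen(batch_size=1, iterations=10_000)

print(paper_setting)  # 100000
print(code_setting)   # 10000
```

So for the same iteration count, the paper setting pushes 10x as many (smaller) images through the network, while the released code trades throughput for full-resolution inputs.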
Hi Saining,
Thanks for your answer.
So if I understand correctly, you mean that using image size 400x400 with a mini-batch size of 10 will be much faster than using full size with a mini-batch size of 1?
Why? A larger mini-batch should slow down each iteration, right?
I still do not understand why my iterations are so slow. With the data you provided and the parameters set in the code (mini-batch size 1), one iteration takes 1 minute, so 10,000 minutes are needed in total. I am using a Tesla K40c. Do you have any idea about this?
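The timing arithmetic above can be sketched as follows (illustrative numbers from this thread only; actual speed depends on image size, GPU, and Caffe version):

```python
# Rough training-time estimate from per-iteration timing.

SECONDS_PER_DAY = 86_400

def training_days(seconds_per_iter, iterations=10_000):
    """Total wall-clock days for the given per-iteration cost."""
    return seconds_per_iter * iterations / SECONDS_PER_DAY

# At ~1 minute per iteration (full-resolution, mini-batch 1):
print(round(training_days(60), 1))  # 6.9 days

# For the paper's reported 7 hours over 10,000 iterations,
# one iteration must take only about:
print(round(7 * 3600 / 10_000, 2))  # 2.52 seconds
```

That gap (roughly 60 s versus 2.5 s per iteration) is what the resize-to-400x400 setting is meant to close.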
By the way, could you show me your loss plot? It is interesting that the loss is high and oscillates, yet the results already look visually good after a few thousand iterations.