
A question about pre-train model? #40

Open
Ouya-Bytes opened this issue Mar 4, 2017 · 15 comments

Comments

@Ouya-Bytes

Dear davheld:
I have a question about how you obtained the pre-trained model. Can you give some details? My questions are as follows:
(1) Which dataset (e.g. ILSVRC2014_DET or ILSVRC2014_CLS) did you use to pre-train the convolutional layers?
(2) If I train the network to regress the bounding box using only the ALOV300+ and ILSVRC2014_DET datasets, without pre-training the convolutional layers, will the tracking performance decrease?
(3) When you train the siamese network, do the two branches share layer parameters, or does each branch have independent parameters?
Looking forward to your reply, best wishes!

@davheld
Owner

davheld commented Mar 5, 2017

  1. I use the pre-trained CaffeNet architecture, which is available from Caffe:
    http://caffe.berkeleyvision.org/model_zoo.html

  2. I think it probably will decrease the tracking performance. If you do this, then I recommend using a much smaller architecture.

  3. As mentioned above, I do not pre-train the convolutional layers myself but I take the layers pre-trained from Caffe. The two sets of convolutional layers have identical weights.
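For reference, keeping pre-trained convolutional layers fixed in Caffe is done by setting lr_mult: 0 on both the weight and bias params of each layer. A hypothetical fragment is below; the layer name and convolution settings follow the standard CaffeNet conv1 and are for illustration only, not copied from tracker.prototxt:

```protobuf
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 0 decay_mult: 0 }  # weights: frozen, no updates
  param { lr_mult: 0 decay_mult: 0 }  # biases: frozen, no updates
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
```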

@Ouya-Bytes
Author

@davheld Following your code (tracker.prototxt and solver.prototxt, 500,000 iterations), I used train.cpp to train the network on the ILSVRC2014_DET and ALOV300+ datasets. The training loss does not converge; it ends up oscillating between roughly 20 and 50, so the tracking performance is very poor. Can you give me some advice? Thanks.

@davheld
Owner

davheld commented Mar 5, 2017

It sounds like you are overfitting. Just to be sure: I don't train the conv layers at all; those are pre-trained using CaffeNet.

@Ouya-Bytes
Author

Yes, I only use your code and prototxt files to re-train the network (I run train.cpp and keep the parameters in solver.prototxt and tracker.prototxt unchanged); I don't change anything else. The convolutional layers come from CaffeNet, and lr_mult is set to 0 so they are not updated.

@davheld
Owner

davheld commented Mar 5, 2017

How do you create the pre-trained network?

@Ouya-Bytes
Author

The pre-trained parameters come from the address you provided: http://cs.stanford.edu/people/davheld/public/GOTURN/weights_init/tracker_init.caffemodel. I don't change the prototxt; I only want to run train.cpp to produce tracker_iter_500000.caffemodel so that I can test the tracker.

@davheld
Owner

davheld commented Mar 5, 2017

That's odd, not sure.

@Jiangfeng-Xiong

I have the same problem. Changing val_ratio from 0.2 to 0 in loader/loader_alov.cpp may help, but the model I train myself still doesn't perform as well as the pre-trained model.

@Ouya-Bytes
Author

Is yours also not converging? Does it oscillate?

@davheld
Owner

davheld commented Mar 6, 2017

The oscillation is normal and simply occurs because the training evaluation is occurring on mini-batches which are randomly sampled at each iteration. However, the numbers that you listed seem lower than what I remember so I believe that you are overfitting, although I am not sure why.
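To illustrate the point about mini-batch sampling: even for a fixed model, the loss reported each iteration is an average over a small random mini-batch, so it fluctuates around the full-dataset loss. A toy sketch (the per-example losses and batch size below are made up, not from GOTURN):

```python
import random

random.seed(0)

# Hypothetical per-example losses for a fixed model (for illustration only).
per_example_loss = [random.uniform(10, 90) for _ in range(10_000)]

# The loss over the full dataset is one stable number...
full_loss = sum(per_example_loss) / len(per_example_loss)

# ...but the loss on each randomly sampled mini-batch varies.
batch_size = 50
minibatch_losses = []
for _ in range(20):
    batch = random.sample(per_example_loss, batch_size)
    minibatch_losses.append(sum(batch) / batch_size)

spread = max(minibatch_losses) - min(minibatch_losses)
print(f"full-data loss: {full_loss:.1f}")
print(f"mini-batch loss spread over 20 iterations: {spread:.1f}")
```

The spread shrinks as the batch size grows, which is why per-iteration training curves look noisy even when nothing is wrong.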

@Jiangfeng-Xiong

My training loss looks like this (see the attached train-loss graph): it ranges from 20 to 90. @OuYag

@davheld
Owner

davheld commented Mar 6, 2017

Try reducing the learning rate? The oscillations in that graph look fairly large, although if you are using the default learning rate, it is unusual that you would need to change it. Also, make sure that all convolutional layers are fixed (i.e. in both streams of the network).
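For concreteness, the learning rate lives in solver.prototxt as base_lr. A hypothetical fragment is below; these values are illustrative only, so check the repository's actual solver.prototxt rather than copying them:

```protobuf
# solver.prototxt fragment (hypothetical values, for illustration)
base_lr: 0.000001   # try lowering this, e.g. by a factor of 10
lr_policy: "step"   # multiply the rate by gamma every stepsize iterations
gamma: 0.1
stepsize: 100000
momentum: 0.9
max_iter: 500000
```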

@ujsyehao

ujsyehao commented Mar 15, 2017

@OuYag Regarding your statement "lr_mult is set 0 no change": I don't think that is right. Setting lr_mult to 0 means the layer gets no learning rate at all.
The Caffe documentation says that (with lr_mult: 1) the weight learning rate will be the same as the learning rate given by the solver during runtime.
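To make the semantics being discussed explicit: Caffe scales the solver's learning rate by each parameter's lr_mult, so lr_mult: 0 does freeze a layer, which is the intended behavior for the pre-trained conv layers here. A minimal sketch (the base_lr value is hypothetical):

```python
# Caffe computes a per-parameter learning rate as base_lr * lr_mult,
# where base_lr comes from solver.prototxt and lr_mult from the layer's
# param block.
def effective_lr(base_lr: float, lr_mult: float) -> float:
    return base_lr * lr_mult

base_lr = 1e-6  # hypothetical solver value, for illustration

# lr_mult: 1 -> the parameter trains at the solver's rate.
assert effective_lr(base_lr, 1.0) == base_lr
# lr_mult: 0 -> zero effective rate, i.e. the layer is frozen.
assert effective_lr(base_lr, 0.0) == 0.0
print("lr_mult: 0 means the layer's weights are never updated")
```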

@freescar

freescar commented Nov 1, 2017

@Jiangfeng-Xiong @OuYag Did you solve the issue? I have the same problem: the test loss stays between 10 and 20. I guess it is overfitting, but changing the learning rate or batch size does not reduce the loss.

@wendianwei

Hi, how do you evaluate the tracking performance?
