
Trials to reproduce the results in the paper using SGD #2

Open

lim0606 opened this issue Feb 22, 2016 · 14 comments

@lim0606

lim0606 commented Feb 22, 2016

Hi, this is Jaehyun Lim

I have a problem with learning rate scheduling when trying to reproduce the results in the paper using SGD (as in the paper) (see https://github.com/lim0606/caffe-posenet-googlenet).

As far as I understood from the paper, the model is trained for 100 epochs (or close to it), i.e. about 16 iterations per epoch for the King's College dataset; however, in my experience that was far too short for convergence with this learning rate schedule.

Therefore, I took the maximum number of iterations from the AdaGrad setup in this repo, i.e. 30,000, and it (fortunately) worked for the King's College dataset. I got results good enough compared to the record in the paper (actually better; see https://github.com/lim0606/caffe-posenet-googlenet).
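The epoch-to-iteration conversion under discussion can be sketched as follows (the King's College training-set size is my assumption; the batch size of 75 is from the paper):

```python
def iters_for_epochs(num_images, batch_size, epochs):
    """Convert an epoch budget into a Caffe solver iteration count.

    One epoch is one pass over the training set, so the number of
    iterations per epoch is the number of batches per pass.
    """
    iters_per_epoch = num_images // batch_size
    return iters_per_epoch * epochs

# King's College has ~1220 training frames; the paper uses batch size 75,
# giving ~16 iterations per epoch and ~1600 iterations for 100 epochs --
# far fewer than the ~30,000 iterations used in this repo.
total = iters_for_epochs(1220, 75, 100)
```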

However, I couldn't reproduce the results on the other datasets.

I would appreciate it if you could point out anything I might have missed or misunderstood from your paper.

Best regards,

Jaehyun

@alexgkendall
Copy link
Owner

Yes, typically the model converges after approximately 30,000 iterations.

How close were your results to the paper for the other datasets? Also, did you change the values of beta? If you look at Fig. 2 in the PoseNet paper, the model is quite dependent on a good choice of beta.
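For context, beta weights the rotational term of PoseNet's loss against the positional term. A minimal sketch of that loss (function name and NumPy formulation are mine, following the paper's equation):

```python
import numpy as np

def posenet_loss(x_pred, q_pred, x_true, q_true, beta):
    """Sketch of PoseNet's beta-weighted pose loss:
        L = ||x_true - x_pred||_2 + beta * ||q_true - q_pred/|q_pred|||_2
    A larger beta penalises rotational error more heavily relative
    to positional error, which is why it is tuned per scene.
    """
    x_pred, q_pred = np.asarray(x_pred, float), np.asarray(q_pred, float)
    x_true, q_true = np.asarray(x_true, float), np.asarray(q_true, float)
    pos_err = np.linalg.norm(x_true - x_pred)
    rot_err = np.linalg.norm(q_true - q_pred / np.linalg.norm(q_pred))
    return pos_err + beta * rot_err
```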

Cheers,
Alex

@lim0606
Copy link
Author

lim0606 commented Feb 23, 2016

@alexgkendall

Thank you for your reply!

  1. I used the models in the link (http://mi.eng.cam.ac.uk/~agk34/resources/PoseNet.zip); thus, the betas are set as follows:
    • King's College: 500
    • Old Hospital: 1000
    • Shop Facade: 100
    • St Mary's Church: 250
    • Street: 2000
  2. Results
    • King's College: Median error 1.88411343098 m and 2.33481308286 degrees
    • Old Hospital: Median error 2.78002953529 m and 2.94750986627 degrees
    • Shop Facade: Median error 2.2239086628 m and 4.27796134748 degrees
    • St Mary's Church: Median error 3.12559008598 m and 4.17480798865 degrees
    • Street: Median error 40.2640609741 m and 26.8703415628 degrees

Best regards,

Jaehyun

@alexgkendall
Copy link
Owner

Hi Jaehyun,

Sounds good to me then - are you also initialising your weights from a model pretrained on the Places dataset? (get the weights here: https://github.com/BVLC/caffe/wiki/Model-Zoo)

Alex

@lim0606
Copy link
Author

lim0606 commented Feb 23, 2016

Hi Alex,

Thank you for your reply :)

I did use the pretrained model for GoogLeNet.

Best regards,

Jaehyun Lim

@alykhantejani

Hi @lim0606 - did you manage to reproduce results with SGD?

Thanks,
Aly

@lim0606
Copy link
Author

lim0606 commented Mar 9, 2016

@alykhantejani

> Hi @lim0606 - did you manage to reproduce results with SGD?

Yes and no...

I managed it only for the King's College data (see https://github.com/lim0606/caffe-posenet-googlenet).

I think I would have to run a grid search to tune the hyperparameters for the other datasets, i.e. step size, max iterations, gamma (a parameter of the learning rate policy), and initial learning rate, but I haven't had time to do it...
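A grid search over those solver hyperparameters could be scripted as, for example (the value ranges below are placeholders, not recommendations):

```python
from itertools import product

# Hypothetical search space over the solver hyperparameters mentioned
# above: initial learning rate, step size, gamma, and max iterations.
grid = {
    "base_lr": [1e-5, 1e-4, 1e-3],
    "stepsize": [4000, 8000, 16000],
    "gamma": [0.1, 0.5, 0.9],
    "max_iter": [30000, 60000],
}

def solver_configs(grid):
    """Yield one solver-parameter dict per combination in the grid."""
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# Each dict could then be written into a solver.prototxt and trained.
configs = list(solver_configs(grid))
```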

@alykhantejani

@lim0606 Thanks for sharing the results!

@alykhantejani
Copy link

Hi @alexgkendall,

In the original paper it states:

> It was trained using stochastic gradient descent with a base learning rate of 10^-5, reduced by 90% every 80 epochs and with momentum of 0.9. Using one half of a dual-GPU card (NVidia Titan Black), training took an hour using a batch size of 75.

But the solver prototxt file (solver_posenet.prototxt) uses AdaGrad and a base learning rate of 1e-3 with no decay. Is there a reason for this - did you find it trained faster or gave better results? Is this how the provided pre-trained models were trained?
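For comparison, the paper's schedule would look roughly like the following Caffe solver fragment (a sketch only: the net path is a placeholder, and stepsize is in iterations, so "every 80 epochs" must be converted using dataset size divided by batch size):

# Sketch of a solver matching the paper's SGD description
net: "train_kingscollege.prototxt"   # placeholder path
base_lr: 1e-5
lr_policy: "step"
gamma: 0.1          # "reduced by 90%"
stepsize: 1300      # ~80 epochs at batch size 75 (dataset-dependent)
momentum: 0.9
solver_type: SGD
solver_mode: GPU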

Thanks,
Aly

@lidaweironaldo

Hi,

Has anyone managed to achieve the error reported in the paper for "Street"? I tried both the SGD and AdaGrad methods, but got errors about 10 times the value reported in the paper.

Best,
Dawei

@janosszabo

Hi @alexgkendall,
I also tried to train Street with both SGD and AdaGrad, but I also could not reproduce the accuracy reported in the article (4 m, as far as I remember). Mine was also around 20-30 m. But your pretrained caffemodel worked fine. Also, when I trained Shop Facade it learned all right.
Isn't it possible that something is missing from this repo, or that some prototxt file is outdated?
Thanks,
Janos

@janosszabo

Some more details:
I used train_street.prototxt from your models zip file, along with the LMDB database I made from the contents of Street.zip using the script included in posenet/scripts (create_posenet_lmdb_dataset.py), and I used posenet/models/solver_posenet.prototxt from the repo as the solver. I initialised the GoogLeNet weights with the model pretrained on Places, available here: http://vision.princeton.edu/pvt/GoogLeNet/.
The distance error was around 20-30 m when I tested the model after training, but your pretrained caffemodel worked as expected, giving an error of 3-4 m.
@alexgkendall do you know why this could happen?

@duongnamduong

duongnamduong commented May 16, 2017

Hi @alexgkendall,
I have a question: I tried to re-train the model using your "KingsCollege" dataset. I configured my solver as follows:

net: "./model1/train_val_kingscollege_googlenet.prototxt"
test_initialization: true
test_iter: 11
test_interval: 300
base_lr: 1e-5
lr_policy: "step"
gamma: 0.9
stepsize: 4000
display: 20
average_loss: 20
max_iter: 80000
solver_type: ADAGRAD
weight_decay: 0.0002
snapshot: 1000
snapshot_prefix: "./model1/snapshots/"
solver_mode: GPU

batch_size = 32.
I ran it, but the result does not converge. Could you help me understand why?
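One mismatch worth noting, based on the solver described earlier in this thread: the repo's solver_posenet.prototxt pairs ADAGRAD with a much higher base learning rate, whereas 1e-5 is the SGD-with-momentum rate from the paper. A sketch of the repo-style settings for comparison (per this thread, not verified against the current file):

# Repo-style solver fragment (as described earlier in this thread)
base_lr: 1e-3          # vs. 1e-5 used with SGD + momentum in the paper
solver_type: ADAGRAD   # no learning-rate decay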

@ming-c

ming-c commented Apr 20, 2018

@janosszabo Though it's been more than a year, could you please tell me whether your model for the "Street" scene converged to the PoseNet paper's result?

@DRAhmadFaraz

@ming-c @janosszabo @lim0606 @alexgkendall @lidaweironaldo

Can somebody please guide me on how to train this model on custom RGB images from scratch?

I will be thankful to you.
Regards


8 participants