Unable to train faster R-CNN on PASCAL VOC 2007 #128

Closed
gplhegde opened this issue Mar 26, 2016 · 15 comments

Comments

@gplhegde

I followed the exact steps from Readme.md but am unable to start training on the PASCAL VOC dataset.
I ran the command './experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc' and got the error "AssertionError: Selective search data not found". This error occurs because I do not have region proposals from selective search. However, from the paper I understand that Faster R-CNN does not require selective search data, since it trains an RPN internally. How do I set the proposal method to RPN instead of 'selective_search'? I am using the same configuration as in this repo. My end goal is to train Faster R-CNN on custom data, but I am stuck on training with PASCAL VOC itself.

I would appreciate any help or suggestions. Thanks.

@andrewliao11

You're right that faster rcnn doesn't require selective search data. Maybe you can show what you've done so far.
btw, if you just want to train faster rcnn on your dataset, you can check this out:
https://github.com/andrewliao11/py-faster-rcnn/blob/master/original_README.md

@gplhegde
Author

Thanks for your reply. I am following that guide and have not done anything fancy so far; I just followed the instructions in Readme.md (the one you mentioned above, but from the main repo). The only change is that I am using CPU mode, so I changed caffe.set_mode_gpu() to caffe.set_mode_cpu() in train_net.py and compiled Caffe without GPU support. I downloaded the PASCAL VOC 2007 dataset and uncompressed it into VOCdevkit2007. I also downloaded the ImageNet models that are used to initialize the network. After this, I simply ran the command "./experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc". Attached is the console output from running this script.
I think I need to change PROPOSAL_METHOD in the config YAML file. The current proposal method is 'gt' (what does gt refer to here? Ground-truth bounding boxes?). What should PROPOSAL_METHOD be if I want to select RPN as the proposal method? In other words, what is the 'gt' equivalent for RPN?

Thanks.
log.txt

@happyharrycn

Certain versions of EasyDict have a bug where the parameters in the YAML file are not propagated to the actual dictionary. A quick workaround is to set the values directly in config.py (according to your YAML file) and re-run your training.
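One way to check whether the YAML overrides actually reached the config is to compare the two after loading. The sketch below uses plain dicts rather than EasyDict, and the helper names `merge_into` and `find_unpropagated` are hypothetical, not part of py-faster-rcnn. The keys and values are examples only, standing in for what config.py and the end2end YAML file might contain.

```python
def merge_into(overrides, defaults):
    """Recursively copy values from `overrides` into `defaults` (in place)."""
    for key, value in overrides.items():
        if key not in defaults:
            raise KeyError("{} is not a valid config key".format(key))
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            merge_into(value, defaults[key])
        else:
            defaults[key] = value


def find_unpropagated(overrides, config, prefix=""):
    """Return dotted names of override keys whose values did not reach `config`."""
    missing = []
    for key, value in overrides.items():
        name = prefix + key
        if isinstance(value, dict) and isinstance(config.get(key), dict):
            missing.extend(find_unpropagated(value, config[key], name + "."))
        elif config.get(key) != value:
            missing.append(name)
    return missing


# Example values only (assumed for illustration).
defaults = {"TRAIN": {"PROPOSAL_METHOD": "selective_search", "HAS_RPN": False}}
overrides = {"TRAIN": {"PROPOSAL_METHOD": "gt", "HAS_RPN": True}}

before = find_unpropagated(overrides, defaults)  # overrides not applied yet
merge_into(overrides, defaults)
after = find_unpropagated(overrides, defaults)   # empty if the merge worked
print(before, after)
```

If a check like this reports leftover keys after the real config has been loaded, the overrides were not applied, and setting them by hand in config.py as suggested above is a reasonable stopgap.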

@gplhegde
Author

Thanks. I modified config.py to set the proposal method to rpn, and it is now using RPN as the proposal method.
But it looks like the end2end training method requires a region proposal file. When I run ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc, it fails in _load_rpn_roidb because self.config['rpn_file'] is None. How do I get rpn_file? The gt_roidb file is generated in ./data/cache from the XML annotation files. Attached is the tail of the log.
log.txt

@happyharrycn

You should set the proposal method to gt, as specified in the end2end YAML file. rpn is used for alternating optimization, where the region proposals are stored in files.
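The difference between the proposal methods can be sketched roughly as follows. This is an illustrative stand-in, not the repository's code: the class `ToyImdb`, the `ss_file` key, and the sample box are hypothetical. Only the method names `gt`, `rpn`, and `selective_search` and the `rpn_file` key come from this thread.

```python
class ToyImdb(object):
    """Hypothetical dataset wrapper showing what each proposal method needs."""

    def __init__(self, annotations, rpn_file=None, ss_file=None):
        self.annotations = annotations  # ground-truth boxes, one list per image
        self.config = {"rpn_file": rpn_file, "ss_file": ss_file}

    def gt_roidb(self):
        # Needs only the annotations. In end-to-end training the RPN inside
        # the network produces proposals on the fly, so no proposal file exists.
        return [{"boxes": boxes} for boxes in self.annotations]

    def rpn_roidb(self):
        # Needs proposals that an RPN wrote to disk in an earlier stage
        # (alternating optimization).
        if self.config["rpn_file"] is None:
            raise ValueError("rpn proposals requested but rpn_file is None")
        return self.gt_roidb()  # a real loader would also read the stored proposals

    def selective_search_roidb(self):
        # Needs precomputed selective search boxes.
        if self.config["ss_file"] is None:
            raise ValueError("Selective search data not found")
        return self.gt_roidb()

    def roidb(self, method):
        handlers = {
            "gt": self.gt_roidb,
            "rpn": self.rpn_roidb,
            "selective_search": self.selective_search_roidb,
        }
        return handlers[method]()


imdb = ToyImdb(annotations=[[(48, 240, 195, 371)]])
print(len(imdb.roidb("gt")))
try:
    imdb.roidb("rpn")
except ValueError as err:
    print(err)
```

With only annotations available, `gt` succeeds while `rpn` and `selective_search` fail for lack of a proposal file, which matches the two errors reported above.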

@slchiang

slchiang commented Jun 4, 2016

Hi @gplhegde,

Did you make it work?

I also followed the guide and ran ./experiments/scripts/faster_rcnn_end2end.sh 1 VGG_CNN_M_1024 pascal_voc

but I got the error as below:

I0603 22:12:46.401278 11555 layer_factory.hpp:77] Creating layer rpn_loss_cls
I0603 22:12:46.401286 11555 net.cpp:106] Creating Layer rpn_loss_cls
I0603 22:12:46.401290 11555 net.cpp:454] rpn_loss_cls <- rpn_cls_score_reshape_rpn_cls_score_reshape_0_split_0
I0603 22:12:46.401296 11555 net.cpp:454] rpn_loss_cls <- rpn_labels
I0603 22:12:46.401303 11555 net.cpp:411] rpn_loss_cls -> rpn_cls_loss
I0603 22:12:46.401314 11555 layer_factory.hpp:77] Creating layer rpn_loss_cls
F0603 22:12:46.402214 11555 loss_layer.cpp:19] Check failed: bottom[0]->num() == bottom[1]->num() (2 vs. 1) The data and label should have the same number.
*** Check failure stack trace: ***
./experiments/scripts/faster_rcnn_end2end.sh: line 57: 11555 Aborted

Do you have any suggestion about this?

Thank you!

@gplhegde
Author

Hi @slchiang, I dropped the VGG net as I don't have a good enough GPU for it. However, I am able to train Fast R-CNN (which requires you to prepare selective search data) using CaffeNet on my GPU.

@Huangying-Zhan

This reply is for the original "AssertionError: Selective search data not found". I encountered this problem and found a way to solve it. The main cause of the error is the version of easydict. This is only my experience, but I hope it solves the problem for others.
As mentioned by @happyharrycn, some versions of easydict have a problem passing settings on (from config.yml to the code that reads them). Therefore, the solution is to install a working version of easydict.
First, I installed easydict using conda install -c auto easydict, following Easydict::Anaconda. This version is 1.4. Although PROPOSAL_METHOD is set to 'gt' in config.yml, which means that RPN is used, easydict 1.4 fails to pass the configuration to train_net.py and causes the AssertionError: Selective search data not found. Even if you manually set PROPOSAL_METHOD in train_net.py, other errors still occur because the rest of the configuration is wrong!
After that, I searched for other easydict versions with anaconda search -t conda easydict and found verydeep/easydict, which is version 1.6.
So my solution was to install verydeep/easydict with conda install -c verydeep easydict. So far there have been no further problems and training is going well.

@qinhuan

qinhuan commented Oct 17, 2016

Thank you all!!!!

@duygusar

duygusar commented Oct 24, 2016

@Huangying-Zhan Hey, thanks for the clarification about the easydict version. I am having the exact same problem as in this thread. I installed easydict with pip and it is already version 1.6, but it still doesn't work somehow. I wonder whether it has to be the specific build that conda installs? I have been using plain Python, not the Anaconda distribution. Would that cause conflicts? (I am new to Python.)

@Huangying-Zhan

Huangying-Zhan commented Oct 26, 2016

Hi @duygusar, I think it should be the same.
Maybe you can check the version of your easydict first with $ pip freeze | grep easydict.
If it is still 1.4, you can upgrade it using $ pip install easydict --upgrade and check whether it works.
By the way, I recommend installing Anaconda since it is very convenient for Python; it includes a lot of useful Python packages.
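The same version check can also be done from Python. This is a minimal sketch, assuming Python 3.8 or newer for `importlib.metadata` (on older interpreters the helper just returns None). The helper names and the 1.6 threshold are illustrative: 1.6 is simply the version reported as working earlier in this thread.

```python
def version_tuple(version):
    """Turn a version string such as '1.6' into a comparable tuple (1, 6)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def installed_version(package):
    """Return the installed version string, or None if it cannot be determined."""
    try:
        from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
    except ImportError:
        return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


MIN_VERSION = (1, 6)  # assumption: taken from the report above, not an official requirement

found = installed_version("easydict")
if found is None:
    print("easydict version could not be determined")
elif version_tuple(found) < MIN_VERSION:
    print("easydict {} is older than 1.6; consider upgrading".format(found))
else:
    print("easydict {} found".format(found))
```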

@duygusar

duygusar commented Oct 26, 2016

@Huangying-Zhan Thank you. I have figured out that my problem was different after all: it was a problem with how the bash script passes parameters to train_net. I had commented out the --weights line, which caused the rest of the parameters not to be passed to train_net. It was fixed after removing that line completely. It looked like the same error because config.yml wasn't being passed correctly! Thank you for the tips about Anaconda; they are helpful for a Python newbie.
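A cheap guard against this kind of silent truncation is to have the training script report which options it never received. In a bash command continued with backslashes, a line commented out with # ends the command at that point, so the flags on the following lines never reach the script. The sketch below is a stand-in parser, not train_net.py itself: the flag names are modeled on the ones mentioned in this thread, and the file names are placeholders.

```python
import argparse


def build_parser():
    """Stand-in for a training script's argument parser (flag names assumed)."""
    parser = argparse.ArgumentParser(description="stand-in for a training script")
    parser.add_argument("--gpu", dest="gpu_id", type=int, default=0)
    parser.add_argument("--solver", dest="solver", default=None)
    parser.add_argument("--weights", dest="pretrained_model", default=None)
    parser.add_argument("--cfg", dest="cfg_file", default=None)
    return parser


def missing_options(args):
    """List options that were left at None, i.e. never passed on the command line."""
    required = ("solver", "pretrained_model", "cfg_file")
    return [name for name in required if getattr(args, name) is None]


parser = build_parser()

# What the script sees when every flag arrives.
full = parser.parse_args(["--gpu", "0", "--solver", "solver.prototxt",
                          "--weights", "model.caffemodel", "--cfg", "end2end.yml"])

# What it sees when the command is cut off after --solver.
truncated = parser.parse_args(["--gpu", "0", "--solver", "solver.prototxt"])

print(missing_options(full))
print(missing_options(truncated))
```

When `cfg_file` shows up as missing, the YAML file was never loaded and the run falls back to the defaults in config.py, which can look the same as the easydict problem.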

@EvanYellow

@Huangying-Zhan You're awesome, bro!

@lixiang-ucas

Hi @gplhegde, I'm trying alternating training on my own data. You said that you were able to train Fast R-CNN (which requires you to prepare selective search data) using CaffeNet on your GPU. Have you succeeded in using RPN proposals instead of selective search?

@ParanoidW

@Huangying-Zhan Thank you so much!


10 participants