implement R-FCN with py-faster-rcnn #9
Hi, thanks for your effort. The problem seems to come from the OHEM implementation. I suggest you try R-FCN without OHEM first, then carefully add OHEM training. |
Did you modify softmax_loss_layer? An additional output should be added to it, so as to get the classification loss. |
Yes, I have already changed softmax_loss_layer to match your file. I fixed this by changing roi_data_layer/layer.py line 128 to top[idx].reshape(1, 1, 1, 1), but then got:

    0721 18:01:12.584405 31924 softmax_loss_layer.cpp:49] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (1 vs. 4004)

I changed roi_data_layer/minibatch.py line 77 to:

    # blobs['rois'] = rois_blob
    # blobs['labels'] = labels_blob
    blobs['rois'] = rois_blob.transpose(1, 0).reshape((1, 1, 5, -1))
    blobs['labels'] = labels_blob.reshape((1, 1, 1, -1))
    if cfg.TRAIN.BBOX_REG:
        # blobs['bbox_targets'] = bbox_targets_blob
        # blobs['bbox_inside_weights'] = bbox_inside_blob
        # blobs['bbox_outside_weights'] = \
        #     np.array(bbox_inside_blob > 0).astype(np.float32)
        blobs['bbox_targets'] = bbox_targets_blob.transpose(1, 0).reshape((1, 1, 4 * num_classes, -1))
        blobs['bbox_inside_weights'] = bbox_inside_blob.transpose(1, 0).reshape((1, 1, 4 * num_classes, -1))
        blobs['bbox_outside_weights'] = \
            np.array(bbox_inside_blob > 0).astype(np.float32).transpose(1, 0).reshape((1, 1, 4 * num_classes, -1))

According to your matlab code, the shape of my labels is now (1, 1, 1, N), but I don't know the shape of cls_score. |
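Since these reshapes are exactly what trips the Caffe shape check quoted above, a quick numpy sanity check may help; this is an illustrative sketch with made-up sizes, not code from either repo:

```python
import numpy as np

# Hypothetical sizes: 4004 ROIs (as in the failed check above), 21 VOC classes.
N, num_classes = 4004, 21

labels_blob = np.zeros(N, dtype=np.float32)     # one class label per ROI
rois_blob = np.zeros((N, 5), dtype=np.float32)  # (batch_idx, x1, y1, x2, y2)

# The reshapes from the comment above:
labels = labels_blob.reshape((1, 1, 1, -1))
rois = rois_blob.transpose(1, 0).reshape((1, 1, 5, -1))

# SoftmaxWithLoss requires outer_num_ * inner_num_ == count of the label blob;
# with labels shaped (1, 1, 1, N) both sides equal N, so the check passes.
assert labels.shape == (1, 1, 1, N)
assert rois.shape == (1, 1, 5, N)
```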
Hi, please consider trying R-FCN without OHEM first. |
Hi Daiji, now I am able to run the whole 4-stage model (I just did a quick run to test whether there are any bugs; the model is still running). Can you share your training log on pascal_voc (much appreciated)? Thanks for your help! |
Hi, I am not sure. But in ZF there is a large fc layer, which would be converted into a large conv layer in R-FCN, so the computation is heavy. I suggest you experiment with ResNet. We have shared the step-wise training log, but did not store the joint training log. |
I have tried resnet50; the time is about 0.466s, so R-FCN (0.166s) really is much faster. BTW, it seems some information from your email messed up your reply. |
argman, I guess that you have a Python script based on py-faster-rcnn to run prediction. Are you OK with sharing your changes? Thanks. |
argman, glad to see that. Yes, you can minimize the other time cost by implementing pure C++-based detection code. But I think that should be the final step before, say, applying R-FCN in products. |
@kaishijeng it's not a single script; you need to make a few modifications to other files. @daijifeng001 during my training, I can see the region proposal time is about 0.08s/img, so it seems the main cost is region proposal and the scoring time is much less. As I said, resnet50 in faster rcnn is 0.466s including region proposal time, while R-FCN's 0.166s/img is without region proposal, so maybe the speedup is not that great. |
Hi, two points about the speedup. First, did you use CUDNN? R-FCN is faster without CUDNN. Second, as for region proposal time: the conv computation should be very fast, while most of the time should be spent on Python and NMS in your implementation. A GPU C++ implementation of the region proposal parser layer can minimize that cost. |
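To make the cost concrete: the proposal stage's NMS in py-faster-rcnn is greedy and runs in Python/numpy, roughly like the sketch below (a simplified illustration, not the repo's exact code):

```python
import numpy as np

def nms(dets, thresh):
    """Greedy NMS over (x1, y1, x2, y2, score) rows; returns kept indices."""
    x1, y1, x2, y2 = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3]
    scores = dets[:, 4]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the top box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes overlapping the kept box by more than the threshold
        order = order[np.where(iou <= thresh)[0] + 1]
    return keep
```

Each iteration does O(N) numpy work inside a Python loop, which is exactly the kind of cost a C++/GPU proposal layer removes.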
@daijifeng001, agreed, but I don't see a C++ RPN layer in your code? Do you know of a C++ implementation of RPN? |
And in psroi_pooling_layer.cu, line 43, you add 1 to the ROI end coordinate, but in py-faster-rcnn's roi_pooling_layer, bottom_rois[3] does not add 1. Can you explain this? |
We have an internal C++ implementation. It requires some careful implementation, but it's not that hard. As for roi_end_w and roi_end_h in psroi_pooling_layer and roi_pooling_layer, they are just different understandings of the ending location of an ROI and differ by 1 pixel. You may just ignore it. |
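The 1-pixel point can be illustrated numerically; the sketch below (made-up coordinates, spatial_scale omitted) shows that the two end-point conventions yield the same ROI width, differing only in where the "end" of the ROI is taken to be:

```python
# Hypothetical ROI in py-faster-rcnn's layout: (batch_idx, x1, y1, x2, y2)
roi = [0, 10, 20, 40, 60]

# py-faster-rcnn roi_pooling_layer: x2 treated as the last included pixel
w_frcn = roi[3] - roi[1] + 1   # 40 - 10 + 1 = 31

# R-FCN psroi_pooling_layer: end shifted by +1 first (x2 treated as exclusive)
roi_end_w = roi[3] + 1
w_rfcn = roi_end_w - roi[1]    # 41 - 10 = 31, same width, end shifted 1 pixel

assert w_frcn == w_rfcn == 31
```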
but the 1 pixel seems important: with your original implementation, the resulting bboxes seem totally wrong for the height and width of the roi, or maybe you have some preprocessing somewhere that I overlooked. And do you plan to open-source the RPN layer? |
We just use the exact RPN implementation from the Matlab version of Faster-RCNN. We run 4-step and joint training of RPN and R-FCN for feature sharing. Everything is fine in our experiments. |
the MATLAB version also does not add 1 pixel, |
I cannot find where it is wrong; here is the log. It seems ok. |
Hi argman, we cannot locate your problem. You may carefully check and compare the intermediate results of the Python and the Matlab code. To the best of our knowledge, some researchers/groups have successfully implemented R-FCN in Python. |
@daijifeng001 I want to confirm: if batch_size=128 with two images per batch, the labels shape is 128x1x1x1, each row a number in 0~20; the bbox_targets shape is 128x8x1x1, each row like [0, 0, 0, 0, 1.2, -0.75, 1.0, 0.55]; and the bbox_loss_weights shape is 128x8x1x1, each row [0, 0, 0, 0, 1, 1, 1, 1] for positives and [0, 0, 0, 0, 0, 0, 0, 0] for negatives. |
Hi argman, the first 4 dimensions of bbox_pred should be zero. There should be some bug in your code. |
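To illustrate the expected layout under class-agnostic regression (hypothetical values; the background slots 0-3 stay zero, the foreground slots 4-7 carry the targets):

```python
import numpy as np

N = 128  # ROIs per minibatch, as in the comment above

bbox_targets = np.zeros((N, 8), dtype=np.float32)
bbox_loss_weights = np.zeros((N, 8), dtype=np.float32)

# Mark the first 32 ROIs positive (a made-up fg/bg split for illustration)
pos = np.arange(32)
bbox_targets[pos, 4:] = [1.2, -0.75, 1.0, 0.55]  # (dx, dy, dw, dh)
bbox_loss_weights[pos, 4:] = 1.0

# The first 4 entries (background "class") are zero for every ROI
assert not bbox_targets[:, :4].any()
assert not bbox_loss_weights[:, :4].any()
```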
I have checked my input, and I really don't know why. I've sent you an email. Can we talk on WeChat? Thanks! |
@daijifeng001 Finally I'm able to run R-FCN with Python successfully! Thanks for your help! For others who want to implement this, I have some tips here: |
@daijifeng001 In my test with ZF+R-FCN, the mAP on the pascal_voc test set is only 35.9 at 0.053s/img, but with ZF+FRCN it's 59.3 at 0.073s/img. Maybe some threshold is different; can you explain the differences in testing between R-FCN and FRCNN? |
@argman This is incredibly interesting work. Are you able to share your work, for example as a fork of py-faster-rcnn? |
@ngaloppo I cannot reach the author's accuracy so far, and am still debugging. @daijifeng001 Have you tried end2end training? |
I have tried joint training of RPN and R-FCN using ResNet 101. Its accuracy is on par with multi-step training without OHEM. |
@daijifeng001 thanks, I tried end2end training, but bbox_pred seems to learn nothing; do you have any experience with this? And it's strange that when I use your matlab code to train resnet50 it takes about 4GB of GPU memory, but when I use the python code it takes about 8GB. |
It seems ZF5-net is not appropriate for R-FCN, since only one layer can be tuned with bbox regression. ResNet50 is much better, because we have conv5* to finetune; with resnet50 end2end training, 07trainval->07test can reach a mAP of 67.90 (0.175s/img). In my experiments there may be a better network structure for object detection, and most papers now use VGG. |
Hi, |
@weiyichang, in the roi_data layer it's still 21, because if you use RPN to generate proposals, its bbox regression is not class-agnostic. For bbox prediction, you should check snapshot in fast_rcnn/train.py: FRCNN saves the bbox mean and std into the convolution params, so you should modify that. |
@argman Hi, could you explain how to reshape rois, labels, bbox_targets and bbox_inside_weights? |
@xchani, you may change blobs['rois'] = rois_blob to blobs['rois'] = rois_blob.reshape((-1, 5, 1, 1)) in minibatch.py, and do the same for labels, bbox_targets and bbox_inside_weights. |
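For reference, a numpy sketch of the blob shapes those reshapes produce (N and the blob contents are placeholders; the 8-dim targets assume class-agnostic regression as discussed in this thread):

```python
import numpy as np

N = 128  # hypothetical ROI count
rois_blob = np.zeros((N, 5), dtype=np.float32)
labels_blob = np.zeros(N, dtype=np.float32)
bbox_targets_blob = np.zeros((N, 8), dtype=np.float32)

blobs = {
    'rois': rois_blob.reshape((-1, 5, 1, 1)),
    'labels': labels_blob.reshape((-1, 1, 1, 1)),
    'bbox_targets': bbox_targets_blob.reshape((-1, 8, 1, 1)),
}
# One ROI per slot along the first (num) axis, Caffe NxCxHxW layout
assert blobs['rois'].shape == (N, 5, 1, 1)
assert blobs['labels'].shape == (N, 1, 1, 1)
assert blobs['bbox_targets'].shape == (N, 8, 1, 1)
```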
@argman Is it working now? |
@duanLH, yes, both 4-stage training and end2end work, but I cannot reproduce the mAP in the paper. |
@argman Wow, can you send me the py version? The Matlab version doesn't work on my computer. |
@duanLH sorry, I can't right now, but if you have any questions, I'd be glad to help. |
@argman OK ~thanks, |
@argman if you set `num_classes` to 21, the size of bbox_targets would be (N, 84, 1, 1). How is it possible to reshape that to (-1, 8, 1, 1)? |
@xchani the paper uses class-agnostic bbox regression, so it's 8. |
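One way to picture the reduction from 84 to 8 dimensions is a hypothetical helper that keeps only each foreground ROI's own class targets (the real data layer computes agnostic targets directly; this is just an illustration):

```python
import numpy as np

def to_agnostic(bbox_targets, labels):
    """Reduce (N, 4*num_classes) per-class targets to (N, 8) agnostic targets:
    cols 0-3 are the background slot (zero), cols 4-7 hold the targets of
    each ROI's own class."""
    N = bbox_targets.shape[0]
    out = np.zeros((N, 8), dtype=bbox_targets.dtype)
    for i, cls in enumerate(labels.astype(int)):
        if cls > 0:  # foreground ROI: copy its class's 4 targets
            out[i, 4:] = bbox_targets[i, 4 * cls:4 * cls + 4]
    return out
```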
@argman I added box_annotator_ohem_layer.cu and box_annotator_ohem_layer.cpp to faster-rcnn's src/caffe/layers, but compilation fails with: src/caffe/layers/box_annotator_ohem_layer.cu(50): error: a template argument may not reference a local type (reported twice); 2 errors detected in the compilation of "/tmp/tmpxft_0000576e_00000000-16_box_annotator_ohem_layer.compute_50.cpp1.ii". |
@duanLH box_annotator_ohem_layer uses C++11 features; you need to change the Makefile according to this commit. |
@oh233 thank you ~ |
@argman Is there any way to save 'bbox_pred'? I have no idea which file to modify. |
@xchani, it's the output of the network; why do you need to save it? |
@argman I mean that when doing a snapshot, the rois' mean and std are not saved into the bbox_pred layer. |
@xchani, it's mostly the same as py-frcnn; anyway, here is mine:

    def snapshot(self):
        """Take a snapshot of the network after unnormalizing the learned
        bounding-box regression weights. This enables easy use at test-time.
        """
        net = self.solver.net
        # scale_bbox_params = (cfg.TRAIN.BBOX_REG and
        #                      cfg.TRAIN.BBOX_NORMALIZE_TARGETS and
        #                      net.params.has_key('bbox_pred'))
        scale_bbox_params = (cfg.TRAIN.BBOX_REG and
                             cfg.TRAIN.BBOX_NORMALIZE_TARGETS and
                             net.params.has_key('rfcn_bbox'))
        if scale_bbox_params:
            # save original values
            orig_0 = net.params['rfcn_bbox'][0].data.copy()
            orig_1 = net.params['rfcn_bbox'][1].data.copy()
            # repeat the means/stds to match the rfcn_bbox channel count
            rep_time = orig_0.shape[0] / self.bbox_means.shape[0]
            bbox_stds = self.bbox_stds.flatten().reshape((-1, 1))
            bbox_stds = np.tile(bbox_stds, rep_time)
            bbox_stds = bbox_stds.flatten().reshape((-1, 1, 1, 1))
            bbox_means = self.bbox_means.flatten().reshape((-1, 1))
            bbox_means = np.tile(bbox_means, rep_time)
            bbox_means = bbox_means.flatten().reshape((-1, 1, 1, 1))
            # scale and shift with bbox reg unnormalization; then save snapshot
            net.params['rfcn_bbox'][0].data[...] = \
                (net.params['rfcn_bbox'][0].data * bbox_stds)
            net.params['rfcn_bbox'][1].data[...] = \
                (net.params['rfcn_bbox'][1].data * bbox_stds.flatten() +
                 bbox_means.flatten())
        infix = ('_' + cfg.TRAIN.SNAPSHOT_INFIX
                 if cfg.TRAIN.SNAPSHOT_INFIX != '' else '')
        filename = (self.solver_param.snapshot_prefix + infix +
                    '_iter_{:d}'.format(self.solver.iter) + '.caffemodel')
        filename = os.path.join(self.output_dir, filename)
        net.save(str(filename))
        print 'Wrote snapshot to: {:s}'.format(filename)
        if scale_bbox_params:
            # restore net to original state
            net.params['rfcn_bbox'][0].data[...] = orig_0
            net.params['rfcn_bbox'][1].data[...] = orig_1
        return filename |
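The unnormalization step in snapshot() can be sanity-checked in isolation: folding the target stds/means into the weights and bias makes the raw network output equal the hand-unnormalized prediction. A numpy sketch with made-up shapes:

```python
import numpy as np

np.random.seed(0)
W = np.random.rand(8, 16)   # hypothetical rfcn_bbox weights, flattened
b = np.random.rand(8)       # hypothetical bias
x = np.random.rand(16)      # hypothetical input feature vector
stds = np.tile([0.1, 0.1, 0.2, 0.2], 2)   # per-target stds (agnostic, 8-dim)
means = np.zeros(8)                       # per-target means

# Unnormalize the prediction by hand...
pred = (np.dot(W, x) + b) * stds + means

# ...versus folding stds/means into the params, as snapshot() does
W2 = W * stds[:, None]
b2 = b * stds + means
pred2 = np.dot(W2, x) + b2

assert np.allclose(pred, pred2)
```

This is why the saved model can be used directly at test time without applying the normalization statistics again.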
@argman thanks for your efforts. I tried OHEM with resnet50 in end2end style on 07+12 trainval and achieved 76.91% mAP on the 07 test set. |
@zimenglan-sysu-512, that's great! So OHEM really is helpful! |
@argman Since you have been able to reproduce the paper's results, are you able to share your python implementation? |
@orpine Super! I will have a look. Thanks for the heads-up! |
Given @orpine 's nice work, we close this issue. |
@duanLH Have you solved this problem? My Makefile is correct, but I still get the error and have no idea how to solve it. Could you please tell me? src/caffe/layers/box_annotator_ohem_layer.cu(50): error: a template argument may not reference a local type (reported twice); 2 errors detected in the compilation of "/tmp/tmpxft_0000576e_00000000-16_box_annotator_ohem_layer.compute_50.cpp1.ii". |
@whuhxb (as @duanLH already suggested) add "-std=c++11" to the CXX flags! | How do I do that? |
Hi, I'm trying to implement R-FCN in py-faster-rcnn, but have encountered several issues.
I made these changes to py-faster-rcnn:
And these changes to R-FCN:
Now I am able to re-compile caffe and train the RPN net, but when training fast-rcnn, I get a bug:
I0721 12:24:04.370088 12785 layer_factory.hpp:77] Creating layer per_roi_loss_cls
I0721 12:24:04.370101 12785 net.cpp:106] Creating Layer per_roi_loss_cls
I0721 12:24:04.370105 12785 net.cpp:454] per_roi_loss_cls <- cls_score_ave_cls_score_rois_0_split_0
I0721 12:24:04.370110 12785 net.cpp:454] per_roi_loss_cls <- labels_data_2_split_0
I0721 12:24:04.370113 12785 net.cpp:411] per_roi_loss_cls -> temp_loss_cls
I0721 12:24:04.370120 12785 net.cpp:411] per_roi_loss_cls -> temp_prob_cls
I0721 12:24:04.370124 12785 net.cpp:411] per_roi_loss_cls -> per_roi_loss_cls
I0721 12:24:04.370139 12785 layer_factory.hpp:77] Creating layer per_roi_loss_cls
I0721 12:24:04.370525 12785 net.cpp:150] Setting up per_roi_loss_cls
I0721 12:24:04.370537 12785 net.cpp:157] Top shape: (1)
I0721 12:24:04.370553 12785 net.cpp:157] Top shape: 1 21 1 1 (21)
I0721 12:24:04.370556 12785 net.cpp:157] Top shape: 1 (1)
I0721 12:24:04.370559 12785 net.cpp:165] Memory required for data: 678497932
I0721 12:24:04.370563 12785 layer_factory.hpp:77] Creating layer per_roi_loss_bbox
I0721 12:24:04.370575 12785 net.cpp:106] Creating Layer per_roi_loss_bbox
I0721 12:24:04.370579 12785 net.cpp:454] per_roi_loss_bbox <- bbox_pred_ave_bbox_pred_rois_0_split_0
I0721 12:24:04.370584 12785 net.cpp:454] per_roi_loss_bbox <- bbox_targets_data_3_split_0
I0721 12:24:04.370589 12785 net.cpp:454] per_roi_loss_bbox <- bbox_inside_weights_data_4_split_0
I0721 12:24:04.370595 12785 net.cpp:411] per_roi_loss_bbox -> temp_loss_bbox
I0721 12:24:04.370601 12785 net.cpp:411] per_roi_loss_bbox -> per_roi_loss_bbox
I0721 12:24:04.370652 12785 net.cpp:150] Setting up per_roi_loss_bbox
I0721 12:24:04.370658 12785 net.cpp:157] Top shape: (1)
I0721 12:24:04.370662 12785 net.cpp:157] Top shape: 1 1 1 1 (1)
I0721 12:24:04.370664 12785 net.cpp:165] Memory required for data: 678497940
I0721 12:24:04.370667 12785 layer_factory.hpp:77] Creating layer per_roi_loss
I0721 12:24:04.370679 12785 net.cpp:106] Creating Layer per_roi_loss
I0721 12:24:04.370682 12785 net.cpp:454] per_roi_loss <- per_roi_loss_cls
I0721 12:24:04.370688 12785 net.cpp:454] per_roi_loss <- per_roi_loss_bbox
I0721 12:24:04.370692 12785 net.cpp:411] per_roi_loss -> per_roi_loss
F0721 12:24:04.370698 12785 eltwise_layer.cpp:34] Check failed: bottom[i]->shape() == bottom[0]->shape() 1 0 0 0
*** Check failure stack trace: ***
Thanks for your help!