Training FasterRCNN without pre-trained network? #238

Closed
tiepnh opened this issue Jul 1, 2016 · 56 comments

Comments

@tiepnh

tiepnh commented Jul 1, 2016

Hi all,
I got the error:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"
It seems that the trained network has learned nothing.

I followed the original steps in https://github.com/rbgirshick/py-faster-rcnn
and only modified the script ./experiments/scripts/faster_rcnn_end2end.sh to remove the line " --weights data/imagenet_models/${NET}.v2.caffemodel ".
Training finishes and produces the caffemodel file.

Has anyone faced this error? Could you please suggest a solution?
Thank you,

@tiepnh tiepnh changed the title Training FasterRCNN without initial weight Training FasterRCNN without pre-trained network? Jul 4, 2016
@manipopopo

manipopopo commented Jul 5, 2016

Maybe your model didn't learn anything (because it was not initialized with the pre-trained model parameters).
The model's predictions (over all images in the test set) then contain no boxes labeled with some class C, so BB becomes an empty array.

@tiepnh
Author

tiepnh commented Jul 8, 2016

@manipopopo : Thank you for your answer. I understand that my trained model contains nothing. I am just confused about why the model cannot learn anything without a pre-trained model.
If you know anything about this, please share it with me.
Thank you

@manipopopo

manipopopo commented Jul 8, 2016

Hi @tiepnh ,

Which ${NET} did you try? The lr_mult of the layers before conv2 in VGG_CNN_M_1024 and the layers before conv3_1 in VGG16 are set to 0, that is, the bottom layers of the networks won't learn anything during the training.

Even if all the lr_mult values are non-zero (as in ZF net), the hyperparameters (e.g. weight decay, training max_iteration and learning rates) need to be tuned. The provided hyperparameters are tuned for training models initialized from pre-trained models.

@tiepnh
Author

tiepnh commented Jul 11, 2016

Hi @manipopopo : Thank you for your answer. I will check those hyperparameters again.
Thanks,

@tiepnh
Author

tiepnh commented Jul 12, 2016

Hi @manipopopo : I tried training ZF without a pre-trained model, using the solver below:

base_lr: 0.001
lr_policy: "step"
gamma: 0.8
stepsize: 7000
display: 20
average_loss: 100
momentum: 0.9
weight_decay: 0.0005
max_iteration is 70000

But it still cannot learn anything. Can you give me some advice?

@manipopopo

  • You need to specify weight_filler for all Convolution and InnerProduct layers in the proto, otherwise all parameters will be initialized with zero.
  • You may want to check lr_policy exp, inv and poly (see caffe.proto for further information). Besides, hyperparameters include __C.TRAIN.{SOMETHING} in config.py.
  • Even if the model initialized randomly learns something, it may still perform worse than one initialized with pre-trained weights.
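
For comparing the lr_policy options mentioned above, here is a rough sketch of the schedules as documented in the comments of caffe.proto. The function name caffe_lr and its argument defaults are made up for illustration and are not part of Caffe; the example values come from the solver quoted earlier in this thread.

```python
def caffe_lr(policy, it, base_lr, gamma=0.0, stepsize=1, power=1.0, max_iter=1):
    """Learning rate at iteration `it` for four Caffe lr_policy values.

    Hypothetical helper: only the formulas follow caffe.proto.
    """
    if policy == "step":
        # Drop by a factor of gamma every `stepsize` iterations.
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1.0 + gamma * it) ** (-power)
    if policy == "poly":
        # Decays to zero at max_iter.
        return base_lr * (1.0 - float(it) / max_iter) ** power
    raise ValueError("unsupported lr_policy: %s" % policy)


# Solver from this thread: base_lr 0.001, step policy, gamma 0.8, stepsize 7000.
for it in (0, 7000, 35000, 69999):
    lr = caffe_lr("step", it, base_lr=0.001, gamma=0.8, stepsize=7000)
    print("iter %d -> lr %.6g" % (it, lr))
```

With gamma 0.8 the rate only falls to roughly a tenth of base_lr by the end of 70000 iterations, which may or may not suit training from scratch; plotting a few candidate schedules this way is cheaper than running them.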

@tiepnh
Author

tiepnh commented Jul 12, 2016

Hi @manipopopo. Thanks for your support.

You need to specify weight_filler for all Convolution and InnerProduct layers in the proto, otherwise all parameters will be initialized with zero.

So it means that every convolution or inner_product layer in the proto needs something like "weight_filler { type: "gaussian" std: 0.01 }"?
My problem is that the network doesn't learn anything without pre-trained weights, so my first target is to make it learn something, even if the performance is worse than with pre-trained weights. That would still be a big step for me.
I will try again following your advice. Thanks again

@manipopopo

Before starting training a model on the whole data set by 70000 iterations, you may want to experiment with a tiny training data set (2-20 images), and make sure that the model can overfit the training data set.
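
One way to set up the tiny-dataset experiment suggested above is to write a small image-set file and train on that. This is only a sketch: it assumes an image-set file with one image id per line, the helper names are invented here, and the dataset code would still need to be pointed at the new split.

```python
import random


def tiny_image_set(image_ids, n=10, seed=0):
    """Pick a small, reproducible subset of image ids for an overfitting test."""
    ids = sorted(set(image_ids))
    rng = random.Random(seed)  # fixed seed so reruns use the same images
    return sorted(rng.sample(ids, min(n, len(ids))))


def write_image_set(src_path, dst_path, n=10):
    """Read one image id per line from src_path and write n of them to dst_path.

    Both paths are placeholders chosen by the caller.
    """
    with open(src_path) as f:
        ids = [line.strip() for line in f if line.strip()]
    with open(dst_path, "w") as f:
        f.write("\n".join(tiny_image_set(ids, n)) + "\n")
```

If the training loss does not approach zero on 2-20 images, the problem is more likely initialization or frozen layers than the amount of data.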

@tiepnh
Author

tiepnh commented Jul 14, 2016

Hi @manipopopo: Following your guide, I can train the network without a pre-trained network.
But, as you said, the mAP is lower than that of a model initialized with a pre-trained network. It is still good enough for me.
Thanks a lot

@tolry418

tolry418 commented Aug 2, 2016

@tiepnh Hi. I have the same problem as you. I would really appreciate it if you could tell me how to modify the train.prototxt for faster_rcnn_end2end. I don't want to use a pre-trained model. Could you give me more detail?

@tiepnh
Author

tiepnh commented Aug 2, 2016

@tolry418 : You should follow @manipopopo's earlier comments.
Or you can try this prototxt (from #56): https://github.com/rbgirshick/py-faster-rcnn/files/380443/train_val.txt
Based on that, you no longer need a pre-trained network.

@tolry418

tolry418 commented Aug 2, 2016

@tiepnh Thanks.
I opened the file you gave me, but it has no RPN or RCNN parts. It is just the VGG model, isn't it?
What should I refer to in order to train Faster R-CNN?
Also, I don't understand @manipopopo's comments.
Do I have to run 70000 iterations? And do I have to put weight_filler in every Convolution and InnerProduct layer?
I'm wondering how you modified your end2end train.prototxt.
In my case I used VGG16/faster_rcnn_end2end/train.prototxt.
Thanks.

@tiepnh
Author

tiepnh commented Aug 3, 2016

@tolry418 : I'm sorry, I sent you the wrong prototxt file. That file is only for pre-training the network.
To solve your issue, you just need to add weight_filler to all Convolution and InnerProduct layers in the proto (put the line "weight_filler { type: "gaussian" std: 0.01 }" in every convolution_param and inner_product_param). You can look at the prototxt from my previous comment to see how weight_filler is added.

These changes should let your network learn something, though I am not sure whether the accuracy of the final model will be good or bad. So you don't need to change the other hyperparameters (such as iterations, learning rate, ...) for now. Just keep the defaults and check whether the network can learn anything.
I hope this solves your issue.
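
Adding the filler line by hand to every layer is error-prone, so a small text-processing sketch may help. It is not part of py-faster-rcnn: add_weight_fillers is a made-up helper, and it assumes each convolution_param / inner_product_param block opens on its own line, as in the prototxt files quoted in this thread.

```python
import re

# Filler line taken from the comment above.
FILLER = 'weight_filler { type: "gaussian" std: 0.01 }'
PARAM_OPEN = re.compile(r"^\s*(convolution_param|inner_product_param)\s*\{\s*$")


def add_weight_fillers(prototxt_text, filler=FILLER):
    """Insert `filler` into every convolution_param / inner_product_param
    block that does not already contain a weight_filler."""
    out = []
    in_block, depth, has_filler = False, 0, False
    for line in prototxt_text.splitlines():
        if not in_block and PARAM_OPEN.match(line):
            in_block, depth, has_filler = True, 1, False
            out.append(line)
            continue
        if in_block:
            if "weight_filler" in line:
                has_filler = True
            depth += line.count("{") - line.count("}")
            if depth == 0:
                # About to close the block: add the filler if it was missing.
                if not has_filler:
                    out.append("  " + filler)
                in_block = False
        out.append(line)
    return "\n".join(out) + "\n"
```

Running it twice leaves the file unchanged, so it is safe to apply to a prototxt that already has fillers in some layers. It only handles initialization; it does not touch lr_mult.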

@tolry418

tolry418 commented Aug 3, 2016

@tiepnh Thanks for your reply. As you suggested, I put the line "weight_filler { type : "gaussian" std : 0.01}" in all Convolution and InnerProduct layers of train.prototxt, which is in the py-faster_rcnn/models/pascal_voc/VGG16/faster_rcnn_end2end folder.
I changed nothing else apart from adding the weight_filler.

Like below:

layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }

}

}

But I can't train it. I still get the same error...
What did I do wrong?

Thanks.

@tiepnh
Author

tiepnh commented Aug 3, 2016

@tolry418 : Can you upload your full proto file and your faster_rcnn_end2end.sh file?
Please also upload the config you are using. Based on that, maybe we can resolve the issue.

@tolry418

tolry418 commented Aug 4, 2016

OK. You may have misunderstood my situation.
I can train the network. The problem appears at test time, when I use the model trained without the pre-trained model.
The same error occurs at the end of testing:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"

This is the train.prototxt that I modified.


name: "VGG_ILSVRC_16_layers"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 21"
}
}

layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0.1 }
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}

#========= RPN ============

layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "conv5_3"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 512
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "rpn_relu/3x3"
type: "ReLU"
bottom: "rpn/output"
top: "rpn/output"
}

layer {
name: "rpn_cls_score"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_cls_score"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 18 # 2(bg/fg) * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}

layer {
name: "rpn_bbox_pred"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_bbox_pred"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 36 # 4 * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}

layer {
bottom: "rpn_cls_score"
top: "rpn_cls_score_reshape"
name: "rpn_cls_score_reshape"
type: "Reshape"
reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
}

layer {
name: 'rpn-data'
type: 'Python'
bottom: 'rpn_cls_score'
bottom: 'gt_boxes'
bottom: 'im_info'
bottom: 'data'
top: 'rpn_labels'
top: 'rpn_bbox_targets'
top: 'rpn_bbox_inside_weights'
top: 'rpn_bbox_outside_weights'
python_param {
module: 'rpn.anchor_target_layer'
layer: 'AnchorTargetLayer'
param_str: "'feat_stride': 16"
}
}

layer {
name: "rpn_loss_cls"
type: "SoftmaxWithLoss"
bottom: "rpn_cls_score_reshape"
bottom: "rpn_labels"
propagate_down: 1
propagate_down: 0
top: "rpn_cls_loss"
loss_weight: 1
loss_param {
ignore_label: -1
normalize: true
}
}

layer {
name: "rpn_loss_bbox"
type: "SmoothL1Loss"
bottom: "rpn_bbox_pred"
bottom: "rpn_bbox_targets"
bottom: 'rpn_bbox_inside_weights'
bottom: 'rpn_bbox_outside_weights'
top: "rpn_loss_bbox"
loss_weight: 1
smooth_l1_loss_param { sigma: 3.0 }
}

#========= RoI Proposal ============

layer {
name: "rpn_cls_prob"
type: "Softmax"
bottom: "rpn_cls_score_reshape"
top: "rpn_cls_prob"
}

layer {
name: 'rpn_cls_prob_reshape'
type: 'Reshape'
bottom: 'rpn_cls_prob'
top: 'rpn_cls_prob_reshape'
reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}

layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rpn_rois'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 21"
}
}

#========= RCNN ============

layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv5_3"
bottom: "rois"
top: "pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "fc7"
top: "cls_score"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 21
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "fc7"
top: "bbox_pred"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 84
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
propagate_down: 1
propagate_down: 0
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "loss_bbox"
loss_weight: 1

}

This is faster_rcnn_end2end.sh

set -x
set -e

export PYTHONUNBUFFERED="True"

GPU_ID=$1
NET=$2
NET_lc=${NET,,}
DATASET=$3

array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:3:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}

case $DATASET in
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
PT_DIR="pascal_voc"
ITERS=70000
;;
coco)
# This is a very long and slow training schedule
# You can probably use fewer iterations and reduce the
# time to the LR drop (set in the solver to 350,000 iterations).
TRAIN_IMDB="coco_2014_train"
TEST_IMDB="coco_2014_minival"
PT_DIR="coco"
ITERS=490000
;;
*)
echo "No dataset given"
exit
;;
esac

LOG="experiments/logs/faster_rcnn_end2end_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"

time ./tools/train_net.py --gpu ${GPU_ID} \
  --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt \
  --weights data/imagenet_models/${NET}.v2.caffemodel \
  --imdb ${TRAIN_IMDB} \
  --iters ${ITERS} \
  --cfg experiments/cfgs/faster_rcnn_end2end.yml \
  ${EXTRA_ARGS}

set +x
NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'`
set -x

time ./tools/test_net.py --gpu ${GPU_ID} \
  --def models/${PT_DIR}/${NET}/faster_rcnn_end2end/test.prototxt \
  --net ${NET_FINAL} \
  --imdb ${TEST_IMDB} \
  --cfg experiments/cfgs/faster_rcnn_end2end.yml \
  ${EXTRA_ARGS}


This is the extra config file that I apply on top of the original config file.

EXP_DIR: faster_rcnn_end2end
TRAIN:
  HAS_RPN: True
  IMS_PER_BATCH: 1
  BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_BATCHSIZE: 256
  PROPOSAL_METHOD: gt
  BG_THRESH_LO: 0.0
TEST:
  HAS_RPN: True


Thanks for your help.

@tiepnh
Author

tiepnh commented Aug 4, 2016

@tolry418 : In faster_rcnn_end2end.sh, please remove the line "--weights data/imagenet_models/${NET}.v2.caffemodel " so that the pre-trained network is not used.
I'm not sure whether it will help, but please try it.
Another tip: test with fewer iterations first (e.g. change 70000 in faster_rcnn_end2end.sh to 10000).

@tolry418

tolry418 commented Aug 4, 2016

@tiepnh
The faster_rcnn_end2end.sh above is the original file as provided. I did not touch anything in it.
I already trained the model without the pre-trained model.

I run the test directly on the command line like this:
./tools/test_net.py --def models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt --net output/faster_rcnn_end2end/voc_2007_train/vgg16_faster_rcnn_iter_20000.caffemodel --cfg experiments/cfgs/faster_rcnn_end2end.yml

As I wrote above,
--net output/faster_rcnn_end2end/voc_2007_train/vgg16_faster_rcnn_iter_20000.caffemodel
is the model I trained without the pre-trained model.

But when I run the test, I run into this problem:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"

Thanks for your help.

@manipopopo

@tolry418

BB contains all boxes of some specific class. If the model doesn't find any boxes for some class from the whole test data set, BB will be an empty (1-d) array. So calling BB[sorted_ind, :] will lead to IndexError.

Maybe you should remove all lr_mult: 0 from conv1 and conv2 in the prototxt.
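
The empty-array failure described above can be reproduced in a few lines of NumPy. This is a sketch, not the repository's evaluation code: the row layout [image_id, confidence, x1, y1, x2, y2] and the function name sort_detections are assumptions made for illustration. The reshape is one possible guard that keeps BB two-dimensional when a class has no detections; it hides the crash but does not fix a model that detects nothing.

```python
import numpy as np


def sort_detections(rows):
    """rows: list of [image_id, confidence, x1, y1, x2, y2] for one class
    (assumed layout). Returns boxes and scores sorted by descending score."""
    confidence = np.array([float(r[1]) for r in rows])
    # reshape(-1, 4) keeps BB two-dimensional even when rows is empty.
    BB = np.array([[float(z) for z in r[2:]] for r in rows]).reshape(-1, 4)
    sorted_ind = np.argsort(-confidence)
    return BB[sorted_ind, :], confidence[sorted_ind]


# Reproduce the failure: with no detections, np.array([]) is 1-d,
# so indexing it with two indices raises IndexError.
empty = np.array([])
try:
    empty[np.argsort(-empty), :]
except IndexError as err:
    print("IndexError:", err)

# With the guard, an empty class yields an empty (0, 4) array instead.
boxes, scores = sort_detections([])
print(boxes.shape)
```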

@karaspd

karaspd commented Aug 23, 2016

@tolry418, if you still haven't figured out your issue, the problem might be your first four convolution layers. In the prototxt you provided, the learning rate multiplier (lr_mult) for these layers is 0, so they never update and stay at their random initialization. Since these layers are important for filtering images, the network could still fail to learn anything even after 100000 iterations.

@CassieMai

@tiepnh Hello, I have the problem of

BB = BB[sorted_ind, :]
IndexError: too many indices for array

I followed the comments above and modified train.prototxt by adding weight_filler and bias_filler, but I still get the IndexError. Then I stopped using the pre-trained model VGG16.v2.caffemodel to initialize Faster R-CNN, and now I get a new error:

AssertionError: Selective search data not found at: 
/py-faster-rcnn/data/selective_search_data/voc_2007_trainval.mat

Can you help me to solve it? Thank you very much.

@tiepnh
Author

tiepnh commented Apr 21, 2017

Hi @CassieMai:
Did you use the config from ./experiments/cfgs/faster_rcnn_end2end.yml?
Make sure PROPOSAL_METHOD is set to "gt". (By default, PROPOSAL_METHOD is set to selective_search, and I think you have no selective_search data.)
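
To illustrate why the yml file matters, here is a toy model of defaults plus overrides. It is not the project's config code: merge_config is a made-up helper, and the default values shown are assumptions based on the comment above. If the yml is never loaded (for example, if the training tool is started without its --cfg argument), the selective_search default would stay in effect.

```python
def merge_config(defaults, overrides):
    """Recursively overlay `overrides` on a copy of `defaults`."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed defaults, per the comment above (HAS_RPN default is also assumed).
DEFAULTS = {"TRAIN": {"PROPOSAL_METHOD": "selective_search", "HAS_RPN": False}}
# Relevant keys of faster_rcnn_end2end.yml as quoted in this thread.
END2END = {"TRAIN": {"PROPOSAL_METHOD": "gt", "HAS_RPN": True}}

print(merge_config(DEFAULTS, {})["TRAIN"]["PROPOSAL_METHOD"])       # yml not loaded
print(merge_config(DEFAULTS, END2END)["TRAIN"]["PROPOSAL_METHOD"])  # yml loaded
```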

@CassieMai

@tiepnh Yes, I used faster_rcnn_end2end.yml and kept its default setting PROPOSAL_METHOD = gt. It seems the config file didn't make a difference.

EXP_DIR: faster_rcnn_end2end
TRAIN:
  HAS_RPN: True
  IMS_PER_BATCH: 1
  BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_BATCHSIZE: 256
  PROPOSAL_METHOD: gt
  BG_THRESH_LO: 0.0
TEST:
  HAS_RPN: True

@hxj2012

hxj2012 commented Aug 13, 2017

@tolry418
Hi, did you figure out the problem with training VGG16 Faster R-CNN without a pre-trained model?
I ran into the same problem. Thank you!

@whmin

whmin commented Nov 8, 2017

@CassieMai I hit the same error: AssertionError: Selective search data not found at:
/py-faster-rcnn/data/selective_search_data/voc_2007_trainval.mat
How did you handle it?
Thank you very much!!

@RichardMrLu

@CassieMai @whmin I hit the same error: AssertionError: Selective search data not found
Did you solve it? Thanks for your answer.

@CassieMai

@whmin @RichardMrLu Hello, that was a long time ago. I don't remember whether I solved it, because I switched to a TensorFlow version of the code instead. You may follow the posts above or try other approaches. Sorry about that.

@ujsyehao

I have a question: why is selective search data needed at all? We use the RPN instead of selective search.

@FredaZhang338

@tiepnh @manipopopo Hi, since the lr_mult of the layers before conv3_1 in VGG16 is set to 0, I am not sure whether I need to add weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } to all layers in VGG16, or only to the layers before conv3_1. Thanks for your help!

@manipopopo

If you don't load pretrained models, you'll need to make sure that the weights of all convolution layers are initialized with some random tensors, and all convolution layers are learnable. That is, all convolution layers in prototxt should have something like

param { 
  lr_mult: {GREATER_THAN_ZERO} 
  decay_mult: ...
}
# if bias_term is true
param { 
  lr_mult: {GREATER_THAN_ZERO} 
  decay_mult: ...
}
convolution_param {
  ...
  weight_filler {
    # initialize weights with random values
    type: {gaussian, ...}
    ...
  }
}
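
Both conditions above (a non-zero lr_mult and a weight_filler) can be checked mechanically before training from scratch. The sketch below is a plain-text audit rather than a real protobuf parser: audit and iter_layers are invented names, and it assumes each layer's own name: and type: fields appear before any nested filler block, as in the prototxt files quoted in this thread.

```python
import re


def iter_layers(prototxt_text):
    """Yield the text of each top-level `layer { ... }` block."""
    depth, block = 0, None
    for raw in prototxt_text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments
        if depth == 0 and re.match(r"\s*layer\s*\{", line):
            block = []
        if block is not None:
            block.append(line)
        depth += line.count("{") - line.count("}")
        if depth == 0 and block is not None:
            yield "\n".join(block)
            block = None


def audit(prototxt_text):
    """Return (layer name, problem) pairs for Convolution / InnerProduct
    layers that have a zero lr_mult or no weight_filler."""
    problems = []
    for layer in iter_layers(prototxt_text):
        kind = re.search(r"""type:\s*["'](\w+)["']""", layer)
        if not kind or kind.group(1) not in ("Convolution", "InnerProduct"):
            continue
        found = re.search(r"""name:\s*["']([^"']+)["']""", layer)
        name = found.group(1) if found else "?"
        lr_mults = [float(v) for v in re.findall(r"lr_mult:\s*([0-9.]+)", layer)]
        if any(v == 0 for v in lr_mults):
            problems.append((name, "lr_mult is 0"))
        if "weight_filler" not in layer:
            problems.append((name, "no weight_filler"))
    return problems
```

Calling audit(open(path).read()) on the train.prototxt posted earlier in this thread would be expected to flag conv1_1 through conv2_2 for lr_mult: 0, which matches the advice given above.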

@FredaZhang338

@manipopopo OK, thanks, I get it! I have another question now: if I modify train.prototxt for training, what about test.prototxt? Do I need to modify it as well (add the weights) before testing? Thanks for your help!

@manipopopo

If you only change lr_mult and *_filler, you can use the corresponding deploy.prototxt directly.

@fanw52

fanw52 commented May 24, 2018

That is to say, whether or not the method is set to 'gt', the network will use the RPN to get its proposals. Thanks @CassieMai @ujsyehao

@Ram-Godavarthi

Hi guys, I get this error when I try to run the script below.
What could be the problem?

........~/FRCN_ROOT$ ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ GPU_ID=1
+ NET=VGG16
+ NET_lc=vgg16
+ DATASET=
+ array=($@)
+ len=2
+ EXTRA_ARGS=
+ EXTRA_ARGS_SLUG=
+ case $DATASET in
+ echo 'No dataset given'
No dataset given
+ exit

Please let me know if anyone knows about this.
Thank you

@CassieMai

@ram124 The script expects the dataset as its third argument. Please try running the following command:
$ ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 pascal_voc

@Ram-Godavarthi

@CassieMai Thanks for your reply.
I have my own dataset.
It is in
$ ./data/RAM_dataset/data
and under this I have
/Annotation files,
/Images,
/ImageSets.

How should I specify the input in the run command?

@Ram-Godavarthi

Hi @CassieMai,
I have solved the problem mentioned above.
But I am getting a new problem
while training the network on my own data.

I got this error.

What is the solution for this?

I0607 11:45:46.728519 2386 net.cpp:283] Network initialization done.
I0607 11:45:46.728889 2386 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from ./data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
I0607 11:45:48.512673 2386 net.cpp:816] Ignoring source layer data
F0607 11:45:48.516479 2386 net.cpp:829] Cannot copy param 0 weights from layer 'conv3'; shape mismatch. Source param shape is 384 256 3 3 (884736); target param shape is 512 256 3 3 (1179648). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

I have 2 classes.
I have images of size 512 * 512..

Please help if anyone knows about this.

@CassieMai

@ram124 It looks that you did not use parameters of ZF net to initialize your network. Have you downloaded a pre-trained model of ZF net?

@Ram-Godavarthi

@CassieMai
Yes.
I had used the VGG parameters.
But I have solved that now.

However, I am getting a different error now.

Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 4096 25088 (102760448); target param shape is 4096 18432 (75497472). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

What is this??
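
One possible reading of the shapes in these two "Cannot copy param" messages, offered as a sketch: the saved weights and the prototxt appear to describe different architectures. The factorizations below are plain arithmetic on the numbers in the logs. The 7x7 pooling matches roi_pool5 in the VGG16 prototxt quoted earlier in this thread; the 6x6 pooling for the target is an assumption. The helper names are made up.

```python
def fc6_inputs(channels, pooled_h, pooled_w):
    """Number of inputs to fc6: one value per channel per pooled cell."""
    return channels * pooled_h * pooled_w


def conv_weights(num_output, in_channels, kernel_h, kernel_w):
    """Number of weights in a convolution layer (bias excluded)."""
    return num_output * in_channels * kernel_h * kernel_w


# fc6 message: source 4096 x 25088, target 4096 x 18432
print(fc6_inputs(512, 7, 7))   # 25088, i.e. 512 channels pooled to 7x7
print(fc6_inputs(512, 6, 6))   # 18432, i.e. 512 channels pooled to 6x6 (assumed)

# conv3 message: source 384 256 3 3, target 512 256 3 3
print(conv_weights(384, 256, 3, 3))  # 884736
print(conv_weights(512, 256, 3, 3))  # 1179648
```

If that reading is right, the caffemodel passed with --weights and the train.prototxt named in the solver belong to different backbones, so checking that both paths point at the same network would be the first thing to try.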

@CassieMai

Which backbone network are you using, ZF or VGG? Did you run the right .sh script?

@Ram-Godavarthi

@CassieMai
VGG

@CassieMai

Maybe you can check your class num?

@Ram-Godavarthi

@CassieMai
I have 2 classes.
In cls_score: num_output: 3
In bbox_pred: num_output: 12

Should I change the image size in the dim fields below?
My images are 512 * 512 grayscale.

name: "VGG_CNN_M_1024"
input: "data"
input_shape {
dim: 1
dim: 3
dim: 224
dim: 224
}
input: "im_info"
input_shape {
dim: 1
dim: 3
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}

@Ram-Godavarthi

@CassieMai
I am able to train the network now.

It is running.

I0607 12:33:20.665174 2694 solver.cpp:229] Iteration 2900, loss = 0.411904
I0607 12:33:20.665235 2694 solver.cpp:245] Train net output #0: loss_bbox = 0.0577442 (* 1 = 0.0577442 loss)
I0607 12:33:20.665252 2694 solver.cpp:245] Train net output #1: loss_cls = 0.176116 (* 1 = 0.176116 loss)
I0607 12:33:20.665264 2694 solver.cpp:245] Train net output #2: rpn_cls_loss = 0.160761 (* 1 = 0.160761 loss)
I0607 12:33:20.665277 2694 solver.cpp:245] Train net output #3: rpn_loss_bbox = 0.0172822 (* 1 = 0.0172822 loss)
I0607 12:33:20.665290 2694 sgd_solver.cpp:106] Iteration 2900, lr = 0.001
I0607 12:33:23.360194 2694 solver.cpp:229] Iteration 2920, loss = 0.618303
I0607 12:33:23.360254 2694 solver.cpp:245] Train net output #0: loss_bbox = 0.188877 (* 1 = 0.188877 loss)
I0607 12:33:23.360271 2694 solver.cpp:245] Train net output #1: loss_cls = 0.336819 (* 1 = 0.336819 loss)
I0607 12:33:23.360285 2694 solver.cpp:245] Train net output #2: rpn_cls_loss = 0.0918402 (* 1 = 0.0918402 loss)
I0607 12:33:23.360297 2694 solver.cpp:245] Train net output #3: rpn_loss_bbox = 0.000766937 (* 1 = 0.000766937 loss)
I0607 12:33:23.360309 2694 sgd_solver.cpp:106] Iteration 2920, lr = 0.001

Where can I see the output?

How can I test it on another dataset?

How long does this training run?
This is the solver.prototxt:

train_net: "models/VGG16/faster_rcnn_end2end/train.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 50000
display: 20
average_loss: 100
momentum: 0.9
weight_decay: 0.0005

# We disable standard caffe solver snapshotting and implement our own snapshot
# function
snapshot: 0
# We still use the snapshot prefix, though
snapshot_prefix: "vgg_cnn_m_1024_faster_rcnn"

Please help.

@trungphan9x

trungphan9x commented Jun 12, 2018

@manipopopo @tiepnh Should I use a pretrained model whose classes are totally different from the classes of the model I will train?

@manipopopo

manipopopo commented Jun 13, 2018

Do Better ImageNet Models Transfer Better? carries out experiments on classification tasks. They show

ImageNet pretraining accelerates convergence and improves performance on
many datasets, but its value diminishes with greater training time, more training data, and greater divergence from ImageNet labels. For some fine-grained classification datasets, a few thousand
labeled examples, or a few dozen per class, are all that are needed to make training from scratch perform competitively with fine-tuning.

The effectiveness of transfer learning varies between datasets. Maybe you could try both (training from scratch and starting from pretrained models) and compare their performance on your validation dataset.

If you don't have enough resources for that, initializing weights from pretrained models seems to be a good choice.

@Ram-Godavarthi

@manipopopo
I have done training on 2 classes.
I am getting some good output as well.

But I am not able to display 2 objects in a single frame.
When I run demo.py, I get only 1 object per image even though there are 2 objects in the image. What is the problem?

Any help is really appreciated.

Thank you

@Ram-Godavarthi

@tiepnh @manipopopo @CassieMai @karaspd
Hello guys,
I have a question about detection.
I have 2 classes (3 including background).
Suppose I feed the network training images with only 1 object per image:
100 images contain only class A and another 100 images contain only class B,
so in total there are 200 images covering 2 classes.

Suppose I train on them.

After training, at test time,
if I give it images containing both classes (A and B),
will the network detect both objects simultaneously?

Or should I feed the network training images that contain both objects?

Please clarify this,
because I am getting only 1 object detected per image at test time (after training on the 1-object-per-image dataset).

Any help is really appreciated

@manipopopo

When i run demo.py. I am getting only 1 object per image even though there 2 objects located in the image,. What is the problem?

If i give images with both classes (A & B classes) in it.

i am getting only 1 object detected per image

See https://github.com/rbgirshick/py-faster-rcnn/blob/master/tools/demo.py#L90-L98
The loop visualizes one class at a time.

@Ram-Godavarthi

@manipopopo But how do I get 2 detections in the same image?
I have hundreds of images with 2 objects in them.
I want to detect all of them and save them somewhere. What should I change to make that happen?
Do you have any idea?

@manipopopo

You can save dets from all iterations.
It seems to me that the question is a little bit off-topic.
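
Saving dets from all iterations, as suggested above, could look like the following sketch. It is not code from demo.py: collect_detections is a made-up helper, the input is assumed to be the post-NMS rows [x1, y1, x2, y2, score] that each pass of the per-class loop produces (converted to plain lists), and the 0.8 threshold is only an example value.

```python
def collect_detections(dets_by_class, thresh=0.8):
    """Flatten per-class detections for one image into a single list.

    dets_by_class maps a class name to rows of [x1, y1, x2, y2, score].
    """
    results = []
    for cls, dets in dets_by_class.items():
        for x1, y1, x2, y2, score in dets:
            if score >= thresh:
                results.append(
                    {"class": cls, "bbox": [x1, y1, x2, y2], "score": score}
                )
    # Highest-scoring detections first, regardless of class.
    results.sort(key=lambda d: d["score"], reverse=True)
    return results


# Example with two classes found in the same image.
per_class = {
    "A": [[10, 10, 50, 50, 0.92], [12, 11, 48, 52, 0.30]],
    "B": [[60, 20, 90, 80, 0.95]],
}
for det in collect_detections(per_class):
    print(det["class"], det["bbox"], det["score"])
```

Inside the per-class loop one would store each class's rows in the dict instead of only visualizing them, then draw or save the combined list once per image.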

@Ram-Godavarthi

How do I actually do that?
I tried every solution I could think of,
but couldn't get it to work.
Can you share the part of the code that is required for this?

@Ram-Godavarthi

@tiepnh @manipopopo @CassieMai @karaspd I have a question regarding batch size. Can we use a batch size of more than 1 in mxnet-rcnn training?
I have a large dataset of 15000 images.
When I train on it, the speed is 2.35 samples/sec.
It takes almost 4 hours per epoch.
Is there any other way I could increase the speed?

Any help is really appreciated.

@zdgithub

zdgithub commented Mar 6, 2019

Hi @ram124, I have a similar problem:

Cannot copy param 0 weights from layer 'cls_score'; shape mismatch. Source param shape is 2 4096 (8192); target param shape is 21 4096 (86016). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

How did you solve it?

@zdgithub

zdgithub commented Mar 6, 2019

I have solved it.
I encountered it when running ./tools/demo.py, so I changed the num_output values in py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt, and it works.
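
As a sanity check on which num_output values to write, the relation between the class count and the two output layers can be sketched as below. head_sizes is a made-up helper; the relation itself (background counted as one extra class, four box coordinates per class) is consistent with the 21 / 84 values in the VOC train.prototxt and the 3 / 12 values quoted earlier in this thread.

```python
def head_sizes(num_object_classes):
    """num_output for cls_score and bbox_pred, counting background as a class."""
    num_classes = num_object_classes + 1  # +1 for background
    return {"cls_score": num_classes, "bbox_pred": 4 * num_classes}


print(head_sizes(20))  # PASCAL VOC: 20 object classes
print(head_sizes(2))   # the two-class dataset discussed in this thread
print(head_sizes(1))   # a single object class plus background
```

The test prototxt has to use the same values as the prototxt the model was trained with, otherwise the saved cls_score and bbox_pred weights cannot be copied.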

@LEXUSAPI

@tolry418 : You should follow comment of @manipopopo before.
Or you can try to use this prototxt(from #56) :https://github.com/rbgirshick/py-faster-rcnn/files/380443/train_val.txt
Base on that, you don't need use pre-trained network any more

It is a good way to train the net.
