Check failed: outer_num_ * inner_num_ == bottom[1]->count() #15

Timo-hab · 2016-09-10T23:14:45Z

I want to train the context module, but get the following error:

F0911 01:01:41.267956 15432 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (3276800 vs. 435600) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

I followed the documentation in "training.md". First, I trained the front-end module and then generated the .bin files (and the feats.txt) with test.py.
I use the following to start the training:

python train.py context \
--train_image /home/timo/dilation/feat/train/feats.txt \
--train_label /home/timo/Cityscapes/gtFine/train/train_city_gt.txt \
--test_image /home/timo/dilation/feat/val/feats.txt \
--test_label /home/timo/Cityscapes/gtFine/val/val_city_gt.txt \
--train_batch 100 \
--test_batch 10 \
--caffe /home/timo/dilation/caffe-dilation/build_master_release/tools/caffe \
--classes 19 \
--layers 10 \
--label_shape 66 66 \
--lr 0.0001 \
--momentum 0.99

I am grateful for every tip.


fyu · 2016-09-11T00:23:51Z

You have to change label_shape to the dimensions of the feature bin files.

Timo-hab · 2016-09-11T19:43:17Z

The dimensions of the feature bin files are (19, 128, 256).
So, when I change --label_shape to 128 256, the values in the error message change as follows:

F0911 21:38:15.349256 4092 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (348480 vs. 327680) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
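The two numbers in each check failure can be reproduced from the shapes involved. The following is a minimal sketch, assuming softmax axis 1 so that the prediction count is N*H*W; the function name `softmax_label_check` is illustrative and is not part of Caffe or the dilation code. The shapes used are the ones consistent with the batch sizes and sizes reported in this thread.

```python
def softmax_label_check(pred_shape, label_shape):
    """Return (outer_num * inner_num, label count) for softmax axis 1.

    pred_shape is (N, C, H, W); label_shape is (N, 1, H, W).
    """
    n, _c, h, w = pred_shape
    predictions = n * h * w  # one prediction per pixel per image
    labels = 1
    for dim in label_shape:
        labels *= dim
    return predictions, labels


# First failure: batch 100, 128 x 256 features, --label_shape 66 66
print(softmax_label_check((100, 19, 128, 256), (100, 1, 66, 66)))
# Second failure: batch 10, 132 x 264 features, --label_shape 128 256
print(softmax_label_check((10, 19, 132, 264), (10, 1, 128, 256)))
```

The first call gives (3276800, 435600) and the second gives (348480, 327680), matching the two error messages. The second pair suggests that the batch of 10 is fed 132 x 264 features while the batch of 100 is fed 128 x 256 features.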

Timo-hab · 2016-09-13T19:27:06Z

I am stuck at this point. After running train.py using:

python train.py context \
--train_image /home/timo/dilation/feat/train/feats.txt \
--train_label /home/timo/Cityscapes/gtFine/train/train_city_gt.txt \
--test_image /home/timo/dilation/feat/val/feats.txt \
--test_label /home/timo/Cityscapes/gtFine/val/val_city_gt.txt \
--train_batch 100 \
--test_batch 10 \
--caffe /home/timo/dilation/caffe-dilation/build_master_release/tools/caffe \
--classes 19 \
--layers 10 \
--label_shape 128 256 \
--lr 0.0001 \
--momentum 0.99

these lines appear:

I0913 21:16:07.667611 20652 layer_factory.hpp:77] Creating layer data
I0913 21:16:07.667631 20652 net.cpp:91] Creating Layer data
I0913 21:16:07.667639 20652 net.cpp:399] data -> data
I0913 21:16:07.667650 20652 net.cpp:399] data -> label
I0913 21:16:07.667660 20652 bin_label_data_layer.cpp:367] Opening bin list /home/timo/dilation/feat/val/feats.txt
I0913 21:16:07.667742 20652 bin_label_data_layer.cpp:376] Opening label list /home/timo/Cityscapes/gtFine/val/val_city_gt.txt
I0913 21:16:07.667809 20652 bin_label_data_layer.cpp:387] Shuffling data
I0913 21:16:07.667831 20652 bin_label_data_layer.cpp:392] A total of 233 images.
I0913 21:16:07.673760 20652 bin_label_data_layer.cpp:421] output data size: 10,19,132,264
I0913 21:16:07.673775 20652 bin_label_data_layer.cpp:425] output label size: 10,1,128,256

When I change the label_shape to 132 264, I get the following lines:

I0913 21:24:02.122853 20724 layer_factory.hpp:77] Creating layer data
I0913 21:24:02.122891 20724 net.cpp:91] Creating Layer data
I0913 21:24:02.122901 20724 net.cpp:399] data -> data
I0913 21:24:02.122927 20724 net.cpp:399] data -> label
I0913 21:24:02.122947 20724 bin_label_data_layer.cpp:367] Opening bin list /home/timo/dilation/feat/train/feats.txt
I0913 21:24:02.123760 20724 bin_label_data_layer.cpp:376] Opening label list /home/timo/Cityscapes/gtFine/train/train_city_gt.txt
I0913 21:24:02.124497 20724 bin_label_data_layer.cpp:387] Shuffling data
I0913 21:24:02.125047 20724 bin_label_data_layer.cpp:392] A total of 2975 images.
I0913 21:24:02.152776 20724 bin_label_data_layer.cpp:421] output data size: 100,19,128,256
I0913 21:24:02.152806 20724 bin_label_data_layer.cpp:425] output label size: 100,1,132,264

Curiously, the output data size has changed as well. What could be the reason for that?

fyu · 2016-09-13T19:39:57Z

In the log, it says your bin size is 132 by 264. So if you set the label_shape to the same dimension, it should solve your problem.
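Reading the sizes off the log can be automated. This is a rough sketch that assumes the log lines have exactly the "output data size" / "output label size" form quoted in this thread; `spatial_sizes` is a hypothetical helper, not something train.py provides.

```python
import re

SIZE_RE = re.compile(r"output (data|label) size: (\d+),(\d+),(\d+),(\d+)")


def spatial_sizes(log_text):
    """Map 'data' and 'label' to the (height, width) reported in the log."""
    sizes = {}
    for kind, _n, _c, h, w in SIZE_RE.findall(log_text):
        sizes[kind] = (int(h), int(w))
    return sizes


log = """\
I0913 21:16:07.673760 20652 bin_label_data_layer.cpp:421] output data size: 10,19,132,264
I0913 21:16:07.673775 20652 bin_label_data_layer.cpp:425] output label size: 10,1,128,256
"""

sizes = spatial_sizes(log)
if sizes["data"] != sizes["label"]:
    h, w = sizes["data"]
    print(f"mismatch: try --label_shape {h} {w}")
```

For the log excerpt above, the data size is 132 x 264 and the label size is 128 x 256, so the sketch suggests --label_shape 132 264.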

Timo-hab · 2016-09-13T19:58:07Z

After I posted this, I noticed that the two logs use different paths: the first one reads the val feature list and the second one reads the train feature list. Evidently the feature bin files for test and train had not been generated under the same conditions. After I regenerated the bin files (using test.py), the context training starts without errors. :)
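A quick way to spot this kind of inconsistency before training is to compare the byte sizes of the feature files named in both lists. This is only a heuristic sketch: it assumes each line of feats.txt is a path to one bin file that can be opened from the current directory, and that equal file sizes imply equal dimensions. The helper names are hypothetical.

```python
import os
from collections import Counter


def bin_size_histogram(list_path):
    """Count how many files named in list_path have each byte size."""
    sizes = Counter()
    with open(list_path) as f:
        for line in f:
            path = line.strip()
            if path:
                sizes[os.path.getsize(path)] += 1
    return sizes


def same_bin_sizes(*list_paths):
    """True if every file across all lists has the same byte size."""
    seen = set()
    for list_path in list_paths:
        seen.update(bin_size_histogram(list_path))
    return len(seen) == 1
```

For example, `same_bin_sizes("feat/train/feats.txt", "feat/val/feats.txt")` returning False would indicate that the train and val features were generated with different settings.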

huangh12 · 2017-04-03T10:55:10Z

@Timo-hab Hello, I am trying to train the front-end model myself, but the VGG caffemodel downloaded from the website is not fully convolutional. That is, fc6 and fc7 are fully connected layers, which leads to a shape mismatch problem.

Could you tell me how you got the weights to load successfully?
Thank you!

Timo-hab closed this as completed Sep 13, 2016
