
error in training mobilenet #6

Closed
kaishijeng opened this issue Dec 22, 2017 · 6 comments

@kaishijeng

@ruinmessi

I followed your instructions below to train on VOC with MobileNet, but got an error:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300
300 21
Traceback (most recent call last):
File "train_RFB.py", line 88, in <module>
net = build_net('train', img_dim, num_classes)
File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net
mbox[str(size)], num_classes), num_classes)
TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

Any idea why this happens?

Thanks,

@GOATmessi7
Owner

@kaishijeng I can't reproduce this error; the training script works fine in my environment (Anaconda3, Python 3.6). Could you give more details? Did you modify my code at all? And what is your Python environment?

@kaishijeng
Author

I didn't change your code; the previous error may have been because make.sh used python2, so I changed it to python3 in make.sh. Now I get the different error below with the following training command:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300 --basenet weights/mobilenet_feature.pth

RFBNet(
(base): ModuleList(
(0): Sequential(
(0): Conv2d (3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
)
(1): Sequential(
(0): Conv2dDepthwise (32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (32, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(2): Sequential(
(0): Conv2dDepthwise (64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(3): Sequential(
(0): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128, bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(4): Sequential(
(0): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=128, bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(5): Sequential(
(0): Conv2dDepthwise (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256, bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(6): Sequential(
(0): Conv2dDepthwise (256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=256, bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(7): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(8): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(9): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(10): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(11): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(12): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(13): Sequential(
(0): Conv2dDepthwise (1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1024, bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
)
(Norm): BasicRFB_a(
(branch0): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch1): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch2): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch3): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (64, 96, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(96, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (96, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(5, 5), dilation=(5, 5), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(ConvLinear): BasicConv(
(conv): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(relu): ReLU()
)
(extras): ModuleList(
(0): BasicRFB(
(branch1): Sequential(
(0): BasicConv(
(conv): Conv2d (1024, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 192, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (192, 192, kernel_size=(3, 1), stride=(2, 2), padding=(1, 0), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=192, bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch2): Sequential(
(0): BasicConv(
(conv): Conv2d (1024, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(5, 5), dilation=(5, 5), groups=192, bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
)
)
(ConvLinear): BasicConv(
(conv): Conv2d (384, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(shortcut): BasicConv(
(conv): Conv2d (1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(relu): ReLU()
)
(1): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicConv(
(conv): Conv2d (256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(4): BasicConv(
(conv): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(5): BasicConv(
(conv): Conv2d (256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(6): BasicConv(
(conv): Conv2d (64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
)
(loc): ModuleList(
(0): Conv2d (512, 24, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d (1024, 24, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d (512, 24, kernel_size=(1, 1), stride=(1, 1))
(3): Conv2d (256, 24, kernel_size=(1, 1), stride=(1, 1))
(4): Conv2d (256, 16, kernel_size=(1, 1), stride=(1, 1))
(5): Conv2d (128, 16, kernel_size=(1, 1), stride=(1, 1))
)
(conf): ModuleList(
(0): Conv2d (512, 126, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d (1024, 126, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d (512, 126, kernel_size=(1, 1), stride=(1, 1))
(3): Conv2d (256, 126, kernel_size=(1, 1), stride=(1, 1))
(4): Conv2d (256, 84, kernel_size=(1, 1), stride=(1, 1))
(5): Conv2d (128, 84, kernel_size=(1, 1), stride=(1, 1))
)
)
Loading base network...
Initializing weights...
Loading Dataset...
Training RFB_mobile on VOC0712
Traceback (most recent call last):
File "train_RFB.py", line 253, in <module>
train()
File "train_RFB.py", line 219, in train
loss_l, loss_c = criterion(out, priors, targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/fc/2TB/src/RFBNet/layers/modules/multibox_loss.py", line 96, in forward
loss_c[pos] = 0 # filter out pos boxes for now
RuntimeError: The shape of the mask [4, 2990] at index 0 does not match the shape of the indexed tensor [11960, 1] at index 0

@GOATmessi7
Owner

@kaishijeng It seems your PyTorch version doesn't support that indexing. Changing line 96 in multibox_loss.py from "loss_c[pos] = 0" to "loss_c[pos.view(-1)] = 0" should fix this incompatibility.
By the way, your mask shape is [4, 2990], which means your batch size is 4? For better performance, a batch size of at least 32 is recommended.
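The mismatch is easy to see in isolation: loss_c has been flattened to [batch * num_priors, 1] while the pos mask is still [batch, num_priors], so the masked assignment is rejected. A minimal sketch with the shapes from the traceback above (variable names follow multibox_loss.py):

```python
import torch

batch, num_priors = 4, 2990

# Shapes taken from the error message: loss_c was flattened, pos was not.
loss_c = torch.rand(batch * num_priors, 1)   # [11960, 1]
pos = torch.rand(batch, num_priors) > 0.5    # boolean mask, [4, 2990]

# loss_c[pos] = 0  # fails: mask [4, 2990] vs indexed tensor [11960, 1]

# Flattening the mask makes its leading dimension match the tensor's,
# so it selects whole rows of loss_c and zeroes them:
loss_c[pos.view(-1)] = 0

assert (loss_c[pos.view(-1)] == 0).all()
```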

@GOATmessi7
Owner

@kaishijeng If you still have trouble with the training script, update PyTorch to 0.3.0 and try again.

@kaishijeng
Author

@ruinmessi

It works now after changing line 96 in multibox_loss.py from "loss_c[pos] = 0" to "loss_c[pos.view(-1)] = 0".

Thanks

@liangxi627

@kaishijeng Hello, I changed python to python3 in make.sh and met the following error when running make.sh. Do you know how to solve it? Thank you!

running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
building 'nms.cpu_nms' extension
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/nms
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.5/dist-packages/numpy/core/include -I/usr/include/python3.5m -c nms/cpu_nms.c -o build/temp.linux-x86_64-3.5/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
nms/cpu_nms.c:4:20: fatal error: Python.h: No such file or directory
#include "Python.h"
^
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
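A missing Python.h almost always means the Python 3 development headers are not installed. A likely fix on Debian/Ubuntu (the exact package name depends on your Python minor version; 3.5 is assumed here to match the paths in the log):

```shell
# Python.h ships in the python3.X-dev package on Debian/Ubuntu;
# match the minor version of the interpreter that make.sh invokes.
sudo apt-get install python3.5-dev

# then rebuild the extension
sh make.sh
```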
