
error in training mobilenet #6

Closed
kaishijeng opened this issue Dec 22, 2017 · 6 comments

@kaishijeng

@ruinmessi

I followed your instructions below to train on VOC with MobileNet, but got an error:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300
300 21
Traceback (most recent call last):
File "train_RFB.py", line 88, in <module>
net = build_net('train', img_dim, num_classes)
File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net
mbox[str(size)], num_classes), num_classes)
TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

Any idea why this happens?

Thanks,

@GOATmessi7
Owner

@kaishijeng I can't reproduce this error; the training script works fine in my environment (Anaconda3, Python 3.6). Could you give more details? Did you modify my code at all? And what is your Python environment?

@kaishijeng
Author

I didn't change your code; the previous error may have been because make.sh used python2, so I changed it to python3 in make.sh. Now I get the different error below with the following training command:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300 --basenet weights/mobilenet_feature.pth

RFBNet(
(base): ModuleList(
(0): Sequential(
(0): Conv2d (3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
)
(1): Sequential(
(0): Conv2dDepthwise (32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
(1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (32, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(2): Sequential(
(0): Conv2dDepthwise (64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(3): Sequential(
(0): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128, bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(4): Sequential(
(0): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=128, bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(5): Sequential(
(0): Conv2dDepthwise (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=256, bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(6): Sequential(
(0): Conv2dDepthwise (256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=256, bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(7): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(8): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(9): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(10): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(11): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(12): Sequential(
(0): Conv2dDepthwise (512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=512, bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
(13): Sequential(
(0): Conv2dDepthwise (1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1024, bias=False)
(1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU(inplace)
(3): Conv2d (1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
(4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True)
(5): ReLU(inplace)
)
)
(Norm): BasicRFB_a(
(branch0): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch1): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch2): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 128, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch3): Sequential(
(0): BasicConv(
(conv): Conv2d (512, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (64, 96, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(96, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (96, 128, kernel_size=(3, 1), stride=(1, 1), padding=(1, 0), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(5, 5), dilation=(5, 5), groups=128, bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
)
)
(ConvLinear): BasicConv(
(conv): Conv2d (512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(relu): ReLU()
)
(extras): ModuleList(
(0): BasicRFB(
(branch1): Sequential(
(0): BasicConv(
(conv): Conv2d (1024, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 192, kernel_size=(1, 3), stride=(1, 1), padding=(0, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (192, 192, kernel_size=(3, 1), stride=(2, 2), padding=(1, 0), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(3, 3), dilation=(3, 3), groups=192, bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
)
)
(branch2): Sequential(
(0): BasicConv(
(conv): Conv2d (1024, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(1): BasicConv(
(conv): Conv2d (128, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicSepConv(
(conv): Conv2dDepthwise (192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(5, 5), dilation=(5, 5), groups=192, bias=False)
(bn): BatchNorm2d(192, eps=1e-05, momentum=0.01, affine=True)
)
)
(ConvLinear): BasicConv(
(conv): Conv2d (384, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(shortcut): BasicConv(
(conv): Conv2d (1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(bn): BatchNorm2d(512, eps=1e-05, momentum=0.01, affine=True)
)
(relu): ReLU()
)
(1): BasicConv(
(conv): Conv2d (512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(2): BasicConv(
(conv): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(3): BasicConv(
(conv): Conv2d (256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(4): BasicConv(
(conv): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(256, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(5): BasicConv(
(conv): Conv2d (256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(64, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
(6): BasicConv(
(conv): Conv2d (64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.01, affine=True)
(relu): ReLU(inplace)
)
)
(loc): ModuleList(
(0): Conv2d (512, 24, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d (1024, 24, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d (512, 24, kernel_size=(1, 1), stride=(1, 1))
(3): Conv2d (256, 24, kernel_size=(1, 1), stride=(1, 1))
(4): Conv2d (256, 16, kernel_size=(1, 1), stride=(1, 1))
(5): Conv2d (128, 16, kernel_size=(1, 1), stride=(1, 1))
)
(conf): ModuleList(
(0): Conv2d (512, 126, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d (1024, 126, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d (512, 126, kernel_size=(1, 1), stride=(1, 1))
(3): Conv2d (256, 126, kernel_size=(1, 1), stride=(1, 1))
(4): Conv2d (256, 84, kernel_size=(1, 1), stride=(1, 1))
(5): Conv2d (128, 84, kernel_size=(1, 1), stride=(1, 1))
)
)
Loading base network...
Initializing weights...
Loading Dataset...
Training RFB_mobile on VOC0712
Traceback (most recent call last):
File "train_RFB.py", line 253, in <module>
train()
File "train_RFB.py", line 219, in train
loss_l, loss_c = criterion(out, priors, targets)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/fc/2TB/src/RFBNet/layers/modules/multibox_loss.py", line 96, in forward
loss_c[pos] = 0 # filter out pos boxes for now
RuntimeError: The shape of the mask [4, 2990] at index 0 does not match the shape of the indexed tensor [11960, 1] at index 0

@GOATmessi7
Owner

@kaishijeng It seems your PyTorch version doesn't support that indexing. Changing line 96 in multibox_loss.py from "loss_c[pos] = 0" to "loss_c[pos.view(-1)] = 0" should fix this incompatibility.
By the way, your mask shape is [4, 2990], which means your batch size is 4? For better performance, a batch size of at least 32 is recommended.
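The mismatch is easy to see in isolation: loss_c has been flattened to [batch * num_priors, 1] while the pos mask is still [batch, num_priors], so the masked assignment is rejected. A minimal sketch with the shapes from the traceback above (variable names follow multibox_loss.py):

```python
import torch

batch, num_priors = 4, 2990

# Shapes taken from the error message: loss_c was flattened, pos was not.
loss_c = torch.rand(batch * num_priors, 1)   # [11960, 1]
pos = torch.rand(batch, num_priors) > 0.5    # boolean mask, [4, 2990]

# loss_c[pos] = 0  # fails: mask [4, 2990] vs indexed tensor [11960, 1]

# Flattening the mask makes its leading dimension match the tensor's,
# so it selects whole rows of loss_c and zeroes them:
loss_c[pos.view(-1)] = 0

assert (loss_c[pos.view(-1)] == 0).all()
```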

@GOATmessi7
Owner

@kaishijeng If you still have trouble with the training script, update PyTorch to 0.3.0 and try again.

@kaishijeng
Author

@ruinmessi

It works now after changing line 96 in multibox_loss.py from "loss_c[pos] = 0" to "loss_c[pos.view(-1)] = 0".

Thanks

@liangxi627

@kaishijeng Hello, I changed python to python3 in make.sh and met the following error when running make.sh. Do you know how to solve it? Thank you!

running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
building 'nms.cpu_nms' extension
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/nms
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.5/dist-packages/numpy/core/include -I/usr/include/python3.5m -c nms/cpu_nms.c -o build/temp.linux-x86_64-3.5/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
nms/cpu_nms.c:4:20: fatal error: Python.h: No such file or directory
#include "Python.h"
^
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
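A missing Python.h almost always means the Python 3 development headers are not installed. A likely fix on Debian/Ubuntu (the exact package name depends on your Python minor version; 3.5 is assumed here to match the paths in the log):

```shell
# Python.h ships in the python3.X-dev package on Debian/Ubuntu;
# match the minor version of the interpreter that make.sh invokes.
sudo apt-get install python3.5-dev

# then rebuild the extension
sh make.sh
```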
