training error after 80000 iteration, only get zf_rpn_stage1_iter_80000.caffemodel and zf_rpn_stage1_iter_80000_proposals.pkl #626

Closed
nanzhixiong opened this issue Jul 6, 2017 · 13 comments

Comments

@nanzhixiong

I am training a ZF model on VOC 2007 and get the error "TypeError: 'numpy.float64' object cannot be interpreted as an index" after 80000 iterations (once zf_rpn_stage1_iter_80000.caffemodel and zf_rpn_stage1_iter_80000_proposals.pkl have been written).

I found that the problem is caused by numpy, so I installed numpy 1.11 with "sudo pip install -U numpy==1.11.0". However, that leads to another error: "ImportError: numpy.core.multiarray failed to import". This error can be resolved with "pip install -U numpy", which uninstalls numpy 1.11 and installs numpy 1.13.

The most annoying part is that numpy 1.13 again results in the error "TypeError: 'numpy.float64' object cannot be interpreted as an index" after 80000 iterations.

Has anyone met the same problem? How can I solve it? Thanks.

@eugene123tw

Can you provide the whole error logs?
Otherwise it's hard to know where your problem is.

@nanzhixiong
Author

@eugene123tw

The error is as follows:

Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 195, in train_fast_rcnn
    max_iters=max_iters)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 161, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 102, in train_model
    self.solver.step(1)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 144, in forward
    blobs = self._get_next_minibatch()
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 63, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 55, in get_minibatch
    num_classes)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 100, in _sample_rois
    fg_inds, size=fg_rois_per_this_image, replace=False)
  File "mtrand.pyx", line 1187, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18864)
TypeError: 'numpy.float64' object cannot be interpreted as an index

@eugene123tw

Try changing line 100 in lib/roi_data_layer/minibatch.py

from: fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False)

to: fg_inds = npr.choice(fg_inds, size=int(fg_rois_per_this_image), replace=False)
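
For illustration, here is a minimal sketch of the same idea outside the training code. The helper name sample_fg and the toy inputs are hypothetical and not part of py-faster-rcnn; the point is only that the size passed to npr.choice is cast to int, while the index array itself is left untouched.

```python
import numpy as np
import numpy.random as npr


def sample_fg(fg_inds, fg_rois_per_image):
    """Hypothetical helper: sample foreground indices without replacement.

    fg_rois_per_image may arrive as a numpy.float64 (for example from
    np.round), so the sample size is cast to a plain int before use.
    """
    fg_rois_per_this_image = int(min(fg_rois_per_image, fg_inds.size))
    if fg_inds.size > 0:
        fg_inds = npr.choice(
            fg_inds, size=fg_rois_per_this_image, replace=False)
    return fg_inds


# Toy usage: 10 candidate indices, a float-valued quota of 4.0.
picked = sample_fg(np.arange(10), np.round(0.25 * 16))
```

Only the size argument needs the cast; wrapping the index array in int() would fail for arrays with more than one element.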

@nanzhixiong
Author

@eugene123tw thank you very much for your kindness and patience. I followed your suggestion and that error is gone. However, another error appears:

Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./tools/train_faster_rcnn_alt_opt.py", line 195, in train_fast_rcnn
    max_iters=max_iters)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 161, in train_net
    model_paths = sw.train_model(max_iters)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 102, in train_model
    self.solver.step(1)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 144, in forward
    blobs = self._get_next_minibatch()
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 63, in _get_next_minibatch
    return get_minibatch(minibatch_db, self._num_classes)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 55, in get_minibatch
    num_classes)
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 120, in _sample_rois
    labels[fg_rois_per_this_image:] = 0
TypeError: slice indices must be integers or None or have an __index__ method

I am looking into it now.
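
The second traceback can be reproduced in isolation. The sketch below is a hedged illustration with made-up values (the labels array and the 0.25 * 8 quota are assumptions, not taken from the training run): a numpy.float64 used as a slice bound is rejected by recent NumPy, and casting it to int makes the slice valid.

```python
import numpy as np

labels = np.arange(1, 9)                     # stand-in for per-ROI class labels
fg_rois_per_this_image = np.round(0.25 * 8)  # numpy.float64(2.0), not an int

try:
    labels[fg_rois_per_this_image:] = 0      # float used as a slice bound
    float_slice_failed = False
except (TypeError, IndexError):
    float_slice_failed = True

# Casting the bound to int makes the slice valid regardless of NumPy version.
labels[int(fg_rois_per_this_image):] = 0
```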

@nanzhixiong
Author

@eugene123tw I searched for some material, which suggests the problem is caused by numpy
and that installing numpy 1.11.0 solves it.
Therefore I installed it: sudo pip install -U numpy==1.11.0

However, the following error appears after installing numpy 1.11.0:
Traceback (most recent call last):
  File "./tools/train_faster_rcnn_alt_opt.py", line 19, in <module>
    from datasets.factory import get_imdb
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/datasets/factory.py", line 13, in <module>
    from datasets.coco import coco
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/datasets/coco.py", line 20, in <module>
    from pycocotools.coco import COCO
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/pycocotools/coco.py", line 58, in <module>
    import mask
  File "/home/nan/faster rcnn/py-faster-rcnn/tools/../lib/pycocotools/mask.py", line 3, in <module>
    import pycocotools._mask as _mask
  File "pycocotools/_mask.pyx", line 20, in init pycocotools._mask (pycocotools/_mask.c:11197)
  File "__init__.pxd", line 989, in numpy.import_array (pycocotools/_mask.c:10030)
ImportError: numpy.core.multiarray failed to import

@eugene123tw

I setup Faster-RCNN using Python 2.7 and followed the installation guide.

Can you provide me your setup?
Like the version of Python you are using.

@eugene123tw

Did you install cython?

@nanzhixiong
Author

@eugene123tw Cython is installed. My configuration is Ubuntu 16.04, CUDA 8.0, cuDNN 5.1, Python 2.7. My setup combines multiple installation guides. I find that the main issue is numpy; the code does not seem to work well with numpy 1.13.1. @rbgirshick

@nanzhixiong
Author

@eugene123tw thank you very much. I found a possible error in Makefile.config.
I changed the following line
PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/lib/python2.7/dist-packages/numpy/core/include
to
PYTHON_INCLUDE := /usr/include/python2.7 \
		/usr/lib/python2.7/dist-packages/numpy/core/include/numpy
I am trying it now.
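
As a hedged cross-check on the include path, NumPy can report the header directory it expects compilers to use through np.get_include(). C code normally includes the headers as numpy/arrayobject.h, so the path handed to the compiler is usually the directory that contains the numpy folder. Whether that matches either PYTHON_INCLUDE value above depends on the installation, so this is only a diagnostic sketch.

```python
import os

import numpy as np

# Directory NumPy recommends passing to the compiler with -I.
include_dir = np.get_include()
# Header that C extensions typically include as <numpy/arrayobject.h>.
header = os.path.join(include_dir, "numpy", "arrayobject.h")

print("NumPy version: %s" % np.__version__)
print("NumPy include dir: %s" % include_dir)
print("arrayobject.h present: %s" % os.path.exists(header))
```

Running this with the same Python interpreter that Caffe is built against shows which NumPy installation, and which headers, that interpreter actually uses.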

You are really helpful. My QQ number is 729887877

@eugene123tw

@nanzhixiong I am glad it helped you. If you think the issue is solved, you can close it. Sorry, I don't have QQ. But you can add my LinkedIn or Facebook.

@nanzhixiong
Author

The problem is solved with the help of @eugene123tw.
Here is the solution.
Configuration:
Ubuntu 16.04, cuDNN 5.1, OpenCV 3.2, CUDA 8.0.

Solution:
Edit lib/roi_data_layer/minibatch.py and change the following lines.

Line 55:
    for im_i in xrange(num_images):
        labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \
            = _sample_rois(roidb[im_i], int(fg_rois_per_image), int(rois_per_image),
                           int(num_classes))

Line 98:
    if fg_inds.size > 0:
        fg_inds = npr.choice(
            fg_inds, size=int(fg_rois_per_this_image), replace=False)

Line 110:
    if bg_inds.size > 0:
        bg_inds = npr.choice(
            bg_inds, size=int(bg_rois_per_this_image), replace=False)

Line 124:
    bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(
        roidb['bbox_targets'][keep_inds, :], int(num_classes))

After line 175, add:
    start = int(start)
    end = int(end)

Potential cause:
According to some material I found, the error is caused by numpy.
Versions newer than 1.11.0 no longer accept a float where an integer index or size is expected, so the int(*) casts are needed.

When "TypeError: 'numpy.float64' object cannot be interpreted as an index" appears,
some people advise installing numpy 1.11.0,
which in my case led to "ImportError: numpy.core.multiarray failed to import".
Running "sudo pip install -U numpy" installs a working version of numpy again.
My version is numpy 1.13.1.
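
The casts above can also be applied once, where the float values originate, instead of at every call site. The sketch below is an assumption-laden illustration, not the repository's code: the helper names rois_per_image_counts and expand_bbox_targets are hypothetical, and it assumes the float comes from np.round applied to a fraction of the batch size and from class labels stored in a float array.

```python
import numpy as np


def rois_per_image_counts(batch_size, num_images, fg_fraction):
    """Hypothetical helper: per-image ROI counts as plain Python ints."""
    rois_per_image = batch_size // num_images
    # np.round returns numpy.float64, so cast once here.
    fg_rois_per_image = int(np.round(fg_fraction * rois_per_image))
    return rois_per_image, fg_rois_per_image


def expand_bbox_targets(bbox_target_data, num_classes):
    """Hypothetical stand-in for _get_bbox_regression_labels.

    Column 0 holds the class label, columns 1..4 hold the box targets.
    Each row's targets are copied into the 4-column slot of its class.
    """
    clss = bbox_target_data[:, 0]
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    for ind in np.where(clss > 0)[0]:
        start = int(4 * clss[ind])  # clss is a float array, so cast
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
    return bbox_targets


counts = rois_per_image_counts(128, 2, 0.25)
data = np.array([[2, 0.1, 0.2, 0.3, 0.4],
                 [0, 0.0, 0.0, 0.0, 0.0]], dtype=np.float32)
targets = expand_bbox_targets(data, num_classes=3)
```

Casting at the source keeps the rest of the sampling code unchanged, at the cost of touching the line that computes the quota rather than the lines listed above.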

@Xinying666

I modified it according to your method. However, the following error appears: TypeError: slice indices must be integers or None or have an __index__ method

@JFishLover

Completely right, @nanzhixiong, thanks.
