[Fast-RCNN] Proposal operator failed: too many resources requested for launch #6204
Description
Environment info
Operating System: Ubuntu 16.04
Compiler: GCC 5.4; cuDNN 5.1, CUDA 8.0.44, NVIDIA driver 375.39 (GTX 980 Ti)
Package used (Python/R/Scala/Julia): Python
MXNet version: 0.9.5
MXNet commit hash (git rev-parse HEAD):
Python version and distribution: Python 2.7.12
Error Message:
Called with argument: Namespace(begin_epoch=7, dataset='PascalVOC', dataset_path='data/VOCdevkit', end_epoch=20, frequent=20, gpus='0', image_set='2007_trainval', kvstore='device', lr=0.001, lr_step='20', network='vgg', no_flip=False, no_shuffle=False, prefix='model/mx95', pretrained='model/vgg', pretrained_epoch=7, resume=False, root_path='data', work_load_list=None)
{'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1', 'conv2'],
'FIXED_PARAMS_SHARED': ['conv1', 'conv2', 'conv3', 'conv4', 'conv5'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'NUM_CLASSES': 21,
'PIXEL_MEANS': array([ 103.939, 116.779, 123.68 ]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'SCALES': [(600, 1000)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': True,
'HAS_RPN': False,
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 16,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 16,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000},
'TRAIN': {'ASPECT_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': True,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_WEIGHTS': array([ 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'CXX_PROPOSAL': True,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 16,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000}}
num_images 472
voc_2007_trainval gt roidb loaded from data/cache/voc_2007_trainval_gt_roidb.pkl
append flipped images to roidb
filtered 0 roidb entries: 944 -> 944
providing maximum shape [('data', (1, 3, 600, 1000)), ('gt_boxes', (1, 100, 5))] [('label', (1, 20646)), ('bbox_target', (1, 36, 37, 62)), ('bbox_weight', (1, 36, 37, 62))]
output shape
{'bbox_loss_reshape_output': (1L, 128L, 84L),
'blockgrad0_output': (1L, 128L),
'cls_prob_reshape_output': (1L, 128L, 21L),
'rpn_bbox_loss_output': (1L, 36L, 37L, 38L),
'rpn_cls_prob_output': (1L, 2L, 333L, 38L)}
lr 0.001 lr_epoch_diff [13] lr_iters [12272]
[15:47:47] ....../src/operator/././cudnn_algoreg-inl.h:65: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
[15:47:56] ....../dmlc-core/include/dmlc/././logging.h:304: [15:47:56] ....../src/operator/contrib/proposal.cu:476: Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3f) [0x7f3c727f0a99]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/libmxnet.so(_ZN5mxnet2op13ProposalGPUOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESD_SD+0x17dd) [0x7f3c73add499]
........
........
........
Traceback (most recent call last):
File "train_end2end.py", line 185, in
main()
File "train_end2end.py", line 182, in main
lr=args.lr, lr_step=args.lr_step)
File "train_end2end.py", line 144, in train_net
arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/module/base_module.py", line 472, in fit
self.forward_backward(data_batch)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/module/base_module.py", line 193, in forward_backward
self.forward(data_batch, is_train=True)
File "....../example/rcnn/rcnn/core/module.py", line 190, in forward
self._curr_module.forward(data_batch, is_train=is_train)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/module/module.py", line 538, in forward
self.exec_group.forward(data_batch, is_train)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/module/executor_group.py", line 379, in forward
exec.forward(is_train=is_train)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/executor.py", line 133, in forward
ctypes.c_int(int(is_train))))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.5-py2.7.egg/mxnet/base.py", line 84, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:50:32] ......../src/operator/contrib/proposal.cu:476: Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch
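For context on the failure itself: on CUDA 8, error code 7 is cudaErrorLaunchOutOfResources, i.e. the kernel launch at proposal.cu:476 requests more threads per block (or more registers per block) than the device can supply. A minimal diagnostic sketch for comparing a compiled kernel's limits against the device limits; `dummy_kernel` is a hypothetical stand-in, not the kernel in proposal.cu:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the kernel launched at proposal.cu:476.
__global__ void dummy_kernel(float *out) {
  out[threadIdx.x] = static_cast<float>(threadIdx.x);
}

int main() {
  cudaDeviceProp prop;
  cudaGetDeviceProperties(&prop, 0);

  cudaFuncAttributes attr;
  cudaFuncGetAttributes(&attr, dummy_kernel);

  // cudaErrorLaunchOutOfResources (7) fires when the requested block
  // size exceeds attr.maxThreadsPerBlock, which the compiler derives
  // from the kernel's register usage and the device's register file.
  printf("device maxThreadsPerBlock : %d\n", prop.maxThreadsPerBlock);
  printf("device regs per block     : %d\n", prop.regsPerBlock);
  printf("kernel regs per thread    : %d\n", attr.numRegs);
  printf("kernel maxThreadsPerBlock : %d\n", attr.maxThreadsPerBlock);
  return 0;
}
```

If attr.maxThreadsPerBlock for the failing kernel comes out below the block size used in the launch, the launch fails with exactly this error.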
Minimum reproducible example
example/rcnn
Steps to reproduce
- pwd: .../example/rcnn
- cmdline: python train_end2end.py --pretrained model/vgg --pretrained_epoch 7 --prefix model/mx95 --begin_epoch 7 --end_epoch 20 --lr_step 20 --gpus 0
What have you tried to solve it?
- Tried multiple versions (the 0.9.3 and 0.9.5 series); the failure is the same.
- Set kMaxThreadsPerBlock (tensor_gpu-inl.cuh) to 512, which causes a different error (see the sketch after this list).
- Any hints?
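Rather than changing the global kMaxThreadsPerBlock constant, which alters the launch geometry of every kernel at once and may be why a different error surfaces elsewhere, a per-launch clamp on the failing kernel is usually safer: cap threads-per-block at what the compiled kernel actually supports and recompute the grid size so coverage is unchanged. A minimal sketch under that assumption (`dummy_kernel` and `safe_launch` are illustrative names, not the actual proposal.cu code):

```cuda
#include <algorithm>
#include <cuda_runtime.h>

// Illustrative element-wise kernel; NOT the actual proposal.cu kernel.
__global__ void dummy_kernel(float *data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= 2.0f;
}

// Launch helper: cap the block size at the compiled kernel's limit
// instead of assuming a fixed global kMaxThreadsPerBlock.
void safe_launch(float *data, int n) {
  cudaFuncAttributes attr;
  cudaFuncGetAttributes(&attr, dummy_kernel);

  int threads = std::min(512, attr.maxThreadsPerBlock);
  int blocks  = (n + threads - 1) / threads;  // grid grows to keep coverage
  dummy_kernel<<<blocks, threads>>>(data, n);
}
```

The point of the clamp is that register-heavy kernels (the Proposal op's NMS-style kernels plausibly qualify) can have a per-kernel thread limit well below the device's nominal 1024, so a hard-coded block size that works for light kernels fails for heavy ones.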