Runtime error occurs when I train on my own data #13
Comments
trainer.py line 147
you may print
I'll close it for now; feel free to reopen it if you have any questions.
I have the same error when I try to train on my own dataset. Would you please share how you solved this problem? Thanks.
@chenyuntc I tried to print roi_cls_loc and gt_roi_label, but it prints nothing.
I have the same problem too = =
When I run python3 train.py train, I get the following output:
======user config========
{'caffe_pretrain': False,
'caffe_pretrain_path': '/home/garcons/simple-faster-rcnn-pytorch/fasterrcnn_12211511_0.701052458187_torchvision_pretrain.pth',
'data': 'voc',
'debug_file': '/tmp/debugf',
'env': 'faster-rcnn',
'epoch': 14,
'load_path': None,
'lr': 0.001,
'lr_decay': 0.1,
'max_size': 1000,
'min_size': 400,
'num_workers': 4,
'plot_every': 40,
'port': 8097,
'pretrained_model': 'vgg16',
'roi_sigma': 1.0,
'rpn_sigma': 3.0,
'test_num': 1000,
'test_num_workers': 4,
'use_adam': False,
'use_chainer': False,
'use_drop': False,
'voc_data_dir': '/home/garcons/simple-faster-rcnn-pytorch/garconsdata/',
'weight_decay': 0.0005}
==========end============
load data
model construct completed
0it [00:00, ?it/s]/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [32,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [33,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
...
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:382: long calculateOffset(IndexType, LinearIndexCalcData<IndexType, Dims>) [with IndexType = unsigned int, Dims = 3U]: block: [0,0,0], thread: [31,0,0] Assertion `indexAtDim < data.baseSizes[dim]` failed.
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu line=648 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "train.py", line 130, in <module>
    fire.Fire()
  File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/root/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "train.py", line 80, in train
    trainer.train_step(img, bbox, label, scale)
  File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 168, in train_step
    losses = self.forward(imgs, bboxes, labels, scale)
  File "/home/garcons/simple-faster-rcnn-pytorch/trainer.py", line 147, in forward
    at.totensor(gt_roi_label).long()]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 78, in __getitem__
    return Index.apply(self, key)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 87, in forward
    result = i.index(ctx.index)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCTensorIndex.cu:648
This error occurs when I train with my own data.
With the pretrained model and the VOC2007 dataset, there was no error like this.
I tried CUDA_LAUNCH_BLOCKING=1 python3 train.py train, but it did not help.
How can I fix this error?
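
The assertion in the log, indexAtDim < data.baseSizes[dim], fires when an index reaches or exceeds the size of the dimension it indexes. Assuming line 147 of trainer.py uses gt_roi_label to index the class dimension of roi_cls_loc (the traceback suggests this, but the full statement is not shown here), one thing worth checking is whether every label coming from the custom dataset is smaller than the number of classes the model was built with. Below is a minimal sketch in plain Python; check_label_range and N_CLASS = 21 are hypothetical names and values chosen for illustration, not part of the repository.

```python
from collections import Counter


def check_label_range(labels, n_class):
    """Return the sorted distinct labels, or raise if any cannot index n_class slots.

    labels  : iterable of integer class indices (hypothetical input, e.g. the
              values of gt_roi_label converted to a plain Python list)
    n_class : size of the class dimension being indexed (assumed here to
              include the background class)
    """
    labels = [int(x) for x in labels]
    # Count every label outside [0, n_class). Negative values are treated as
    # invalid too, which is a conservative assumption for this check.
    bad = Counter(x for x in labels if not 0 <= x < n_class)
    if bad:
        raise ValueError(
            "labels outside [0, %d): %s" % (n_class, dict(sorted(bad.items())))
        )
    return sorted(set(labels))


if __name__ == "__main__":
    N_CLASS = 21  # assumption: 20 VOC foreground classes + 1 background

    # All labels fit, so the distinct values are returned.
    print(check_label_range([0, 3, 20, 7], N_CLASS))

    # Labels 21 and 25 do not fit; the error lists each one with its count.
    try:
        check_label_range([0, 3, 21, 25, 21], N_CLASS)
    except ValueError as err:
        print(err)
```

Running a check like this on the CPU side, before the indexing step, turns an out-of-range label into an ordinary Python exception that names the offending values, instead of a device-side assert.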