Hello, I used your code to train a model. However, training terminates after the first iteration.
Could you please help me find the problem?
Thank you.
Here is my traceback:
[session 1][epoch 1][iter 0] loss: 4.0006, lr: 1.00e-02
fg/bg=(128/384), time cost: 7.218862
rpn_cls: 0.6919, rpn_box: 0.1386, rcnn_cls: 2.8319, rcnn_box 0.3382
Traceback (most recent call last):
File "trainval_net.py", line 330, in <module>
roi_labels = FPN(im_data, im_info, gt_boxes, num_boxes)
File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 73, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 83, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
raise output
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at /opt/conda/conda-bld/pytorch_1518238409320/work/torch/lib/THC/generic/THCTensorScatterGather.cu:29
I found that the code runs normally with Faster R-CNN, but it fails when I use the FPN code. So I guess the problem is in fpn.py, but I still can't work out why.
What's more, I was using this model to train on my own data. If I switch the data back to the original VOC2007, it works. That's strange, because all I did was convert my own data into the VOC2007 format.
Here is one of my annotation files:
I have solved the problem by downloading the whole PASCAL VOC data set and swapping my data into it, instead of using my own data directly.
Interestingly, I think your code is based on jwyang's. With this data-swapping approach I can train successfully with your code, but the same approach still doesn't work with jwyang's code. So, would you mind telling me whether you changed any code related to reading data from the data set?
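Since the failure only shows up with the custom data, one thing that may be worth ruling out is a malformed annotation. The cause of the error above was never confirmed in this thread, so the following is only a minimal sketch of a sanity check, not a diagnosis. It assumes VOC-style XML with 1-based box coordinates; the function name `check_voc_annotation` is hypothetical, and `VOC_CLASSES` is assumed to be the 20 standard PASCAL VOC class names, which you would replace with your own label set.

```python
import xml.etree.ElementTree as ET

# Assumption: the 20 standard PASCAL VOC class names. Replace with your own labels.
VOC_CLASSES = (
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
)


def check_voc_annotation(xml_text, classes=VOC_CLASSES):
    """Return a list of problems found in one VOC-style annotation (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)

    size = root.find("size")
    if size is None:
        return ["missing <size> element"]
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    objects = root.findall("object")
    if not objects:
        problems.append("no <object> entries")

    for i, obj in enumerate(objects):
        # Class names are compared exactly (after trimming whitespace).
        name = (obj.findtext("name") or "").strip()
        if name not in classes:
            problems.append("object %d: unknown class %r" % (i, name))

        box = obj.find("bndbox")
        if box is None:
            problems.append("object %d: missing <bndbox>" % i)
            continue

        # Assumes all four coordinate tags are present inside <bndbox>.
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin < 1 or ymin < 1:
            problems.append("object %d: coordinate below 1 (VOC boxes are 1-based)" % i)
        if xmax > width or ymax > height:
            problems.append("object %d: box extends beyond the image" % i)
        if xmin >= xmax or ymin >= ymax:
            problems.append("object %d: box has non-positive width or height" % i)

    return problems
```

Running this over every XML file in the Annotations folder and printing the non-empty results should show whether any file has labels or boxes that differ from what the original VOC2007 files contain.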