for issue#339 #655

imwxc · 2021-04-01T10:45:14Z

An augmentation may cause the target tensor to become empty (tensor([])). My solution is to comment out the Affine augmentation, which fixes the bug.

Closes #339

Flova · 2021-04-02T15:17:39Z

But the Affine is a crucial part of the data augmentation process. I also don't understand how this relates to issues with negative data. Or are you using positive data where the augmentation moves all boxes out of the image, so that the image effectively becomes a negative sample? Negative data is also quite uncommon in this context: https://stackoverflow.com/questions/55202727/yolo-object-detection-include-images-that-do-not-contain-classes-to-be-predicte.

imwxc · 2021-04-02T15:51:25Z

Thanks for your advice.

I checked my dataset and found that the images causing the problem have some boxes close to the image edge, so I tried changing the translate_percent parameter from (-0.2, 0.2) to (-0.05, 0.05), and that also fixed the problem.

Flova · 2021-04-02T23:41:49Z

Ah okay, this supports the hypothesis that we turn these images into negative samples by moving the boxes out of the image. Thank you for your troubleshooting. Now we need to fix the issue that training fails on negative samples. Could you provide a complete stack trace of the error in the target building? The one in the issue is a bit short and outdated.

Flova · 2021-04-05T13:19:35Z

I ran a few trials and was not able to reproduce this issue with the current master. I do get a tensor([], size=(0, 6)) tensor as the target, but it doesn't cause an exception. Maybe you are on an older version of this repo. Could you send me the commit you're on?

imwxc · 2021-04-05T13:33:48Z

Sorry for the late reply. I tried to get the original stack trace, but, maybe because I updated PyTorch, the traceback now looks like this:

Traceback (most recent call last):
File "train.py", line 109, in
loss, outputs = model(imgs, targets)
File "D:\ProgramData\Anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\Graduation_Project\YOLOv3-forusing\models.py", line 274, in forward
yolo_outputs = to_cpu(torch.cat(yolo_outputs, 1))
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

Flova · 2021-04-05T13:47:23Z

Did you modify any parts of the code, the .cfg, or something similar? This trace indicates that your network has no YOLO layers, which is obviously a problem.

imwxc · 2021-04-05T15:12:14Z

Thanks for your advice. I checked my code and ran my task again. Here is the stack trace for the issue:

targets: tensor([], device='cuda:0', size=(0, 6))

imgs: tensor([[[[0.0000, 0.0000, 0.0000, ..., 0.9412, 0.9882, 0.9882],
[0.0000, 0.0000, 0.0000, ..., 0.5098, 0.9961, 0.9882],
[0.0000, 0.0000, 0.0000, ..., 0.9725, 0.9686, 0.9725],
...,
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]]]],
device='cuda:0')

Traceback (most recent call last):
File "train.py", line 110, in
loss, outputs = model(imgs, targets)
File "D:\ProgramData\Anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\Graduation_Project\YOLOv3-forusing\models.py", line 270, in forward
x, layer_loss = module[0](x, targets, img_dim)
File "D:\ProgramData\Anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\Graduation_Project\YOLOv3-forusing\models.py", line 196, in forward
ignore_thres=self.ignore_thres,
File "D:\Graduation_Project\YOLOv3-forusing\utils\utils.py", line 303, in build_targets
best_ious, best_n = ious.max(0)
RuntimeError: cannot perform reduction function max on tensor with no elements because the operation does not have an identity
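
For illustration, the failure in this trace comes from reducing over an IoU tensor that has no elements because the batch contains no targets. A minimal sketch of one possible guard follows. The function name `best_anchor_per_target` and the `(num_anchors, num_targets)` layout are assumptions made for this example, and this is not necessarily how the repository handles empty targets.

```python
import torch


def best_anchor_per_target(ious: torch.Tensor):
    """Hypothetical helper: pick the best-matching anchor for every target.

    `ious` is assumed to have shape (num_anchors, num_targets).
    Returns (best_ious, best_anchor_indices), each of shape (num_targets,).
    """
    if ious.numel() == 0:
        # No targets in this batch (a negative sample): skip the reduction
        # and return empty results so the caller can continue.
        best_ious = ious.new_zeros((0,))
        best_idx = torch.zeros((0,), dtype=torch.long, device=ious.device)
        return best_ious, best_idx
    # Regular case: reduce over the anchor dimension.
    return ious.max(0)


# Empty batch: three anchors, zero targets.
empty_vals, empty_idx = best_anchor_per_target(torch.empty(3, 0))

# Two targets: the best anchor is the row with the highest IoU per column.
vals, idx = best_anchor_per_target(
    torch.tensor([[0.1, 0.9], [0.5, 0.2], [0.3, 0.4]])
)
```

With the guard in place, a batch whose target tensor has size (0, 6) produces empty index tensors instead of raising inside the reduction.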

I also checked the target txt file of the failing sample. Here is its content:

3 0.07633587786259542 0.5251908396946565 0.07633587786259542 0.04122137404580153
0 0.1099236641221374 0.35 0.1099236641221374 0.06030534351145038
5 0.08015267175572519 0.45610687022900764 0.08015267175572519 0.10763358778625955
2 0.04351145038167939 0.47748091603053433 0.04351145038167939 0.08015267175572519
1 0.0648854961832061 0.6423664122137405 0.0648854961832061 0.0450381679389313
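
All five boxes in these labels have x-centers between roughly 0.04 and 0.11, so they sit near the left image edge. The sketch below, in plain Python, shows how a horizontal translation could empty the label list. It assumes that labels are `(class, x_center, y_center, width, height)` normalized to [0, 1], and that the augmentation pipeline clips boxes to the image and drops those with no visible area left. The helper `translate_and_clip` is hypothetical and only mimics that behavior.

```python
# Labels copied from the thread: (class, x_center, y_center, width, height).
LABELS = [
    (3, 0.07633587786259542, 0.5251908396946565, 0.07633587786259542, 0.04122137404580153),
    (0, 0.1099236641221374, 0.35, 0.1099236641221374, 0.06030534351145038),
    (5, 0.08015267175572519, 0.45610687022900764, 0.08015267175572519, 0.10763358778625955),
    (2, 0.04351145038167939, 0.47748091603053433, 0.04351145038167939, 0.08015267175572519),
    (1, 0.0648854961832061, 0.6423664122137405, 0.0648854961832061, 0.0450381679389313),
]


def translate_and_clip(labels, dx, dy=0.0):
    """Shift boxes by (dx, dy) image fractions, clip to the image,
    and drop boxes that have no visible area afterwards."""
    kept = []
    for cls, xc, yc, w, h in labels:
        # Convert to corner coordinates and apply the shift.
        x1, x2 = xc - w / 2 + dx, xc + w / 2 + dx
        y1, y2 = yc - h / 2 + dy, yc + h / 2 + dy
        # Clip to the unit square that represents the image.
        x1, x2 = max(x1, 0.0), min(x2, 1.0)
        y1, y2 = max(y1, 0.0), min(y2, 1.0)
        if x2 <= x1 or y2 <= y1:
            continue  # box lies fully outside the image
        kept.append((cls, (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1))
    return kept


# Worst case of translate_percent=(-0.2, 0.2): shift left by 20% of the width.
after_large_shift = translate_and_clip(LABELS, dx=-0.2)

# Worst case of translate_percent=(-0.05, 0.05): shift left by 5% of the width.
after_small_shift = translate_and_clip(LABELS, dx=-0.05)

print(len(after_large_shift), len(after_small_shift))
```

Because the translation is sampled from the configured range, only some draws empty the label list with the larger range, which would be consistent with the error appearing intermittently.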

**I also checked my image, and here is my image data:**

Flova · 2021-04-05T17:11:10Z

Thank you for your detailed information! It seems like your code is not up to date. Could you provide the commit hash of your HEAD? It seems like your code uses ignore_thres=self.ignore_thres which is not in the current codebase.

Flova · 2021-04-05T17:13:02Z

This PR on the other hand seems up to date.

imwxc · 2021-04-05T17:32:28Z

Thanks! The commit hash is 24381e5, which is from 11 days ago. The code seems out of date, so I'll update it. Thanks again!

Flova · 2021-04-05T17:53:31Z

Commit 24381e5 should be fine imo. But it doesn't line up with the stack trace. That's weird.

Flova · 2021-04-05T18:01:41Z

The stack trace shows a state previous to #646.

for issue#339

b8a231a

Flova closed this Apr 17, 2021
