
Yolov7-w6.pt custom training runtime error: indices should be either on cpu or on the same device #1228

Closed
dsbyprateekg opened this issue Dec 9, 2022 · 9 comments


@dsbyprateekg

Hi,

Custom training with the W6 weight file is giving me the following error in Colab:

[screenshot of the error]

@d246810g2000

This answer can solve your problem:

#1101 (comment)

@dsbyprateekg
Author

dsbyprateekg commented Dec 9, 2022

@d246810g2000 No, I need to use the GPU, so that does not solve my issue.
Can you please tell me the exact line in loss.py to change for the W6 weight file?

@d246810g2000

d246810g2000 commented Dec 9, 2022

You need two changes in loss.py:

1. Replace
   from_which_layer.append(torch.ones(size=(len(b),)) * i)
   with
   from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))

2. After line 756, which reads
   fg_mask_inboxes = matching_matrix.sum(0) > 0.0
   add a line to put fg_mask_inboxes on your CUDA device:
   fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))
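
For anyone applying these edits, a minimal self-contained sketch of the pattern behind them is below. The variable names mirror utils/loss.py, but the data is made up and a CUDA device is assumed to be available; it is an illustration of the pattern, not the actual loss.py code.

import torch

device = torch.device('cuda')

from_which_layer = []
for i in range(3):                                  # one entry per detection layer
    b = torch.arange(4, device=device)              # stand-in for the image indices
    # Edit 1: create the layer-index tensor on the GPU rather than the CPU
    from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))
from_which_layer = torch.cat(from_which_layer, dim=0)

matching_matrix = torch.zeros((3, 12))              # stand-in for the OTA matching matrix
fg_mask_inboxes = matching_matrix.sum(0) > 0.0      # boolean mask, currently on the CPU
# Edit 2: move the mask onto the same device as the tensors it will index
fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))

from_which_layer = from_which_layer[fg_mask_inboxes]   # no device-mismatch error

Both edits do the same thing: they make sure the boolean mask and every tensor it indexes end up on the same device.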

@dsbyprateekg
Author

(quoting the two suggested changes above)

I am still getting the same error:

[screenshot of the same error]

Please find attached my loss.py file with the changes.
loss.txt

Can you please check and let me know if I have missed something?

@dsbyprateekg
Author

Just a quick update: I also changed line 1336 to from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda')), and after that my issue is resolved.

@rimaexo

rimaexo commented Dec 15, 2022

Could you send me the updated file? I am still getting the same error.

@ayansaha280

I am still getting the same error:

Epoch gpu_mem box obj cls total labels img_size
  0% 0/5 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "train_aux.py", line 612, in <module>
    train(hyp, opt, device, tb_writer)
  File "train_aux.py", line 362, in train
    loss, loss_items = compute_loss_ota(pred, targets.to(device), imgs)  # loss scaled by batch_size
  File "/content/gdrive/MyDrive/Capstone Project22-23 Group-5/Note Book /1st try/yolov7/utils/loss.py", line 1205, in __call__
    bs_aux, as_aux_, gjs_aux, gis_aux, targets_aux, anchors_aux = self.build_targets2(p[:self.nl], targets, imgs)
  File "/content/gdrive/MyDrive/Capstone Project22-23 Group-5/Note Book /1st try/yolov7/utils/loss.py", line 1557, in build_targets2
    from_which_layer = from_which_layer[fg_mask_inboxes]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
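
For context, the failing line in this traceback indexes a tensor that is still on the CPU (from_which_layer, built inside build_targets2) with a boolean mask that is on the GPU, which is what the additional line-1336 edit mentioned above appears to address. The mismatch can be reproduced in isolation, assuming a CUDA device is available:

import torch

t = torch.ones(4)                                                # CPU tensor, like the unpatched from_which_layer
mask = torch.tensor([True, False, True, False], device='cuda')   # GPU boolean mask
t[mask]  # RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)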

@dsbyprateekg
Author

@rimaexo and @ayansaha280, please use the attached loss file:
loss_updated_w6.txt

@JFMeyer2k

JFMeyer2k commented May 23, 2023

For reference, I pulled the most recent version of YOLOv7 from main on 2023-05-23.
When I train with yolov7-w6.pt or yolov7-e6e.pt using train_aux.py, I get the same error reported here.

I followed the suggestions above:

  1. Within loss.py, replace from_which_layer.append(torch.ones(size=(len(b),)) * i) with from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda')).
  2. Add the line fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda')) after fg_mask_inboxes = matching_matrix.sum(0) > 0.0 (in the current version of loss.py it is line 756, and the code already reads fg_mask_inboxes = (matching_matrix.sum(0) > 0.0).to(device)).
  3. Change line 1336 to from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda')). In the current code it is line 1330, which reads from_which_layer.append(torch.ones(size=(len(b),)) * i).

However, the same error still occurred. Finally, I replaced the loss.py file with the one (loss_updated_w6.txt) shared by dsbyprateekg, and it worked. I also tested e6e and it works too.
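
A device-agnostic variant of the same fixes avoids hard-coding 'cuda', so CPU-only training keeps working as well. The sketch below follows the style of the current upstream line (matching_matrix.sum(0) > 0.0).to(device); it is an illustration of the pattern, not the contents of loss_updated_w6.txt:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

from_which_layer = []
for i in range(3):                                  # per detection layer
    b = torch.arange(4, device=device)              # stand-in for the image indices
    # build the layer-index tensor directly on the chosen device
    from_which_layer.append(torch.ones(len(b), device=device) * i)
from_which_layer = torch.cat(from_which_layer, dim=0)

matching_matrix = torch.zeros((3, 12))
fg_mask_inboxes = (matching_matrix.sum(0) > 0.0).to(device)

# everything involved in the indexing now lives on the same device
print(from_which_layer[fg_mask_inboxes].device)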
