indices should be either on cpu or on the same device as the indexed tensor (cpu) #1224

Open
hamdimina opened this issue Dec 8, 2022 · 10 comments · May be fixed by #1229

Comments

@hamdimina

hamdimina commented Dec 8, 2022

I'm training YOLOv7 on a custom dataset in Colab:

!python train.py --batch 12 --cfg cfg/training/yolov7_custom.yaml --epochs 50 --data data/custom_data.yaml --weights 'yolov7.pt' --device 0

and I get the error below:

Traceback (most recent call last):
  File "train.py", line 616, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 363, in train
    loss, loss_items = compute_loss_ota(pred, targets.to(device), imgs)  # loss scaled by batch_size
  File "/content/gdrive/MyDrive/yolov7/yolov7/utils/loss.py", line 585, in __call__
    bs, as_, gjs, gis, targets, anchors = self.build_targets(p, targets, imgs)
  File "/content/gdrive/MyDrive/yolov7/yolov7/utils/loss.py", line 759, in build_targets
    from_which_layer = from_which_layer[fg_mask_inboxes]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

Can anyone please help me?

@jro17002

jro17002 commented Dec 8, 2022

This worked for me.
Change line 685 in utils/loss.py to:
from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))

and add a new line after line 756:
fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))

@p0wned17

p0wned17 commented Dec 9, 2022

This worked for me and for most other people.

Change line 685 in utils/loss.py to:
from_which_layer.append(torch.ones(size=(len(b), ), device=targets.device) * i)
and that's it.
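For context, here is a minimal sketch of why this one-line change can help. By default, torch.ones called without a device argument allocates on the CPU, while the mask that later indexes the tensor sits on the GPU when training with --device 0. The variable names mirror loss.py, but the tensors are toy stand-ins chosen for illustration, not real YOLOv7 data.

```python
import torch

# Use the GPU when one is available. On a CPU-only machine every tensor ends up
# on the CPU, so the mismatch from the traceback cannot occur there.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-ins (assumptions for illustration only).
targets = torch.zeros(4, 6, device=device)
b = targets[:, 0]
i = 2  # index of the detection layer
fg_mask_inboxes = torch.tensor([True, False, True, False], device=device)

# Original pattern: no device argument, so the tensor is created on the CPU.
on_cpu = torch.ones(size=(len(b),)) * i

# Suggested pattern: create the tensor on the same device as `targets`.
on_target_device = torch.ones(size=(len(b),), device=targets.device) * i

# Indexing succeeds because the mask and the indexed tensor share a device.
selected = on_target_device[fg_mask_inboxes]
print(on_cpu.device, on_target_device.device, selected.tolist())
```

On a GPU runtime, indexing `on_cpu` with `fg_mask_inboxes` is the combination that raises the RuntimeError from the original post.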

@dsbyprateekg

I also hit the same error in Colab when training with the W6 weight file:
[image]

But the changes mentioned above did not solve the error.

mhwahdan added a commit to RobEn-AAST/yolov7 that referenced this issue Dec 9, 2022
When I used the command

python train.py --workers 8 --device 0 --batch-size 16 --data data.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights yolov7x.pt --name yolov7 --hyp data/hyp.scratch.p5.yaml

I got this error:

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

I modified the loss.py file to automatically pick up the default selected GPU using torch.device('cuda').

Fixes WongKinYiu#1225 WongKinYiu#1224 WongKinYiu#1101 WongKinYiu#1045
@hamdimina
Author

> This worked for me and for most other people.
>
> Changing line 685 in utils/loss.py to: from_which_layer.append(torch.ones(size=(len(b), ), device=targets.device) * i) and that's it.

Can you please explain how to run the command and change the line? I'm new to this field and don't know much yet. Thanks.

@p0wned17

p0wned17 commented Dec 9, 2022

>> This worked for me and for most other people.
>> Changing line 685 in utils/loss.py to: from_which_layer.append(torch.ones(size=(len(b), ), device=targets.device) * i) and that's it.
>
> Can you please explain how to run the command and change the line? I'm new to this field and don't know much yet. Thanks.

Are you working in Google Colab?

@hamdimina
Author

>>> This worked for me and for most other people.
>>> Changing line 685 in utils/loss.py to: from_which_layer.append(torch.ones(size=(len(b), ), device=targets.device) * i) and that's it.
>>
>> Can you please explain how to run the command and change the line? I'm new to this field and don't know much yet. Thanks.
>
> Are you working in Google Colab?

Yes.

@p0wned17

p0wned17 commented Dec 9, 2022

>>>> This worked for me and for most other people.
>>>> Changing line 685 in utils/loss.py to: from_which_layer.append(torch.ones(size=(len(b), ), device=targets.device) * i) and that's it.
>>>
>>> Can you please explain how to run the command and change the line? I'm new to this field and don't know much yet. Thanks.
>>
>> Are you working in Google Colab?
>
> Yes.

Do you have Telegram? I can help you there.

@kuotunyu

kuotunyu commented Dec 9, 2022

1210

In Colab:

!sed -i '759s/from_which_layer\[fg_mask_inboxes\]/from_which_layer.to(fg_mask_inboxes.device)[fg_mask_inboxes]/' /content/gdrive/MyDrive/yolov7/yolov7/utils/loss.py
!sed -n -e 759p /content/gdrive/MyDrive/yolov7/yolov7/utils/loss.py
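For readers less comfortable with sed, the same single-line edit can be sketched in plain Python. The helper name `patch_line` and the demo file are hypothetical and exist only for illustration. The function replaces literal text on one numbered line and reports whether it changed anything, so a shifted line number does not silently edit the wrong place.

```python
import tempfile
from pathlib import Path


def patch_line(path, line_no, old, new):
    """Replace the literal text `old` with `new` on 1-based line `line_no`.

    Returns True if the line was changed, False if `old` was not found there.
    """
    lines = Path(path).read_text().splitlines(keepends=True)
    if not 1 <= line_no <= len(lines) or old not in lines[line_no - 1]:
        return False
    lines[line_no - 1] = lines[line_no - 1].replace(old, new, 1)
    Path(path).write_text("".join(lines))
    return True


OLD = "from_which_layer[fg_mask_inboxes]"
NEW = "from_which_layer.to(fg_mask_inboxes.device)[fg_mask_inboxes]"

# Demo on a throwaway file (a stand-in for utils/loss.py, not the real file).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "loss_demo.py"
    demo.write_text("x = 1\nfrom_which_layer = from_which_layer[fg_mask_inboxes]\n")
    changed = patch_line(demo, 2, OLD, NEW)
    patched_line = demo.read_text().splitlines()[1]

print(changed, patched_line)
```

In Colab, the call would take the loss.py path and line 759 from the sed command above. If the function returns False, the line number likely differs in that checkout of the repository.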

@LeAyky

LeAyky commented Jul 14, 2023

I still get the error mentioned above.

Is there any news on a fix?

@AdeelH

AdeelH commented Jul 31, 2023

A lot of the code in this file is repeated, so it is important to fix all occurrences.

What worked for me is replacing all occurrences of the line

from_which_layer = torch.cat(from_which_layer, dim=0)

with

from_which_layer = torch.cat(from_which_layer, dim=0).to(targets.device)
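Since the line appears several times, the replacement could also be scripted. The sketch below is one possible approach, not part of the repository, and the helper name `patch_all` is hypothetical. It works on the file contents as a string and is safe to run twice, because lines that were already patched are normalized before the replacement is applied.

```python
OLD = "from_which_layer = torch.cat(from_which_layer, dim=0)"
NEW = OLD + ".to(targets.device)"


def patch_all(source):
    """Return (patched_source, number_of_newly_patched_occurrences)."""
    already_patched = source.count(NEW)
    # OLD is a prefix of NEW, so this count includes already patched lines.
    total = source.count(OLD)
    # Undo existing patches first so the suffix is never appended twice.
    patched = source.replace(NEW, OLD).replace(OLD, NEW)
    return patched, total - already_patched


# Toy input: one unpatched and one already patched occurrence.
sample = "a = 1\n    " + OLD + "\nb = 2\n    " + NEW + "\n"
patched, newly_patched = patch_all(sample)
print(newly_patched)
```

To apply it to the real file, read utils/loss.py with pathlib.Path.read_text, pass the text through `patch_all`, and write the result back with write_text.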
