Error when training on GPU, tensor gets moved to CPU #3611
davidireland3 asked this question in Q&A (Unanswered)
I am training a network that consists of SAGEConv and TopKPooling layers. I put the data and model on a GPU, but somewhere during training a tensor ends up on the CPU. I have checked via print statements that my data and model are indeed on the GPU, so I am not sure where the error comes in.

I have trained the same network before (using the same data) with an InfoNCE loss, where I created class labels from the score attached to each graph. Now I am training with MSE to predict the score directly, so I am not sure whether that is causing the issue. The other main difference is that the GPU I am using is not the default one (I am on a machine with multiple GPUs, so I specify device = cuda:1).

Below is the attached error message. Please let me know if I need to add anything else to the question, e.g. snippets of the training script (I checked the data device using the node feature matrix; the batch score is the target matrix, i.e. y, inside the training loop).
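A minimal sketch of the likely failure mode and the fix, assuming the usual culprit: a tensor (often the MSE target y) that never left the CPU. Here `torch.nn.Linear` stands in for the actual GNN, and the shapes are placeholders, not taken from the original script:

```python
import torch
import torch.nn.functional as F

# Fall back to CPU when cuda:1 is not available, so the sketch runs anywhere.
device = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

model = torch.nn.Linear(16, 1).to(device)  # stand-in for the SAGEConv/TopKPooling model
x = torch.randn(32, 16, device=device)     # node features, created on the target device
y = torch.randn(32, 1)                     # target accidentally created on the CPU

pred = model(x)
y = y.to(pred.device)                      # move the target explicitly before the loss
loss = F.mse_loss(pred, y)

# Quick device audit before calling loss.backward(): every tensor that
# touches the loss must report the same device.
for name, t in [("x", x), ("pred", pred), ("y", y)]:
    print(name, t.device)
```

Printing the device of each tensor right before the loss computation, as in the audit loop above, usually pinpoints which one stayed on the CPU.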