🐛 Describe the bug
My code is roughly like the sketch below. (I need the subgraphs sampled by test_loader to be random, so I set shuffle=True and use the n_id attribute to rearrange the predicted logits.)
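A minimal sketch of this setup (a hypothetical reconstruction, not the exact code: `data`, `model`, and `num_classes` are placeholders, and the fan-out and batch size are assumed values):

```python
import torch
from torch_geometric.loader import NeighborLoader

# Hypothetical reconstruction of the setup described above; `data`,
# `model`, and `num_classes` are placeholders, and the fan-out and
# batch size are assumed values.
test_loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],   # assumed two-hop fan-out
    batch_size=1024,          # assumed batch size
    input_nodes=data.test_mask,
    shuffle=True,             # sample random subgraphs
    num_workers=4,            # multi-process sampling (see below)
)

model.eval()
logits = torch.empty(data.num_nodes, num_classes)
with torch.no_grad():
    for batch in test_loader:
        out = model(batch.x, batch.edge_index)
        # The first `batch_size` nodes of the sampled subgraph are the
        # seed nodes; `n_id` maps them back to global node indices, which
        # lets us rearrange the shuffled predictions into original order.
        seed = batch.n_id[:batch.batch_size]
        logits[seed] = out[:batch.batch_size]
```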
I used W&B to log the train losses and other metrics during training, and in two experiments the run got stuck (after 80 min in one and after 6 h in the other): the curves stopped updating for about 2 hours. I can only attribute this to num_workers, because after I deleted the num_workers parameter, the full 22h run finished successfully.
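Concretely, the workaround amounted to dropping the `num_workers` argument so the loader falls back to single-process sampling (again a sketch, with the same assumed settings as above):

```python
# Same loader as above, with num_workers removed: sampling now runs in
# the main process (num_workers=0, the PyTorch default), which let the
# full 22h run complete without stalling.
test_loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],
    batch_size=1024,
    input_nodes=data.test_mask,
    shuffle=True,
    # num_workers=4,  # removed: training stalled with worker processes
)
```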
Honestly, it's hard for me to trace the bug back and reproduce it, so I can only report the problem here.
Environment
- PyG version: 2.1.0.dev20220815
- PyTorch version: 1.11.0
- OS: Linux
- Python version: 3.8.13
- CUDA/cuDNN version: CUDA 10.2 / cuDNN 7.6.5
- How you installed PyTorch and PyG (conda, pip, source):
  - PyTorch: `conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch`
  - PyG:
- Any other relevant information (e.g., version of torch-scatter): torch-scatter 2.0.9, torch-sparse 0.6.14
Thanks for reporting. Do you have any intuition about what might cause this? Is there a memory leak, with memory requirements increasing over epochs? Any guidance appreciated!