
CUDA out of memory #1

Closed

Sanchez2020 opened this issue Jun 28, 2021 · 2 comments

@Sanchez2020

Hi, thank you for releasing your code. When I run R-GSN I get the error "RuntimeError: CUDA out of memory. Tried to allocate 562.00 MiB (GPU 1; 10.76 GiB total capacity; 8.98 GiB already allocated; 470.56 MiB free; 9.19 GiB reserved in total by PyTorch)".

I tried reducing the batch size: batch_size is now 64 and test_batch_size is 4, but I still get the same error. I am using a GeForce RTX 2080. Can you tell me why this happens and how to fix it? Thanks a lot!
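One factor worth checking, sketched below under assumptions not confirmed by the repository: the error comes from the test-time inference loop, and peak memory there is usually driven by the neighbor-sampling fan-out rather than by test_batch_size alone. If the script builds its test loader with PyG's NeighborSampler and samples all neighbors (sizes=[-1, -1]), capping the fan-out bounds the number of edges per mini-batch. The variable names here (edge_index, paper_idx, test_loader) are hypothetical placeholders, not the repository's own.

```python
from torch_geometric.data import NeighborSampler  # location in PyG 1.7.0

# Hypothetical sketch: the real R-GSN script builds its own sampler, so
# edge_index and paper_idx are placeholders for illustration only.
# sizes=[-1, -1] samples *all* neighbors per layer, which can blow up memory;
# a capped fan-out (e.g. 25 and 20 neighbors) bounds the per-batch edge count.
test_loader = NeighborSampler(
    edge_index,           # homogeneous edge_index used by the sampler
    node_idx=paper_idx,   # nodes to run inference on
    sizes=[25, 20],       # per-layer fan-out instead of [-1, -1]
    batch_size=4,         # matches test_batch_size above
    shuffle=False,
    num_workers=4,
)
```

Note that capping the test-time fan-out makes inference approximate, so the reported accuracy may shift slightly.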

Environment

numpy==1.18.5
scipy==1.6.2
ogb==1.3.1
texttable==1.6.3
torch==1.7.0+cu110
torchvision==0.8.0
torch-cluster==1.5.9
torch-geometric==1.7.0
torch-scatter==2.0.7
torch-sparse==0.6.9
torch-spline-conv==1.2.1

Full error information

Using backend: pytorch
+-----------------+-------+
| Parameter       | Value |
+-----------------+-------+
| device          | 1     |
+-----------------+-------+
| num_layers      | 2     |
+-----------------+-------+
| hidden_channels | 64    |
+-----------------+-------+
| dropout         | 0.500 |
+-----------------+-------+
| lr              | 0.004 |
+-----------------+-------+
| epochs          | 3     |
+-----------------+-------+
| runs            | 10    |
+-----------------+-------+
| batch_size      | 64    |
+-----------------+-------+
| test_batch_size | 4     |
+-----------------+-------+
| opt             | adamw |
+-----------------+-------+
| early_stop      | 1     |
+-----------------+-------+
| feat_dir        | feat  |
+-----------------+-------+
| conv_name       | rgsn  |
+-----------------+-------+
| Norm4           | 1     |
+-----------------+-------+
| FDFT            | 1     |
+-----------------+-------+
| use_attack      | 1     |
+-----------------+-------+
Data(
  edge_index_dict={
    ('author', 'affiliated_with', 'institution')=[2, 1043998],
    ('author', 'writes', 'paper')=[2, 7145660],
    ('paper', 'cites', 'paper')=[2, 5416271],
    ('paper', 'has_topic', 'field_of_study')=[2, 7505078]
  },
  edge_reltype={
    ('author', 'affiliated_with', 'institution')=[1043998, 1],
    ('author', 'writes', 'paper')=[7145660, 1],
    ('paper', 'cites', 'paper')=[5416271, 1],
    ('paper', 'has_topic', 'field_of_study')=[7505078, 1]
  },
  node_year={
    paper=[736389, 1]
  },
  num_nodes_dict={
    author=1134649,
    field_of_study=59965,
    institution=8740,
    paper=736389
  },
  x_dict={
    paper=[736389, 128]
  },
  y_dict={
    paper=[736389, 1]
  }
)
preprocess finished
/opt/conda/lib/python3.8/site-packages/torch/nn/modules/container.py:550: UserWarning: Setting attributes on ParameterDict is not supported.
  warnings.warn("Setting attributes on ParameterDict is not supported.")
Model #Params: 154373028
Attack Epoch 01: 100%|███████████████| 629571/629571 [1:01:40<00:00, 170.13it/s]
* infer valid_test exact :  86%|█████▏| 629655/736389 [01:41<2:44:33, 10.81it/s]Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/data-input/houl/R-GSN/rgsn.py", line 284, in infer
    out = model(n_id, x_dict, adjs, edge_type, node_type, local_node_idx)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data-input/houl/R-GSN/models.py", line 266, in forward
    x = conv((x, x_target), edge_index, edge_type[e_id], node_type, src_node_type)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data-input/houl/R-GSN/models.py", line 124, in forward
    msg_from_i = F.normalize(self.propagate(ei, x=x, edge_type=i, src_node_type = src_node_type, a=a))
  File "/opt/conda/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 237, in propagate
    out = self.message(**msg_kwargs)
  File "/data-input/houl/R-GSN/models.py", line 163, in message
    res = a.unsqueeze(-1) * self.rel_lins[edge_type](x_j)  ######## Message Transform
RuntimeError: CUDA out of memory. Tried to allocate 310.00 MiB (GPU 1; 10.76 GiB total capacity; 9.36 GiB already allocated; 54.56 MiB free; 9.59 GiB reserved in total by PyTorch)
python-BaseException
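The allocation that fails is in the per-relation message transform (`self.rel_lins[edge_type](x_j)` in models.py), which materializes a float32 tensor of shape [num_edges_in_batch, hidden] for each relation. A minimal sketch of one possible workaround, assuming torch 1.7.0 as listed above and not part of the released code: running the inference forward pass under torch.cuda.amp.autocast() so those intermediates are held in fp16, roughly halving their footprint (the effect on accuracy would need to be verified).

```python
import torch

# Sketch only: model, n_id, x_dict, adjs, edge_type, node_type and
# local_node_idx stand in for the variables used by infer() in rgsn.py.
# autocast runs linear layers in fp16, shrinking the per-edge intermediates
# created inside message().
with torch.no_grad(), torch.cuda.amp.autocast():
    out = model(n_id, x_dict, adjs, edge_type, node_type, local_node_idx)
out = out.float()  # back to fp32 before computing evaluation metrics
```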
@xjtuwxliang
Owner

@Sanchez2020 Hello, the GPU I used in the experiments was a GTX 1080 Ti (11 GB), but I don't currently have a GPU on hand. If adjusting test_batch_size does not solve the problem, you may have to try a GPU with slightly more memory. I can't think of a better solution right now, sorry.
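If a larger GPU is not available, another option (a generic PyTorch sketch, not something the repository provides) is to run only the evaluation pass on the CPU; it is slow for a graph of this size but is not bound by the 11 GB card.

```python
import torch

# Hypothetical sketch: move the model to the CPU just for inference; the
# per-batch tensors built inside infer() (x_dict slices, adjs, etc.) would
# likewise need to stay on the CPU instead of being sent to cuda:1.
cpu = torch.device('cpu')
model = model.to(cpu)
with torch.no_grad():
    out = model(n_id, x_dict, adjs, edge_type, node_type, local_node_idx)
```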

@Sanchez2020
Author

@xjtuwxliang, thank you for your reply.
The GPU I used has 10.76 GiB of total capacity, and no other programs were occupying it, so I am confused.
I'm looking for other ways; thank you again.
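For reference, a quick way to see how much of the 10.76 GiB this process itself is holding (plain PyTorch, independent of the repository); memory used by other processes or the display only shows up in nvidia-smi:

```python
import torch

# Report this process's CUDA memory usage on GPU 1.
dev = torch.device('cuda:1')
print(f"allocated: {torch.cuda.memory_allocated(dev) / 1024**3:.2f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved(dev) / 1024**3:.2f} GiB")
```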
