
rnn module uses cuda:0 even it moved into cuda:1 by to('cuda:1') #71400

Closed
dev-jwel opened this issue Jan 18, 2022 · 2 comments
Labels
module: cuda - Related to torch.cuda, and CUDA support in general
module: memory usage - PyTorch is using more memory than it should, or it is leaking memory
triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@dev-jwel

dev-jwel commented Jan 18, 2022

🐛 Describe the bug

When I use an RNN such as LSTM or GRU on cuda:1, PyTorch internally uses cuda:0 even though I moved both the input and the module to cuda:1.

Code to reproduce:

import torch

input = torch.randn(100, 100, 100).to('cuda:1')
rnn = torch.nn.LSTM(100, 100).to('cuda:1')
out = rnn(input)

You can see that PyTorch allocates memory on both cuda:0 and cuda:1.
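One way to confirm the allocation programmatically (a sketch, assuming a machine with two visible CUDA devices; torch.cuda.memory_allocated reports the bytes currently held by PyTorch's caching allocator on a given device):

```python
import torch

# Reproduce the reported behavior: everything is explicitly placed on cuda:1.
input = torch.randn(100, 100, 100).to('cuda:1')
rnn = torch.nn.LSTM(100, 100).to('cuda:1')
out, _ = rnn(input)

# On affected PyTorch versions, cuda:0 also shows a nonzero allocation here.
for d in range(torch.cuda.device_count()):
    print(f"cuda:{d}: {torch.cuda.memory_allocated(d)} bytes allocated")
```

Note that nvidia-smi additionally counts the CUDA context itself, so memory_allocated is the cleaner signal for memory PyTorch actually requested.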

Versions

I use 2 RTX 3090 GPUs.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:3B:00.0 Off |                  N/A |
|  0%   30C    P8    16W / 350W |      2MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3090    Off  | 00000000:B1:00.0 Off |                  N/A |
| 39%   29C    P8    21W / 350W |      2MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

The PyTorch version is 1.10.1+cu113.

Package           Version     
----------------- ------------
pip               20.0.2      
pkg-resources     0.0.0       
setuptools        44.0.0      
torch             1.10.1+cu113
typing-extensions 4.0.1

cc @ngimel

@vadimkantorov
Contributor

(Just in case: regardless of the RNN problem, PyTorch allocates memory on every visible GPU it touches, because initializing a CUDA context takes a few hundred megabytes per device.)

@samdow samdow added the module: cuda, module: memory usage, and triaged labels Jan 18, 2022
@ngimel
Collaborator

ngimel commented Jan 18, 2022

Fixed on master and in the nightlies. As a workaround, also set the current device to 1: torch.cuda.set_device(1). Duplicate of #71002 and #70404.
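A minimal sketch of that workaround applied to the repro above (hardware-dependent; assumes two CUDA devices and an affected PyTorch release such as 1.10.x):

```python
import torch

# Workaround for affected releases: make cuda:1 the current device as well,
# so internal scratch allocations land on the same GPU as the module and input.
torch.cuda.set_device(1)

input = torch.randn(100, 100, 100).to('cuda:1')
rnn = torch.nn.LSTM(100, 100).to('cuda:1')
out, (h_n, c_n) = rnn(input)
```

With the current device set, no memory should appear on cuda:0 beyond the CUDA context noted above.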

@ngimel ngimel closed this as completed Jan 18, 2022