
RuntimeError: Event device type CUDA does not match blocking stream’s device type CPU #78482

Open
pan24n opened this issue May 30, 2022 · 5 comments
Labels
module: autograd — Related to torch.autograd, and the autograd engine in general
module: cuda — Related to torch.cuda, and CUDA support in general
module: tests — Issues related to tests (not the torch.testing module)
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@pan24n
pan24n commented May 30, 2022

🐛 Describe the bug

When I run the unit tests in test_autograd.py on PyTorch 1.11.0, they throw a RuntimeError: Event device type CUDA does not match blocking stream's device type CPU. I want to know what causes this error.

Traceback (most recent call last):
  File "/root/.local/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1754, in wrapper
    method(*args, **kwargs)
  File "/root/.local/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1754, in wrapper
    method(*args, **kwargs)
  File "/root/.local/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 389, in instantiated_test
    raise rte
  File "/root/.local/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
    result = test(self, **param_kwargs)
  File "/root/.local/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 939, in multi_fn
    return fn(slf, devices, *args, **kwargs)
  File "pytorch/test/test_autograd.py", line 8644, in test_backward_device
    Identity.apply(v).backward()
  File "/root/.local/lib/python3.7/site-packages/torch/_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/root/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 175, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: Event device type CUDA does not match blocking stream's device type CPU.
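For context, the failing test (`test_backward_device`) exercises a trivial custom autograd Function and calls `.backward()` on its output. A minimal sketch of that pattern is below (a hypothetical reconstruction, run on CPU here; the test itself places the tensor on the GPU device, which is where the error fires):

```python
import torch

# A minimal custom autograd Function of the kind the failing test uses.
class Identity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # On a working build, the engine runs backward on the same device
        # as the forward inputs; the reported error is raised before this
        # function is ever reached.
        return grad_output

v = torch.randn(1, requires_grad=True)  # the test uses device="cuda"
Identity.apply(v).backward()
print(v.grad)  # gradient of identity: tensor([1.])
```

On a healthy CUDA or ROCm build this runs the same way with `device="cuda"`; the error in this report suggests the engine picked a CPU stream while recording a CUDA event.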

Versions

PyTorch version: 1.11.0a0
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 4.3.22211

OS: CentOS Linux 7 (Core) (x86_64)
GCC version: (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Clang version: 14.0.0
CMake version: version 2.8.12.2

Is CUDA available: True

cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7 @ngimel @mruberry

@ejguan ejguan added module: cuda Related to torch.cuda, and CUDA support in general; module: tests Issues related to tests (not the torch.testing module); triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels May 31, 2022
@mruberry mruberry added the module: autograd Related to torch.autograd, and the autograd engine in general label May 31, 2022
@albanD
Collaborator

albanD commented May 31, 2022

Are you trying to run the rocm build with the cuda backend?
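One quick way to answer this is to check which backend the installed build reports. As a sketch (`torch.version.cuda` is `None` on ROCm builds, and `torch.version.hip` is `None` on CUDA builds):

```python
import torch

# Report which GPU backend this PyTorch build was compiled against.
print("PyTorch:", torch.__version__)
print("CUDA:", torch.version.cuda)    # None on a ROCm build
print("ROCm/HIP:", torch.version.hip) # None on a CUDA build
```

The environment report above shows "CUDA used to build PyTorch: N/A" alongside a ROCm version, which points to a ROCm build.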

@sunflower93

> Are you trying to run the rocm build with the cuda backend?

Traceback (most recent call last):
.........
  File "/home/zjlab/anaconda3/envs/shao/lib/python3.9/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/zjlab/anaconda3/envs/shao/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Event device type CUDA does not match blocking stream's device type CPU.

@sunflower93
Hello, has this problem been solved?

@pan24n
Author

pan24n commented Aug 1, 2022 via email

@sunflower93

Yes. In the beginning I didn't use the official ROCm environment, so it had some problems; I made some changes, and it has now been fixed. If you use the official ROCm packages you probably won't run into this problem. I tested ROCm 5.1 + Python 3.7 + PyTorch 1.11, and it worked.

thx
