Add type informations to torch.cuda #47134

guilhermeleobas · 2020-10-30T21:09:58Z

facebook-github-bot · 2020-10-30T21:10:12Z

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

dr-ci · 2020-10-30T21:46:42Z

💊 CI failures summary and remediations

As of commit 22bd4c1 (more details on the Dr. CI page):

3/3 failures possibly* introduced in this PR
- 1/3 non-CircleCI failure(s)

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

pytorch_linux_xenial_py3_6_gcc5_4_test (1/2)

Step: "Run tests"

Nov 12 03:06:56 sccache: error: couldn't connect to server

Nov 12 03:06:56 +++ eval 'extract_trap_cmd ' 
Nov 12 03:06:56 ++++ extract_trap_cmd 
Nov 12 03:06:56 ++++ printf '%s\n' '' 
Nov 12 03:06:56 +++ printf '%s\n' cleanup 
Nov 12 03:06:56 ++ trap -- ' 
Nov 12 03:06:56 cleanup' EXIT 
Nov 12 03:06:56 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test != *pytorch-win-* ]] 
Nov 12 03:06:56 ++ which sccache 
Nov 12 03:06:56 ++ sccache --stop-server 
Nov 12 03:06:56 Stopping sccache server... 
Nov 12 03:06:56 sccache: error: couldn't connect to server 
Nov 12 03:06:56 sccache: caused by: Connection refused (os error 111) 
Nov 12 03:06:56 ++ true 
Nov 12 03:06:56 ++ rm /var/lib/jenkins/sccache_error.log 
Nov 12 03:06:56 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test == *rocm* ]] 
Nov 12 03:06:56 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Nov 12 03:06:56 ++ SCCACHE_IDLE_TIMEOUT=1200 
Nov 12 03:06:56 ++ RUST_LOG=sccache::server=error 
Nov 12 03:06:56 ++ sccache --start-server 
Nov 12 03:06:56 sccache: Starting the server... 
Nov 12 03:06:56 ++ sccache --zero-stats

pytorch_xla_linux_bionic_py3_6_clang9_test (2/2)

Step: "Run tests"

Nov 12 03:34:44 sccache: error: couldn't connect to server

Nov 12 03:34:44 +++ eval 'extract_trap_cmd ' 
Nov 12 03:34:44 ++++ extract_trap_cmd 
Nov 12 03:34:44 ++++ printf '%s\n' '' 
Nov 12 03:34:44 +++ printf '%s\n' cleanup 
Nov 12 03:34:44 ++ trap -- ' 
Nov 12 03:34:44 cleanup' EXIT 
Nov 12 03:34:44 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test != *pytorch-win-* ]] 
Nov 12 03:34:44 ++ which sccache 
Nov 12 03:34:44 ++ sccache --stop-server 
Nov 12 03:34:44 Stopping sccache server... 
Nov 12 03:34:44 sccache: error: couldn't connect to server 
Nov 12 03:34:44 sccache: caused by: Connection refused (os error 111) 
Nov 12 03:34:44 ++ true 
Nov 12 03:34:44 ++ rm /var/lib/jenkins/sccache_error.log 
Nov 12 03:34:44 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test == *rocm* ]] 
Nov 12 03:34:44 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Nov 12 03:34:44 ++ SCCACHE_IDLE_TIMEOUT=1200 
Nov 12 03:34:44 ++ RUST_LOG=sccache::server=error 
Nov 12 03:34:44 ++ sccache --start-server 
Nov 12 03:34:44 sccache: Starting the server... 
Nov 12 03:34:44 ++ sccache --zero-stats

ci.pytorch.org: 1 failed

Failed: pr/pytorch-linux-bionic-rocm3.9-py3.6

This comment was automatically generated by Dr. CI (expand for details).

This comment has been revised 31 times.

rgommers

Thanks @guilhermeleobas. The one CI failure that needed a closer look is test_abs_cuda_complex128 for ROCm:

======================================================================
FAIL: test_abs_cuda_complex128 (__main__.TestForeachCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 834, in wrapper
    method(*args, **kwargs)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 272, in instantiated_test
    result = test_fn(self, *args)
  File "test_foreach.py", line 259, in test_abs
    self.assertEqual(res, expected)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1138, in assertEqual
    exact_dtype=exact_dtype, exact_device=exact_device)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1112, in assertEqual
    self.assertTrue(result, msg=msg)
AssertionError: False is not true : Tensors failed to compare as equal! Attempted to compare equality of tensors with different dtypes. Got dtypes torch.float64 and torch.complex128.

======================================================================
FAIL: test_abs_cuda_complex64 (__main__.TestForeachCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 834, in wrapper
    method(*args, **kwargs)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 272, in instantiated_test
    result = test_fn(self, *args)
  File "test_foreach.py", line 252, in test_abs
    self.assertEqual(res, expected)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1138, in assertEqual
    exact_dtype=exact_dtype, exact_device=exact_device

It looks like a flake: that job fails in many other PRs, with a different failure each time.

Maybe just push the code comment update from my one other comment, so we can check it's not the same failure twice?
LGTM other than that.
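
The dtype mismatch reported in the first traceback (torch.float64 versus torch.complex128) is consistent with the absolute value of a complex number being real-valued. Below is a minimal plain-Python analogue of that kind of failure. It assumes nothing about the actual code in test_foreach.py, and `assert_equal_exact` is a hypothetical helper, not PyTorch's `assertEqual`.

```python
def assert_equal_exact(actual, expected):
    """Hypothetical stand-in for an equality check that also compares types."""
    if type(actual) is not type(expected):
        raise AssertionError(
            "different types: {} and {}".format(
                type(actual).__name__, type(expected).__name__
            )
        )
    if actual != expected:
        raise AssertionError("{!r} != {!r}".format(actual, expected))


z = 3 + 4j
magnitude = abs(z)               # abs() of a complex number is a real float
as_complex = complex(magnitude)  # same value, but stored as a complex number

try:
    assert_equal_exact(magnitude, as_complex)
    outcome = "passed"
except AssertionError as exc:
    outcome = str(exc)

print(outcome)
```

The two values are numerically equal, so only the strict type comparison fails. This mirrors the "different dtypes" message above.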

rgommers · 2020-11-11T22:10:19Z

torch/cuda/__init__.py

-        with device(self.get_device()):
-            return super(_CudaBase, self).type(*args, **kwargs)
+        # We could use a Protocol here to tell mypy that self has `get_device` method
+        # but it is only available on Python >= 3.8 on typing or mypy_extensions on


I think you mean typing_extensions? That is already a non-optional dependency, so it could be used. That said, I don't think it's necessary to spend time on this now; just updating the comment should be enough.
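
For reference, here is a minimal sketch of the Protocol approach mentioned in the code comment. It is not the code from this PR, where the Protocol was removed again. The names `HasGetDevice`, `CudaBaseSketch`, `device_label`, and `FakeStorage` are hypothetical and used only for illustration. The import falls back to `typing_extensions` on Python versions where `typing.Protocol` is unavailable.

```python
try:
    from typing import Protocol  # available in the standard library on Python >= 3.8
except ImportError:
    from typing_extensions import Protocol  # backport for older Python versions


class HasGetDevice(Protocol):
    """Hypothetical protocol: any object exposing get_device() -> int."""

    def get_device(self) -> int:
        ...


class CudaBaseSketch:
    """Hypothetical mixin, loosely modelled on the snippet quoted above."""

    def device_label(self: HasGetDevice) -> str:
        # Annotating `self` with the protocol tells mypy that `get_device`
        # exists, even though this mixin does not define it itself.
        return "cuda:{}".format(self.get_device())


class FakeStorage(CudaBaseSketch):
    """Hypothetical concrete class that supplies get_device()."""

    def get_device(self) -> int:
        return 1


print(FakeStorage().device_label())
```

With this pattern, mypy checks at each call site that the concrete class satisfies the protocol. Runtime behaviour is unchanged.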

rgommers · 2020-11-12T10:47:18Z

Okay, ROCm failures are different ones now - these are all flakes.

ezyang · 2020-11-13T18:13:16Z

torch/cuda/__init__.py

-            return super(_CudaBase, self).type(*args, **kwargs)
+        # We could use a Protocol here to tell mypy that self has `get_device` method
+        # but it is only available in the typing module on Python >= 3.8
+        # or on typing_extensions module on Python >= 3.6


btw we're py3.6+ only now

facebook-github-bot

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-11-14T07:14:12Z

@ezyang merged this pull request in 4f9d075.

guilhermeleobas added the module: typing label Oct 30, 2020

guilhermeleobas self-assigned this Oct 30, 2020

pytorchbot added the open source label Oct 30, 2020

facebook-github-bot added the cla signed label Oct 31, 2020

guilhermeleobas marked this pull request as ready for review November 10, 2020 14:51

guilhermeleobas requested a review from rgommers November 11, 2020 16:08

ailzhang added the triaged label Nov 11, 2020

rgommers approved these changes Nov 11, 2020

guilhermeleobas added 6 commits November 12, 2020 02:31

- add type informations to torch.cuda (cf354ae)
- flake8 (be82c1d)
- add get_device protocol (5113d7d)
- remove Protocol (2eaed7f)
- remove import Protocol (260659b)
- change comment msg (22bd4c1)

rgommers requested review from malfet and ezyang November 12, 2020 10:47

ezyang reviewed Nov 13, 2020

ezyang approved these changes Nov 13, 2020

facebook-github-bot reviewed Nov 13, 2020

facebook-github-bot closed this in 4f9d075 Nov 14, 2020

facebook-github-bot added the Merged label Nov 14, 2020
