Add type informations to torch.cuda #47134

Closed
wants to merge 6 commits into from

guilhermeleobas
Collaborator

@guilhermeleobas guilhermeleobas commented Oct 30, 2020

Fixes #47133

@guilhermeleobas guilhermeleobas added the module: typing Related to mypy type annotations label Oct 30, 2020
@guilhermeleobas guilhermeleobas self-assigned this Oct 30, 2020
@facebook-github-bot
Contributor

Hi @guilhermeleobas!

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@dr-ci

dr-ci bot commented Oct 30, 2020

💊 CI failures summary and remediations

As of commit 22bd4c1 (more details on the Dr. CI page):


  • 3/3 failures possibly* introduced in this PR
    • 1/3 non-CircleCI failure(s)

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (1/2)

Step: "Run tests"

Nov 12 03:06:56 sccache: error: couldn't connect to server
Nov 12 03:06:56 +++ eval 'extract_trap_cmd ' 
Nov 12 03:06:56 ++++ extract_trap_cmd 
Nov 12 03:06:56 ++++ printf '%s\n' '' 
Nov 12 03:06:56 +++ printf '%s\n' cleanup 
Nov 12 03:06:56 ++ trap -- ' 
Nov 12 03:06:56 cleanup' EXIT 
Nov 12 03:06:56 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test != *pytorch-win-* ]] 
Nov 12 03:06:56 ++ which sccache 
Nov 12 03:06:56 ++ sccache --stop-server 
Nov 12 03:06:56 Stopping sccache server... 
Nov 12 03:06:56 sccache: error: couldn't connect to server 
Nov 12 03:06:56 sccache: caused by: Connection refused (os error 111) 
Nov 12 03:06:56 ++ true 
Nov 12 03:06:56 ++ rm /var/lib/jenkins/sccache_error.log 
Nov 12 03:06:56 ++ [[ pytorch-linux-xenial-py3.6-gcc5.4-test == *rocm* ]] 
Nov 12 03:06:56 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Nov 12 03:06:56 ++ SCCACHE_IDLE_TIMEOUT=1200 
Nov 12 03:06:56 ++ RUST_LOG=sccache::server=error 
Nov 12 03:06:56 ++ sccache --start-server 
Nov 12 03:06:56 sccache: Starting the server... 
Nov 12 03:06:56 ++ sccache --zero-stats 

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (2/2)

Step: "Run tests"

Nov 12 03:34:44 sccache: error: couldn't connect to server
Nov 12 03:34:44 +++ eval 'extract_trap_cmd ' 
Nov 12 03:34:44 ++++ extract_trap_cmd 
Nov 12 03:34:44 ++++ printf '%s\n' '' 
Nov 12 03:34:44 +++ printf '%s\n' cleanup 
Nov 12 03:34:44 ++ trap -- ' 
Nov 12 03:34:44 cleanup' EXIT 
Nov 12 03:34:44 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test != *pytorch-win-* ]] 
Nov 12 03:34:44 ++ which sccache 
Nov 12 03:34:44 ++ sccache --stop-server 
Nov 12 03:34:44 Stopping sccache server... 
Nov 12 03:34:44 sccache: error: couldn't connect to server 
Nov 12 03:34:44 sccache: caused by: Connection refused (os error 111) 
Nov 12 03:34:44 ++ true 
Nov 12 03:34:44 ++ rm /var/lib/jenkins/sccache_error.log 
Nov 12 03:34:44 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test == *rocm* ]] 
Nov 12 03:34:44 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Nov 12 03:34:44 ++ SCCACHE_IDLE_TIMEOUT=1200 
Nov 12 03:34:44 ++ RUST_LOG=sccache::server=error 
Nov 12 03:34:44 ++ sccache --start-server 
Nov 12 03:34:44 sccache: Starting the server... 
Nov 12 03:34:44 ++ sccache --zero-stats 

ci.pytorch.org: 1 failed



@guilhermeleobas guilhermeleobas marked this pull request as ready for review November 10, 2020 14:51
@ailzhang ailzhang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Nov 11, 2020
Collaborator

@rgommers rgommers left a comment

Thanks @guilhermeleobas. The one CI failure that needed a closer look is test_abs_cuda_complex128 for ROCm:

======================================================================
FAIL: test_abs_cuda_complex128 (__main__.TestForeachCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 834, in wrapper
    method(*args, **kwargs)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 272, in instantiated_test
    result = test_fn(self, *args)
  File "test_foreach.py", line 259, in test_abs
    self.assertEqual(res, expected)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1138, in assertEqual
    exact_dtype=exact_dtype, exact_device=exact_device)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1112, in assertEqual
    self.assertTrue(result, msg=msg)
AssertionError: False is not true : Tensors failed to compare as equal! Attempted to compare equality of tensors with different dtypes. Got dtypes torch.float64 and torch.complex128.

======================================================================
FAIL: test_abs_cuda_complex64 (__main__.TestForeachCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 834, in wrapper
    method(*args, **kwargs)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_device_type.py", line 272, in instantiated_test
    result = test_fn(self, *args)
  File "test_foreach.py", line 252, in test_abs
    self.assertEqual(res, expected)
  File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1138, in assertEqual
    exact_dtype=exact_dtype, exact_device=exact_device

It looks like a flake: that job fails in many other PRs, with a different failure each time.

Maybe just push the code comment update for my one other comment, so CI reruns and we can check that it's not the same failure twice?
LGTM other than that.

with device(self.get_device()):
    return super(_CudaBase, self).type(*args, **kwargs)
# We could use a Protocol here to tell mypy that self has `get_device` method
# but it is only available on Python >= 3.8 on typing or mypy_extensions on
Collaborator

I think you mean typing_extensions? That is already a non-optional dependency, so it could be used. That said, I don't think it's necessary to spend time on this now; just updating the comment is fine.
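
For reference, here is a minimal sketch of the Protocol approach the code comment alludes to. It is not code from this PR: `HasGetDevice`, `DeviceLabelMixin`, `FakeStorage`, and `device_label` are hypothetical names used only for illustration, and the import assumes Python >= 3.8 (`typing_extensions` exposes the same `Protocol` name). Annotating `self` with a protocol in a mixin method is how mypy can be told that `self` has a `get_device` method.

```python
from typing import Protocol  # assumption: Python >= 3.8; typing_extensions has the same name


class HasGetDevice(Protocol):
    """Hypothetical protocol: any object exposing get_device() -> int."""

    def get_device(self) -> int:
        ...


class DeviceLabelMixin:
    """Hypothetical mixin standing in for the pattern discussed for _CudaBase."""

    def device_label(self: HasGetDevice) -> str:
        # The explicit self-type tells the type checker that get_device() exists,
        # even though this mixin does not define it.
        return f"cuda:{self.get_device()}"


class FakeStorage(DeviceLabelMixin):
    """Hypothetical concrete class that supplies get_device()."""

    def __init__(self, index: int) -> None:
        self._index = index

    def get_device(self) -> int:
        return self._index


label = FakeStorage(1).device_label()
print(label)
```

With this pattern, a class that mixes in `DeviceLabelMixin` without defining `get_device` should be flagged by the type checker at the call site instead of failing at runtime.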

@rgommers
Collaborator

Okay, the ROCm failures are different ones now, so these are all flakes.

return super(_CudaBase, self).type(*args, **kwargs)
# We could use a Protocol here to tell mypy that self has `get_device` method
# but it is only available in the typing module on Python >= 3.8
# or on typing_extensions module on Python >= 3.6
Contributor

btw we're py3.6+ only now
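
A hedged sketch of what the version note in that comment amounts to in practice: picking the import location for `Protocol` based on the running interpreter. This is not part of the PR; `SupportsGetDevice` and `PROTOCOL_SOURCE` are hypothetical names, and the `else` branch assumes `typing_extensions` is installed on Python versions older than 3.8.

```python
import sys

# Type checkers such as mypy understand sys.version_info comparisons,
# so only the matching branch is analysed.
if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    # Assumption: typing_extensions is available on older interpreters.
    from typing_extensions import Protocol


class SupportsGetDevice(Protocol):
    """Hypothetical protocol describing objects with a get_device() method."""

    def get_device(self) -> int:
        ...


# Records which module actually provided Protocol on this interpreter.
PROTOCOL_SOURCE = Protocol.__module__
print(PROTOCOL_SOURCE)
```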

Contributor

@facebook-github-bot facebook-github-bot left a comment

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ezyang merged this pull request in 4f9d075.

Labels
cla signed Merged module: typing Related to mypy type annotations open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Successfully merging this pull request may close these issues.

Enable torch.cuda typechecks during CI
6 participants