Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make meta tensor data access error message for expressive in assert_close #68802

Closed
wants to merge 13 commits into from

Conversation

pmeier
Copy link
Collaborator

@pmeier pmeier commented Nov 23, 2021

Stack from ghstack:

Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported

Differential Revision: D33542999

@pytorch-probot
Copy link

pytorch-probot bot commented Nov 23, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/ddd15f277305785ac7c649f869c005dc85511bb9/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.6-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.6-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.6-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.6-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.6-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.6-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.6-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Nov 23, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit ddd15f2 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@pmeier pmeier added module: meta tensors module: testing Issues related to the torch.testing module (not tests) labels Nov 23, 2021
Until #68592 is resolved, we explicitly exclude them to have an expressive error message.


[ghstack-poisoned]
Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
@pmeier pmeier changed the title exclude meta tensors from being compared in assert_close make meta tensor data access error message for expressive in assert_close Dec 16, 2021
…in assert_close"


Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
…in assert_close"


Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
…in assert_close"


Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
…in assert_close"


Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
…in assert_close"


Without this patch, the error message of comparing meta tensors looks like this after #68722 was merged:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
NotImplementedError: Could not run 'aten::abs.out' with arguments from the 'Meta' backend. [...]
[...]
The above exception was the direct cause of the following exception:
[...]
RuntimeError: Comparing

TensorLikePair(
    id=(),
    actual=tensor(..., device='meta', size=()),
    expected=tensor(..., device='meta', size=()),
    rtol=1.3e-06,
    atol=1e-05,
    equal_nan=False,
    check_device=True,
    check_dtype=True,
    check_layout=True,
    check_stride=False,
    check_is_coalesced=True,
)

resulted in the unexpected exception above. If you are a user and see this message during normal operation please file an issue at https://github.com/pytorch/pytorch/issues. If you are a developer and working on the comparison functions, please except the previous error and raise an expressive `ErrorMeta` instead.
```

Thus, we follow our own advice and turn it into an expected exception until #68592 is resolved:

```python
>>> t = torch.empty((), device="meta")
>>> assert_close(t, t)
ValueError: Comparing meta tensors is currently not supported
```

[ghstack-poisoned]
try:
yield
except NotImplementedError as error:
if "meta" not in str(error).lower():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in love with this mechanism for detecting if the not impl error is based on the tensor being a meta tensor or not, but it seems OK for now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me neither. But I don't think there is a better option right now. We should remove this while resolving #68592.

@mruberry
Copy link
Collaborator

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot facebook-github-bot deleted the gh/pmeier/57/head branch January 15, 2022 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla signed module: meta tensors module: testing Issues related to the torch.testing module (not tests) open source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants