[ONNX] Fix an assertion failure involving Slice (#71965) #72989

BowenBao · 2022-02-17T02:01:16Z

Stack from ghstack:

Before this change, exporting a model to ONNX involving Slice crashes at axes[i] in line 153 if C++ assertions are enabled:

/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.

The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index.

The issue can be reproduced by exporting Mask R-CNN or similar ones. For example,

import io
import torch
import torchvision as tv

model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False)
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
with io.BytesIO() as f:
    torch.onnx.export(model, x, f, opset_version=11)

(extracted from onnxoptimizer tests)

Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from the official repo and AUR, respectively.

Before this change, exporting a model to ONNX involving Slice crashes at `axes[i]` in line 153 if C++ assertions are enabled: ``` /usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed. ``` The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index. The issue can be reproduced by exporting Mask R-CNN or similar ones. For example, ```Python import io import torch import torchvision as tv model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False) x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] with io.BytesIO() as f: torch.onnx.export(model, x, f, opset_version=11) ``` (extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py)) Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively. [ghstack-poisoned]

pytorch-bot · 2022-02-17T02:01:20Z

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/5f27273b485af6fef46dc62e21b2365f5cc48da9/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows	Labels (bold enabled)	Status
Triggered Workflows
linux-binary-conda	`ciflow/binaries`, `ciflow/binaries_conda`, `ciflow/default`	✅ triggered
linux-binary-libtorch-cxx11-abi	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
linux-binary-libtorch-pre-cxx11	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
linux-binary-manywheel	`ciflow/binaries`, `ciflow/binaries_wheel`, `ciflow/default`	✅ triggered
linux-bionic-py3.7-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/noarch`, `ciflow/trunk`	✅ triggered
linux-bionic-rocm4.5-py3.7	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/rocm`, `ciflow/trunk`	✅ triggered
linux-docs	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/docs`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-vulkan-bionic-py3.7-clang9	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`, `ciflow/vulkan`	✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test	`ciflow/all`, `ciflow/bazel`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3-clang5-mobile-build	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`, `ciflow/trunk`	✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static	`ciflow/all`, `ciflow/default`, `ciflow/linux`, `ciflow/mobile`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-clang7-asan	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/sanitizers`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-clang7-onnx	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/onnx`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc7	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
linux-xenial-py3.7-gcc7-no-ops	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
macos-arm64-binary-conda	`ciflow/binaries`, `ciflow/binaries_conda`, `ciflow/default`	✅ triggered
macos-arm64-binary-wheel	`ciflow/binaries`, `ciflow/binaries_wheel`, `ciflow/default`	✅ triggered
macos-binary-conda	`ciflow/binaries`, `ciflow/binaries_conda`, `ciflow/default`	✅ triggered
macos-binary-libtorch-cxx11-abi	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
macos-binary-libtorch-pre-cxx11	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
macos-binary-wheel	`ciflow/binaries`, `ciflow/binaries_wheel`, `ciflow/default`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/default`, `ciflow/linux`, `ciflow/trunk`	✅ triggered
win-vs2019-cpu-py3	`ciflow/all`, `ciflow/cpu`, `ciflow/default`, `ciflow/trunk`, `ciflow/win`	✅ triggered
win-vs2019-cuda11.3-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/default`, `ciflow/trunk`, `ciflow/win`	✅ triggered
windows-binary-libtorch-cxx11-abi	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
windows-binary-libtorch-pre-cxx11	`ciflow/binaries`, `ciflow/binaries_libtorch`, `ciflow/default`	✅ triggered
windows-binary-wheel	`ciflow/binaries`, `ciflow/binaries_wheel`, `ciflow/default`	✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
docker-builds	`ciflow/all`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-custom-ops	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-arm64-metal	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64-coreml	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
ios-12-5-1-x86-64-full-jit	`ciflow/all`, `ciflow/ios`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/slow`, `ciflow/trunk`	🚫 skipped
linux-docs-push	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
macos-10-15-py3-arm64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
macos-11-py3-x86-64	`ciflow/all`, `ciflow/macos`, `ciflow/trunk`	🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/libtorch`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`, `ciflow/slow`, `ciflow/slow-gradcheck`	🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug	`ciflow/all`, `ciflow/cuda`, `ciflow/linux`, `ciflow/scheduled`	🚫 skipped
periodic-win-vs2019-cuda11.1-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/scheduled`, `ciflow/win`	🚫 skipped
periodic-win-vs2019-cuda11.5-py3	`ciflow/all`, `ciflow/cuda`, `ciflow/scheduled`, `ciflow/win`	🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build	`ciflow/all`, `ciflow/android`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`	🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8	`ciflow/all`, `ciflow/cpu`, `ciflow/linux`, `ciflow/trunk`, `ciflow/xla`	🚫 skipped

facebook-github-bot · 2022-02-17T02:01:34Z

🔗 Helpful links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/72989
📄 Preview docs built from this PR
📄 Preview C++ docs built from this PR
🔧 Opt-in to CIFlow to control what jobs run on your PRs

💊 CI failures summary and remediations

As of commit 5900947 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

Before this change, exporting a model to ONNX involving Slice crashes at `axes[i]` in line 153 if C++ assertions are enabled: ``` /usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed. ``` The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index. The issue can be reproduced by exporting Mask R-CNN or similar ones. For example, ```Python import io import torch import torchvision as tv model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False) x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] with io.BytesIO() as f: torch.onnx.export(model, x, f, opset_version=11) ``` (extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py)) Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively. [ghstack-poisoned]

garymm

💮

BowenBao · 2022-02-18T18:40:28Z

@pytorchbot merge this

github-actions · 2022-02-18T18:42:25Z

Hey @BowenBao.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

Before this change, exporting a model to ONNX involving Slice crashes at `axes[i]` in line 153 if C++ assertions are enabled: ``` /usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed. ``` The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index. The issue can be reproduced by exporting Mask R-CNN or similar ones. For example, ```Python import io import torch import torchvision as tv model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False) x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] with io.BytesIO() as f: torch.onnx.export(model, x, f, opset_version=11) ``` (extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py)) Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively. Pull Request resolved: pytorch/pytorch#72989

Summary: Before this change, exporting a model to ONNX involving Slice crashes at `axes[i]` in line 153 if C++ assertions are enabled: ``` /usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed. ``` The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index. The issue can be reproduced by exporting Mask R-CNN or similar ones. For example, ```Python import io import torch import torchvision as tv model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False) x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] with io.BytesIO() as f: torch.onnx.export(model, x, f, opset_version=11) ``` (extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py)) Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively. Pull Request resolved: #72989 Test Plan: automation Reviewed By: seemethere Differential Revision: D34347596 Pulled By: seemethere fbshipit-source-id: 3d33eee5654dfae23d12247f1e74e6d6bc9fdd9d

Before this change, exporting a model to ONNX involving Slice crashes at `axes[i]` in line 153 if C++ assertions are enabled: ``` /usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed. ``` The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index. The issue can be reproduced by exporting Mask R-CNN or similar ones. For example, ```Python import io import torch import torchvision as tv model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False) x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] with io.BytesIO() as f: torch.onnx.export(model, x, f, opset_version=11) ``` (extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py)) Tested environment: Arch Linux x86_64 with pytorch and torchvisoin installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively. Pull Request resolved: pytorch/pytorch#72989

BowenBao requested a review from shubhambhokare1 as a code owner February 17, 2022 02:01

pytorch-bot bot added the ciflow/default label Feb 17, 2022

facebook-github-bot added the cla signed label Feb 17, 2022

pytorchbot added the open source label Feb 17, 2022

facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Feb 17, 2022

garymm approved these changes Feb 18, 2022

View reviewed changes

pytorchmergebot closed this in 98f9ff9 Feb 18, 2022

facebook-github-bot deleted the gh/BowenBao/136/head branch February 22, 2022 15:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ONNX] Fix an assertion failure involving Slice (#71965) #72989

[ONNX] Fix an assertion failure involving Slice (#71965) #72989

BowenBao commented Feb 17, 2022 •

edited

pytorch-bot bot commented Feb 17, 2022

⚛️ CI Flow

facebook-github-bot commented Feb 17, 2022 •

edited

garymm left a comment

BowenBao commented Feb 18, 2022

github-actions bot commented Feb 18, 2022

[ONNX] Fix an assertion failure involving Slice (#71965) #72989

[ONNX] Fix an assertion failure involving Slice (#71965) #72989

Conversation

BowenBao commented Feb 17, 2022 • edited

pytorch-bot bot commented Feb 17, 2022

⚛️ CI Flow

facebook-github-bot commented Feb 17, 2022 • edited

🔗 Helpful links

💊 CI failures summary and remediations

garymm left a comment

Choose a reason for hiding this comment

BowenBao commented Feb 18, 2022

github-actions bot commented Feb 18, 2022

BowenBao commented Feb 17, 2022 •

edited

facebook-github-bot commented Feb 17, 2022 •

edited