sync with upstream #156

Merged
merged 217 commits into from
Aug 29, 2018

Conversation

rohithkrn

No description provided.

James Reed and others added 30 commits August 14, 2018 18:13
Summary:
After this, all combinations of {String frontend, Python AST Frontend}{Python 3-style type annotations, MyPy-style type comments}{Script method, Script function} should properly accept type annotations.

Possible TODOs:
- Clean up the functions marked HACK
- Clean up the Subscript tree-view to better match the Python AST versions
- Can we use this for Python functions? That's the only place annotations.get_signature() is still needed
Pull Request resolved: pytorch#10279

Differential Revision: D9319726

Pulled By: jamesr66a

fbshipit-source-id: b13f7d4f066b0283d4fc1421a1abb9305c3b28fa
…h#10520)

Summary:
setup.py is the official install script, setup_caffe2.py is not used any more
Pull Request resolved: pytorch#10520

Reviewed By: yinghai

Differential Revision: D9325548

Pulled By: bddppq

fbshipit-source-id: 3dda87f3dff061b574fd1d5c91859044f065ee33
Summary:
Pull Request resolved: pytorch#10514

Fix the bug that breaks the Windows build in fused_rowwise_random_quantization_ops.h.

Reviewed By: ezyang, jspark1105

Differential Revision: D9322291

fbshipit-source-id: a6a27e87423b6caa973414ffd7ccb12076f2e1e4
Summary:
Previously, it was easy to write `x[0].accessor<float, 2>()`. However, `x[0]` is a temporary, so the accessor would point to invalid strides/sizes and probably segfault. With this change, such unsafe code is a compile error.
Pull Request resolved: pytorch#10518

Reviewed By: goldsborough

Differential Revision: D9329288

Pulled By: ebetica

fbshipit-source-id: d08763bee9a19a898b9d1ea5ba648f27baa1992f
…ytorch#10227)

Summary:
Based on: pytorch#10199
Added:
(1) send, recv, recvanysource, and barrier for MPI process group.
(2) python binding
(3) testing

Please review: pytorch@2e64f5d
Pull Request resolved: pytorch#10227

Reviewed By: ailzhang

Differential Revision: D9327138

Pulled By: teng-li

fbshipit-source-id: 80496714550a3ca498eb474465ddbd1b8d657d49
Summary:
Breaking out of pytorch#8338

This PR is a workaround for a bug with CUDA9.2 + GCC7.

Here is the error this PR fixed:
.../pytorch/caffe2/operators/elementwise_ops.h: In constructor ‘caffe2::BinaryElementwiseWithArgsOp<InputTypes, Context, Functor, OutputTypeMap>::BinaryElementwiseWithArgsOp(const caffe2::OperatorDef&, caffe2::Workspace*)’:
.../pytorch/caffe2/operators/elementwise_ops.h:106:189: error: ‘GetSingleArgument<bool>’ is not a member of ‘caffe2::BinaryElementwiseWithArgsOp<InputTypes, Context, Functor, OutputTypeMap>’
   BinaryElementwiseWithArgsOp(const OperatorDef& operator_def, Workspace* ws)
Pull Request resolved: pytorch#10510

Reviewed By: orionr

Differential Revision: D9319742

Pulled By: mingzhe09088

fbshipit-source-id: ce59e3db14539f071f3c20301e77ca36a6fc3f81
…iants (pytorch#10496)

Summary:
- fixes pytorch#6219
- removed invariants at pytorch#4707
- assume a sparse tensor with coalesced=true when:
1. its elements are unique and
2. the indices are in sorted order
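
The two conditions above can be sketched in plain Python. This is only an illustration of the invariant, not the ATen implementation: `is_coalesced_like` is a hypothetical helper, and the indices are modeled as a list of coordinate tuples.

```python
def is_coalesced_like(indices):
    """Return True if the coordinate tuples are strictly increasing.

    Strictly increasing lexicographic order implies both conditions:
    the indices are sorted and no index appears twice.
    """
    return all(a < b for a, b in zip(indices, indices[1:]))


print(is_coalesced_like([(0, 1), (0, 2), (1, 0)]))  # sorted and unique
print(is_coalesced_like([(0, 2), (0, 1), (1, 0)]))  # not sorted
print(is_coalesced_like([(0, 1), (0, 1), (1, 0)]))  # duplicate index
```

A single strict comparison between neighbours covers both conditions, because a repeated index fails `a < b` just as an out-of-order pair does.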
Pull Request resolved: pytorch#10496

Differential Revision: D9311214

Pulled By: weiyangfb

fbshipit-source-id: 167fa5a8e9e5f9c800db02f728a1194029f7e4f3
pytorch#10257)

Summary:
Initial jenkins builds / test scripts for ppc64le.
Pull Request resolved: pytorch#10257

Differential Revision: D9331278

Pulled By: ezyang

fbshipit-source-id: 6d9a4f300a0233faf3051f8151beb31786dcd838
…orch#10379)

Summary:
Background: we run PyTorch in embedded C++ pipelines, inside C++ GUIs in https://github.com/Kitware/VIAME. Without this addition, the call failed with the error below, but only on certain Windows platforms/configurations:

OSError: [WinError6] The handle is invalid
At:
C:\Program Files\VIAME\Python36\site-packages\torch\cuda\__init__.py(162): _lazy_init
C:\Program Files\VIAME\Python36\site-packages\torch\nn\modules\module.py(249): <lambda>
C:\Program Files\VIAME\Python36\site-packages\torch\nn\modules\module.py(182): _apply
C:\Program Files\VIAME\Python36\site-packages\torch\nn\modules\module.py(176): _apply
C:\Program Files\VIAME\Python36\site-packages\torch\nn\modules\module.py(249): cuda
C:\Program Files\VIAME\lib\python3.6None\site-packages\kwiver\arrows\pytorch\pytorch_resnet_f_extractor.py(74): __init__
C:\Program Files\VIAME\lib\python3.6None\site-packages\kwiver\processes\resnet_descriptors.py(132): _configure
Pull Request resolved: pytorch#10379

Differential Revision: D9330772

Pulled By: ezyang

fbshipit-source-id: 657ae7590879004558158d3c4abef2ec11d9ed57
…boxes than specified. (pytorch#10390)

Summary:
Pull Request resolved: pytorch#10390

Fixed a bug in box_with_nms_limit where it may produce more bounding boxes than specified.
* The original code first finds the threshold for the boxes at the 'detections_per_im' position and filters out boxes below the threshold.
* When multiple boxes tie at the threshold, the op returns more boxes than 'detections_per_im'.
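
A minimal sketch of the tie problem, using plain Python lists of scores rather than the real op. `keep_by_threshold` mimics the behavior described above, and `keep_top_k` is one possible way to cap the output; both names are hypothetical and the actual fix in the op may differ.

```python
def keep_by_threshold(scores, limit):
    """Behavior as described: threshold at the limit-th best score."""
    if len(scores) <= limit:
        return list(scores)
    threshold = sorted(scores, reverse=True)[limit - 1]
    return [s for s in scores if s >= threshold]


def keep_top_k(scores, limit):
    """Keep at most `limit` scores; ties are cut off at the limit."""
    return sorted(scores, reverse=True)[:limit]


scores = [0.9, 0.8, 0.8, 0.8, 0.1]
print(len(keep_by_threshold(scores, 2)))  # every box tied at 0.8 passes
print(len(keep_top_k(scores, 2)))         # never more than the limit
```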

Reviewed By: wat3rBro

Differential Revision: D9252726

fbshipit-source-id: 63f40829bcd275cb181692bc7547c384cee01499
Summary:
We can't rely on the ATen fallback pathway here because we need to parse out the constant attributes explicitly
Pull Request resolved: pytorch#10513

Reviewed By: dzhulgakov

Differential Revision: D9322133

Pulled By: jamesr66a

fbshipit-source-id: 52af947e6c44532ef220cb4b94838ca838b5df06
Summary:
Pull Request resolved: pytorch#10395

Order switch ops (NCHW2NHWC and NHWC2NCHW) only supported 2D images.
This diff generalizes them to 1D and 3D and adds a unit test, which was previously missing.
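
The generalization amounts to moving the channel axis for any number of spatial dimensions. A rough sketch on shapes only, assuming channels sit at axis 1 in the NCHW-style layout and at the last axis in the NHWC-style layout; the helper names are hypothetical and this is not the Caffe2 kernel.

```python
def nchw_to_nhwc_perm(ndim):
    """Axis permutation moving channels (axis 1) to the last position."""
    return [0] + list(range(2, ndim)) + [1]


def nhwc_to_nchw_perm(ndim):
    """Inverse permutation: move the last axis back to position 1."""
    return [0, ndim - 1] + list(range(1, ndim - 1))


def permute_shape(shape, perm):
    """Apply an axis permutation to a shape tuple."""
    return tuple(shape[axis] for axis in perm)


for shape in [(8, 3, 10), (8, 3, 10, 12), (8, 3, 4, 10, 12)]:  # 1D, 2D, 3D
    nhwc = permute_shape(shape, nchw_to_nhwc_perm(len(shape)))
    back = permute_shape(nhwc, nhwc_to_nchw_perm(len(shape)))
    print(shape, "->", nhwc, "->", back)
```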

Reviewed By: protonu

Differential Revision: D9261177

fbshipit-source-id: 56e7ec54c9a8fb71781ac1336f3f28cf024b4bda
Summary:
I've implemented affine grid generation for volumetric (5d) inputs. The implementation is based off of the spatial implementation, extended by one dimension. I have a few questions about my implementation vs. the existing one that I will add inline.

I have some extensive test cases for the forward pass here: https://gist.github.com/elistevens/6e3bfb20d8d0652b83bd16b3e911285b However, they use `pytest.fixture` extensively, so I'm not sure the best way to incorporate them into the pytorch test suite. Suggestions? I have not tested backwards at all.

Diff probably best viewed with whitespace changes ignored.

Thanks for considering!
Pull Request resolved: pytorch#8322

Differential Revision: D9332335

Pulled By: SsnL

fbshipit-source-id: 1b3a91d078ef41a6d0a800514e49298fd817e4df
…torch#10531)

Summary:
Pull Request resolved: pytorch#10531

fixed a naming issue in pairwise_similarity

Reviewed By: huayuli00

Differential Revision: D9331716

fbshipit-source-id: d7de36f20504c08b1c7871ccdffa343221a3da0c
Summary:
Optimize max and min reduction for the ATen CPU path; the current code path from the TH module runs sequentially on CPU.
Pull Request resolved: pytorch#10343

Differential Revision: D9330799

Pulled By: ezyang

fbshipit-source-id: 5b8271e0ca3e3e73f88a9075aa541c8756001b7c
…Node (pytorch#10512)

Summary:
Pull Request resolved: pytorch#10512

SubtreeMatchCriteria now becomes a graph of MatchNode

MatchNode consists of NodeMatchCriteria, nonTerminal and count. This is a cleaner internal representation of the data structure and will bring us much closer to DAG matching.

Note that I still keep the debugString method because convertToDotGraph doesn't currently work with Subgraph.

Reviewed By: bwasti

Differential Revision: D9321695

fbshipit-source-id: 58a76f007a9a95d18cf807d419c2b595e9bc847f
Summary:
Two tests in the 'nn' test bucket may fail when the torch.half
(float16) data type is used. The assertions used in the tests
intend to allow slight floating point imprecision in the results,
but the tolerances used for the comparisons are too strict for
the half type.

Relax the tolerances so that slight float16 imprecision won't
cause test failures.
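
To illustrate why a tolerance tuned for float32 can be too strict for float16, the sketch below rounds a value to half precision with Python's `struct` module. The value `0.1` and the tolerances `1e-5` and `1e-3` are illustrative assumptions, not the ones used in the tests.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


value = 0.1
error = abs(to_half(value) - value)
print(error)          # rounding error introduced by float16
print(error <= 1e-5)  # a tight tolerance rejects the rounded value
print(error <= 1e-3)  # a relaxed tolerance accepts it
```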

The affected tests are:

- test_variable_sequence_cuda
- test_Conv2d_groups_nobias

For more information, see issue:

pytorch#7420
Pull Request resolved: pytorch#10519

Differential Revision: D9343751

Pulled By: soumith

fbshipit-source-id: 90aedf48f6e22dd4fed9c7bde7cd7c7b6885845a
Summary:
Fixes pytorch#9934
Pull Request resolved: pytorch#10416

Differential Revision: D9276252

Pulled By: ailzhang

fbshipit-source-id: ea7d9d4f9390edefcd0865a98498f6c4307c291d
Summary:
Needed by the Gloo development team. Verifying nothing breaks in CI.
Pull Request resolved: pytorch#10545

Reviewed By: Maratyszcza

Differential Revision: D9344413

Pulled By: orionr

fbshipit-source-id: 207edb71170870bacec47a635a12d7f55b6c1275
Summary:
Support broadcasting in `_kl_categorical_categorical`.

This makes it possible to do:
```
import torch.distributions as dist
import torch
p_dist = dist.Categorical(torch.ones(1,10))
q_dist = dist.Categorical(torch.ones(100,10))
dist.kl_divergence(p_dist, q_dist)
```
Pull Request resolved: pytorch#10533

Differential Revision: D9341252

Pulled By: soumith

fbshipit-source-id: 34575b30160b43b6c9e4c3070dd7ef07c00ff5d7
Summary:
Pull Request resolved: pytorch#10522

Move filler interface to operator schema to avoid extra code for
caffe2 mobile.

Reviewed By: dzhulgakov

Differential Revision: D9312940

fbshipit-source-id: 77fb2406f0c6b171a1912a207e05e36da50c6966
Summary:
Since we can't specify a version number to `choco install curl`, we should not assume that `7.57.0` is the curl version that's in the Windows AMI.
Pull Request resolved: pytorch#10476

Differential Revision: D9303129

Pulled By: yf225

fbshipit-source-id: 198544be68330860fbcf93c99bc995f4e280bda7
Summary:
Fixes pytorch#10238
Pull Request resolved: pytorch#10277

Reviewed By: SsnL

Differential Revision: D9199825

Pulled By: soumith

fbshipit-source-id: 8ee7f9a72d9546d429f311c3f6028461d3c93fe2
Summary: reduce flakiness of test

Reviewed By: Maratyszcza

Differential Revision: D9344877

fbshipit-source-id: 24d5e1b873f94d816c980f3b7db93248cf10aca5
Summary:
In the shortcut for n_sample=1, when category 0 has 0 weight,
we should not map the (uniform) sample 0 to category 0.
The conversion uniform->multinomial was apparently written to work on
a (0,1] range (like curand uses), but PyTorch uses a [0,1) range.
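
The range mismatch can be shown with a plain inverse-CDF lookup. This is a sketch of the general idea only, not the PyTorch kernel; both function names are hypothetical.

```python
from itertools import accumulate


def pick_closed_right(weights, u):
    """Lookup suited to u in (0, 1]: first bucket with u <= cdf."""
    total = sum(weights)
    for category, cdf in enumerate(accumulate(weights)):
        if u <= cdf / total:
            return category
    return len(weights) - 1


def pick_closed_left(weights, u):
    """Lookup suited to u in [0, 1): first bucket with u < cdf."""
    total = sum(weights)
    for category, cdf in enumerate(accumulate(weights)):
        if u < cdf / total:
            return category
    return len(weights) - 1


weights = [0.0, 1.0, 3.0]  # category 0 has zero weight
print(pick_closed_right(weights, 0.0))  # picks the zero-weight category
print(pick_closed_left(weights, 0.0))   # skips it
```

With a sample range of [0, 1), the value 0 can occur, so the comparison has to be strict for a zero-weight first category to be skipped.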

Fixes: pytorch#4858. Thank you, Roy Fejgin for reporting.
Pull Request resolved: pytorch#9960

Reviewed By: soumith

Differential Revision: D9341793

Pulled By: ailzhang

fbshipit-source-id: 6b1a96419a7bc58cc594f761f34c6408ff6354cf
Summary:
This is the first of two changes that are supposed to improve how we handle RNNs in the JIT. They still get traced as `PythonOp`s, but now it will be much easier to actually expose them to the JIT as e.g. `aten::lstm`, and ignore the Python interpreter entirely. This needs some symbolic adjustments that will be part of a second PR.

Even when we fix symbolics, there will still be a bit of a problem with statefulness of the cuDNN API (we need a mutable cache for the dropout state, but our IR has no way of representing that).

zdevito ezyang
Pull Request resolved: pytorch#10481

Reviewed By: ezyang

Differential Revision: D9341113

Pulled By: apaszke

fbshipit-source-id: 0ae30ead72a1b12044b7c12369d11e5ca8ec30b5
Summary:
This PR removes a couple of macros throughout TH* as part of the refactoring effort for ATen. Removing these macros should avoid confusion among developers who are trying to move things from TH* to ATen. This PR is part of the THCNumerics deprecation that I have been working on, following up on mruberry's pytorch#9318. I am separating these two commits to check that removing these macros doesn't upset the PyTorch public CI or the internal builds.

- Commit pytorch@1248de7 removes the code paths guarded by `CUDA_HALF_INSTRUCTIONS` macro. Since the macro was removed in commit pytorch@2f186df, `ifdef CUDA_HALF_INSTRUCTIONS` would return false and hence the code path that is kept after this change is for the false case of `ifdef CUDA_HALF_INSTRUCTIONS`

- Commit pytorch@520c99b removes the code paths guarded by the `CUDA_HALF_TENSOR` macro. Since PyTorch now supports only CUDA 8.0 and above, `CUDA_HALF_TENSOR` is always true because CUDA 8.0 satisfies `CUDA_HAS_FP16`. Hence, the code path that is kept after this change is the true case of `ifdef CUDA_HALF_TENSOR`.
Pull Request resolved: pytorch#10147

Differential Revision: D9345940

Pulled By: soumith

fbshipit-source-id: c9392261dd432d304f1cdaf961760cbd164a59d0
Differential Revision: D9276252

Original commit changeset: ea7d9d4f9390

fbshipit-source-id: 5977bf90d4c84b47e15bc8266cc3ce5602c4e05f
…ensorByteStringToUInt8FillOp (pytorch#10385)

Summary:
Pull Request resolved: pytorch#10385

Pull Request resolved: pytorch#10354

Pull Request resolved: pytorch#10316

Because Protobuf encodes uint8_t tensors using a less space-efficient varint uint32_t encoding, we are adding a new operator that reads back a byte string into a uint8_t tensor.
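
A small sketch of the size difference, counting only payload bytes and ignoring field tags and length prefixes. `varint_size` is a hypothetical helper that follows the standard varint rule of 7 payload bits per byte.

```python
def varint_size(value):
    """Number of bytes a non-negative integer takes as a Protobuf varint."""
    size = 1
    while value >= 0x80:  # each byte carries 7 payload bits
        value >>= 7
        size += 1
    return size


pixels = [0, 127, 128, 200, 255]
as_varints = sum(varint_size(p) for p in pixels)
as_bytes = len(bytes(pixels))
print(as_varints, as_bytes)
```

Every value of 128 or more needs two varint bytes, while a byte string always stores one byte per element.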

Reviewed By: harouwu

Differential Revision: D9004839

fbshipit-source-id: dfd27085c813fdeff13fee15eef4a2e7fef72845
…pytorch#10530)

Summary:
In my environment, it looks like setup.py hangs when running

```
FULL_CAFFE2=1 python setup.py build_deps
```

Removing this fixes things, but we might also want to look at `tests_require`, which came over from `setup_caffe2.py`.

cc pjh5
Pull Request resolved: pytorch#10530

Differential Revision: D9349597

Pulled By: orionr

fbshipit-source-id: 589145eca507dfaf16386884ee2fbe60299660b4
apaszke and others added 24 commits August 24, 2018 20:25
Summary:
This disables the symbolic override hacks and makes tracing emit the recently added ATen ops for RNNs (`aten::lstm`, `aten::gru`, ...). I managed to reuse pretty much all of the translation code for their symbolics.

zdevito
Pull Request resolved: pytorch#10638

Differential Revision: D9385830

Pulled By: apaszke

fbshipit-source-id: ff06ef7b1ae7c3b7774825e0991bc3887e1ff59b
Summary:
Pull Request resolved: pytorch#10239

Make Conv + BN fusion also work for 3D convolutions

Reviewed By: duc0

Differential Revision: D9176314

fbshipit-source-id: 6604aa569c5c3afdb4480a5810890bc617e449c4
Summary: Pull Request resolved: pytorch#10827

Reviewed By: boryiingsu

Differential Revision: D9484567

fbshipit-source-id: 275eddc9406b5f427d72c0ab9b0da481b5e59ece
Summary: Pull Request resolved: pytorch#10696

Differential Revision: D9437963

Pulled By: cpuhrsch

fbshipit-source-id: 7217682f5e4b69c73d943411d738e4892bb465f5
Summary: Update all the callers for the new interface

Reviewed By: highker

Differential Revision: D9323167

fbshipit-source-id: a39335ceb402db0719f5f2314085ba9a81380308
Summary: Pull Request resolved: pytorch#10854

Reviewed By: ezyang

Differential Revision: D9498721

Pulled By: Jorghi12

fbshipit-source-id: 4018383fea5a2a6baff7183b0c0197a4b7a09f20
…ytorch#10844)

Summary:
Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details.

Summary of changes:

- Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The argument behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make the transition easier, `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail.
- Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches arguments only at the level of the new `TensorType`.
- Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`.
- The fuser relied heavily on full shape information being available. Now, we simply try to fuse the largest possible graphs, and do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implemented using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`.
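
For readers unfamiliar with the run-time check described in the last item, here is a naive sketch of the idea: compute the broadcast shape of the inputs and compare it with the shape the fused code was generated for. This is not the optimized algorithm from `graph_fuser.cpp`; `broadcast_shapes` and `can_run_fused` are hypothetical helpers.

```python
def broadcast_shapes(a, b):
    """Return the broadcast of two shapes, or None if they are incompatible."""
    result = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            return None
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


def can_run_fused(input_shapes, expected_output_shape):
    """True if the inputs broadcast to the shape the kernel expects."""
    shape = ()
    for s in input_shapes:
        shape = broadcast_shapes(shape, s)
        if shape is None:
            return False
    return shape == expected_output_shape


print(can_run_fused([(2, 3), (3,)], (2, 3)))  # shapes match: use fused code
print(can_run_fused([(2, 3), (4,)], (2, 3)))  # mismatch: fall back
```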

zdevito ezyang mruberry ngimel csarofeen
Pull Request resolved: pytorch#10844

Differential Revision: D9498705

Pulled By: apaszke

fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: pytorch#10883

Differential Revision: D9513997

Pulled By: ezyang

fbshipit-source-id: 37db956e57d86471323d284869bb844f5a4753ac
Summary: Pull Request resolved: pytorch#10889

Differential Revision: D9512589

Pulled By: gchanan

fbshipit-source-id: 8b2b26c9f3a4da31a46f684793ab237e9ef9a323
Summary:
PackedSequence is never supposed to be created by users, but unfortunately some community repos are already doing this (e.g., [here](https://github.com/huggingface/torchMoji/blob/7c191048ce906fc0404fe156827d97cb990ebecb/torchmoji/model_def.py#L218-L229)). A change we made broke the calling pattern `PackedSequence(data=x, batch_sizes=y)`. This patch adds back support for that.
Pull Request resolved: pytorch#9864

Differential Revision: D9011739

Pulled By: SsnL

fbshipit-source-id: 0e2012655d7f4863ec54803550df30874ec35d75
Summary:
The scalar situation has gotten a lot better and now we can
remove all instances of FIXME_zerol().

cc zdevito
Pull Request resolved: pytorch#10900

Differential Revision: D9514206

Pulled By: zou3519

fbshipit-source-id: e4e522f324126c5454cd6de14b832d2d1f6cb0ce
Summary:
- Added `__repr__` for Constraints and Transforms.
- Arguments passed to the constructor are now rendered with :attr:

Closes pytorch#10884
Pull Request resolved: pytorch#10894

Differential Revision: D9514161

Pulled By: apaszke

fbshipit-source-id: 4abf60335d876449f2b6477eb9655afed9d5b80b
Summary:
I missed these in pytorch#10900

cc apaszke jamesr66a zdevito
Pull Request resolved: pytorch#10905

Differential Revision: D9516748

Pulled By: zou3519

fbshipit-source-id: a5c3e3b65a33c339d5c4e9fc160462c3d35705f3
Summary: Pull Request resolved: pytorch#10859

Reviewed By: newstzpz

Differential Revision: D9498312

fbshipit-source-id: 08b8a596f774c9102286019f286ca0b74d1f5304
…es. (pytorch#10812)

Summary:
* Fix the necessary pathways so that tuples and lists can be inputs to the script.

* Prevent linear algebra functions from being run in shape prop because they frequently will error out for nonsense data.

* Favor schema-driven Python input conversion where possible. Remaining cases where we directly create Stacks without schema are only for debugging.

* Make the error messages when calling script/trace functions more pythonic

* Simplify FlattenTuples -- now that tuples are supported we can choose to only flatten tuples when needed. This may have to be revisited pending onnx test results, but is necessary for making tuple io work.
Pull Request resolved: pytorch#10812

Differential Revision: D9477982

Pulled By: zdevito

fbshipit-source-id: ed06fc426e6ef6deb404602a26c435a7fc40ea0c
Summary: Pull Request resolved: pytorch#10909

Differential Revision: D9516837

Pulled By: gchanan

fbshipit-source-id: fad7e3284e74c599b873ebaae2dcdf5013505855
…ytorch#10877)

Summary:
Pull Request resolved: pytorch#10877

change default value of DeviceOption.numa_node_id to 0 and use has_numa_node_id() to check existence

Reviewed By: ilia-cher

Differential Revision: D9473891

fbshipit-source-id: 91ac6a152f445644691023110c93d20a3ce80d43
Summary:
Previously, when tracing slicing and select, negative indices would get normalized, fixing the index to the size of the traced tensor. This change makes the behavior the same as script, so `aten::select` with negative indices is emitted.
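
The effect of baking the traced size into the index can be sketched with plain Python lists. Both helpers are hypothetical and only model the behavior described above; they are not the tracer.

```python
def select_normalized_at_trace(traced_size, index):
    """Earlier behavior as described: bake in the traced tensor's size."""
    baked = index + traced_size if index < 0 else index
    return lambda seq: seq[baked]


def select_negative_kept(index):
    """Behavior after this change: resolve the negative index at run time."""
    return lambda seq: seq[index]


traced_old = select_normalized_at_trace(traced_size=3, index=-1)
traced_new = select_negative_kept(index=-1)

longer = [10, 20, 30, 40, 50]
print(traced_old(longer))  # element at position 2, not the last one
print(traced_new(longer))  # the last element
```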
Pull Request resolved: pytorch#10560

Differential Revision: D9493614

Pulled By: eellison

fbshipit-source-id: ce7a8bae59863723247208d86b9f2948051ccc6c
…ch#10833)

Summary:
Commits:

1. Make `torch.cuda.*` take device objects
2. Update `torch.distributed` docs to emphasize calling `torch.cuda.set_device` before `init_process_group`
Pull Request resolved: pytorch#10833

Differential Revision: D9514241

Pulled By: SsnL

fbshipit-source-id: 2497464305fb1e63d6c495291a5744aaa7e2696e
Summary:
The goal of this PR is to enable the MIOpen engine (for HIP devices) for the recurrent operator and also enable the corresponding unit test.
bddppq petrex
Pull Request resolved: pytorch#10840

Differential Revision: D9518980

Pulled By: bddppq

fbshipit-source-id: 214661e79a47c5dc6b712ef0fba986bd99db051f
Summary:
Moved kl div loss to aten.

benchmarks for 5000 iterations on input size (1000,100)

New
```
cuda:
forward [0.9736350309103727, 0.9922929517924786, 0.9694818360731006]
input requires_grad=True:
backward [0.5595634011551738, 0.558339926879853, 0.5546616851352155]
double backward [1.2445648494176567, 1.2245905152522027, 1.2349751549772918]
target requires_grad=True:
backward (new C++) [0.9489959231577814, 0.9553070571273565, 0.9556351029314101]
double backward (new C++) [1.8184774098917842, 1.8164670099504292, 1.845708406995982]

cpu:
forward (new C++) [7.892430987209082, 8.3068826389499, 7.985283812973648]
input requires_grad=True:
backward (new C++) [4.328460982069373, 4.45323242014274, 4.27946363389492]
double backward (new C++) [5.153504415880889, 4.629372010007501, 4.712803596165031]
target requires_grad=True:
backward (new C++) [3.4181493939831853, 3.3771288259886205, 3.7086612950079143]
double backward (new C++) [0.21922698011621833, 0.1858532396145165, 0.19477044604718685]
```

Old
```
cuda:
forward [3.101281268056482, 3.068499860819429, 3.0527669726870954]
input requires_grad=True:
backward [0.5650290949270129, 0.5730433077551425, 0.5588279226794839]
double backward [1.1287697306834161, 1.13834543293342, 1.1298578432761133]
target requires_grad=True:
backward [0.9470391101203859, 0.9560198178514838, 0.9750375030562282]
double backward [1.85760727385059, 1.7989214668050408, 1.788982989732176]

cpu:
forward (new C++) [12.474591840058565, 12.511441555805504, 12.666544185951352]
input requires_grad=True:
backward (new C++) [7.660991386976093, 7.449987292289734, 7.513917901087552]
double backward (new C++) [4.073225498665124, 4.264980792999268, 4.429787891916931]
target requires_grad=True:
backward (new C++) [3.448499082121998, 3.9072313378565013, 3.2433970272541046]
double backward (new C++) [2.126378359273076, 1.9045450473204255, 1.7932004742324352]
```
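
As a quick reading aid, the forward timings above can be turned into speedup ratios. The numbers are copied from the two listings; the unit is not stated in the source, so only the ratios are meaningful.

```python
from statistics import mean

# Forward timings copied from the "New" and "Old" listings.
new_cuda = [0.9736350309103727, 0.9922929517924786, 0.9694818360731006]
old_cuda = [3.101281268056482, 3.068499860819429, 3.0527669726870954]
new_cpu = [7.892430987209082, 8.3068826389499, 7.985283812973648]
old_cpu = [12.474591840058565, 12.511441555805504, 12.666544185951352]

cuda_speedup = mean(old_cuda) / mean(new_cuda)
cpu_speedup = mean(old_cpu) / mean(new_cpu)
print(f"cuda forward speedup: {cuda_speedup:.2f}x")
print(f"cpu forward speedup: {cpu_speedup:.2f}x")
```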
Pull Request resolved: pytorch#10336

Differential Revision: D9213636

Pulled By: li-roy

fbshipit-source-id: 27cc530f6276f58d35dc7a1d56dfc758a0fc4a7b
Summary:
Pull Request resolved: pytorch#10824

API additions:
- Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&)
- Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&)
- Tensor::operator=(Tensor&&) && (for completeness sake)
- TensorBase::unsafeGetTensorImpl()
- TensorBase::unsafeReleaseTensorImpl()
- TensorBase::getIntrusivePtr()
- TensorImpl::type_id()
- Tensor::set_data()
- Tensor::is_same(Tensor)
- Tensor::use_count()
- Tensor::type_id()
- Tensor::scalar_type()
- WeakTensor::is_same(WeakTensor)
- intrusive_ptr::weak_use_count()
- weak_intrusive_ptr::weak_use_count()
- c10::raw::intrusive_ptr::{incref,decref,make_weak}
- c10::raw::weak_intrusive_ptr::{incref,decref,lock}

API changes:
- Tensor::pImpl is no longer public (and now named tensor_impl_)
    - Most methods accessed this way are now accessible on Tensor
      maybe_zero_dim() and set_wrapped_number() being prominent exceptions
      (they are now accessed through unsafeGetTensorImpl())
- Type is no longer friend of Tensor
- TensorBase::reset(TensorImpl*) is deleted
- TensorBase::reset(TensorImpl*, bool should_retain) is deleted
- TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead
- TensorBase::get() is deleted; use unsafeGetTensorImpl() instead
- TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead
- TensorBase::retain() is deleted; use _raw_incref() instead
- TensorBase::release() is deleted; use _raw_decref() instead
- WeakTensor lost most of its methods (it no longer inherits from
  TensorBase)
- TensorImpl::storage() is now a const method
- Tensor(TensorBase) constructor removed, instead
  we go through getIntrusivePtr().  I'm not sure about
  this change; I happened to have accidentally removed the
  TensorBase constructor and decided to fix call sites,
  but I could go the other way.
- detail::set_data() is deleted; use Tensor::set_data() instead
- c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead.
  (The reason for this change is that it is invalid to cast an intrusive_ptr_target*
  to a raw_intrusive_ptr_target* to take advantage of the methods. But there is
  no reason the incref/decref methods shouldn't also work on intrusive_ptr_target;
  it is primarily an API consideration. We can be more standards compliant by
  keeping them as functions, which are universally applicable.)
- intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on
  pointers of the NullType. (This counts as a bug fix, because the documentation
  specified that pointers produced by release() are valid to reclaim(), and
  a release() on a null intrusive_ptr produces the NullType::singleton())

Bug fixes:
- Dispatch code for mutable references incorrectly returned
  a reference to a value argument (which would immediately
  go out of scope).  They now correctly return a tensor by
  value.
- intrusive_ptr copy/move assignment did not work correctly when
  an object was assigned to itself. We now check for this case and
  no-op if so. (This bug manifested itself as a Tensor mysteriously
  becoming an UndefinedTensor after lines of code like
  'x = x.mul_(y)')
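
The self-assignment problem can be modeled with a toy reference-counted pointer in Python. This is an assumption-laden illustration, not the c10 code: `Target`, `ToyPtr`, and the `guard_self_assign` flag are all hypothetical, and the real bug's mechanics may differ in detail.

```python
class Target:
    """Toy reference-counted object (stand-in for an intrusive_ptr target)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.alive = True


class ToyPtr:
    """Toy intrusive pointer; `guard_self_assign` toggles the check."""

    def __init__(self, target=None):
        self.target = target
        if target is not None:
            target.refcount += 1

    def _release(self):
        if self.target is not None:
            self.target.refcount -= 1
            if self.target.refcount == 0:
                self.target.alive = False  # "destroyed"
            self.target = None

    def assign(self, other, guard_self_assign):
        if guard_self_assign and other is self:
            return  # no-op on self-assignment
        self._release()             # drop the old target first
        self.target = other.target  # then take over the other's target
        if self.target is not None:
            self.target.refcount += 1


x = ToyPtr(Target("t"))
x.assign(x, guard_self_assign=False)
print(x.target)  # the pointer lost its target

y = ToyPtr(Target("t"))
y.assign(y, guard_self_assign=True)
print(y.target.name, y.target.refcount)
```

Without the check, releasing the old target also clears the source of the assignment, so the pointer ends up empty, which mirrors the "mysteriously undefined" symptom described above.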

Other changes:
- The checked cast functions in Utils.h have now been
  renamed and detemplatized into checked unwrap functions.
- Added type_id() and scalar_type() methods to Tensor
- pImpl is no longer public
- Documented what the && overloads are doing
- All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor')
  have been expunged. This is NO LONGER a valid way to create a new
  tensor, and if you do this, upon your first incref, you will catch an ASSERT
  failure saying that only tensors created by intrusive_ptr::release() are valid
  to reclaim(). Use c10::make_intrusive instead in this situation.
- IValue is adjusted to use intrusive_ptr instead of Retainable, and all
  other sub-classes of Retainable were modified to use intrusive_ptr.
  When doing this, I had to make the constructors of sub-classes like
  ConstantList public, so that c10::make_intrusive could invoke them.  Fortunately,
  if you incorrectly stack allocate a ConstantList, and then try to get an
  intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0.
- IValue very narrowly sidesteps the problem of handling NullType, as it
  considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor>
  which is not always true. This was always the case, but there's now a comment
  explaining what's going on.

Some MSVC bugs were uncovered during the preparation of this patch.
They are documented as comments in the code.

Reviewed By: gchanan

Differential Revision: D9481140

fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc
@rohithkrn rohithkrn requested a review from ezyang as a code owner August 28, 2018 02:24
@rohithkrn rohithkrn removed the request for review from ezyang August 28, 2018 02:25
@rohithkrn rohithkrn merged commit 3fd5e93 into ROCm:caffe2_specific Aug 29, 2018
lcskrishna pushed a commit to lcskrishna/pytorch that referenced this pull request May 15, 2023
When a tensor is resized, a reference array to its sizes may become invalid. Make a copy in advance.

<details>
<summary>ASAN report</summary>

```
=================================================================
==1115867==ERROR: AddressSanitizer: heap-use-after-free on address 0x61000013d790 at pc 0x03ff8e7da360 bp 0x03fff53c83a0 sp 0x03fff53c8390
READ of size 8 at 0x61000013d790 thread T0
    #0 0x3ff8e7da35f in c10::SymInt::is_heap_allocated() const /home/user/pytorch/c10/core/SymInt.h:154
    ROCm#1 0x3ff8e7da35f in c10::SymInt::maybe_as_int() const /home/user/pytorch/c10/core/SymInt.h:215
    ROCm#2 0x3ff8e7d0a6d in c10::SymInt::sym_eq(c10::SymInt const&) const /home/user/pytorch/c10/core/SymInt.cpp:69
    ROCm#3 0x3ff7a9ab0bd in c10::SymInt::operator==(c10::SymInt const&) const /home/user/pytorch/c10/core/SymInt.h:177
    ROCm#4 0x3ff7a9aaedd in bool std::__equal<false>::equal<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-
v11/bits/stl_algobase.h:1162
    ROCm#5 0x3ff7a9aae4b in bool std::__equal_aux1<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/
stl_algobase.h:1211
    ROCm#6 0x3ff7a9aae05 in bool std::__equal_aux<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/s
tl_algobase.h:1219
    ROCm#7 0x3ff7a9aad97 in bool std::equal<c10::SymInt const*, c10::SymInt const*>(c10::SymInt const*, c10::SymInt const*, c10::SymInt const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_alg
obase.h:1556
    ROCm#8 0x3ff4b23c771 in c10::ArrayRef<c10::SymInt>::equals(c10::ArrayRef<c10::SymInt>) const /home/user/pytorch/c10/util/ArrayRef.h:188
    ROCm#9 0x3ff4cb91bc1 in bool c10::operator!=<c10::SymInt>(c10::ArrayRef<c10::SymInt>, c10::ArrayRef<c10::SymInt>) /home/user/pytorch/c10/util/ArrayRef.h:341
    ROCm#10 0x3ff6d1b57ff in torch::ADInplaceOrView::resize_(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/torch/csrc/autograd/Variab
leTypeManual.cpp:408
    ROCm#11 0x3ff6d1e59c7 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c1
0::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>
> >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
    ROCm#12 0x3ff6d1e59c7 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10:
:ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::Sy
mInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::Disp
atchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
    #13 0x3ff51ca5129 in at::Tensor const& c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(void*, c10::OperatorKernel*,
c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>&&, c10::optional<c10::MemoryFormat>&&) /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
    #14 0x3ff51ca6e8f in at::Tensor const& c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::D
ispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
    #15 0x3ff51ca6e8f in at::Tensor const& c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Ten
sor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)
const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
    #16 0x3ff5182006b in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c
10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
    #17 0x3ff5182006b in at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/Operators_4.cpp:2144
    #18 0x3ff6d1d5e07 in at::redispatch::resize__symint(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/RedispatchFunctions.h:2847
    #19 0x3ff6d1bbb67 in torch::autograd::VariableType::(anonymous namespace)::resize_(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pyto
rch/torch/csrc/autograd/VariableTypeManual.cpp:243
    #20 0x3ff6d1bd197 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c1
0::MemoryFormat>), &torch::autograd::VariableType::(anonymous namespace)::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10
::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFu
nctionIntoFunctor.h:13
    #21 0x3ff6d1bd197 in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10:
:ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::autograd::VariableType::(anonymous namespace)::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor
 const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c
10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor
.h:480
    #22 0x3ff51ca5129 in at::Tensor const& c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(void*, c10::OperatorKernel*,
c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>&&, c10::optional<c10::MemoryFormat>&&) /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
    #23 0x3ff5181ead1 in at::Tensor const& c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::D
ispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
    #24 0x3ff5181ead1 in at::Tensor const& c10::Dispatcher::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor co
nst& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/user/pytorch/at
en/src/ATen/core/dispatch/Dispatcher.h:639
    #25 0x3ff5181ead1 in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>,
c10::optional<c10::MemoryFormat>) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:487
    #26 0x3ff5181ead1 in at::_ops::resize_::call(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) aten/src/ATen/Operators_4.cpp:2137
    #27 0x3ff79b44fcf in at::Tensor::resize__symint(c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const aten/src/ATen/core/TensorBody.h:2452
    #28 0x3ff79a802db in torch::autograd::THPVariable_resize_(_object*, _object*, _object*)::$_0::operator()(at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const /home/us
er/pytorch/torch/csrc/autograd/generated/python_variable_methods.cpp:13417
    #29 0x3ff7999f1eb in torch::autograd::THPVariable_resize_(_object*, _object*, _object*) /home/user/pytorch/torch/csrc/autograd/generated/python_variable_methods.cpp:13419
    #30 0x3ffa2c9b009 in method_vectorcall_VARARGS_KEYWORDS Objects/descrobject.c:344
    #31 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #32 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #33 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #34 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #35 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #36 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #37 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #38 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #39 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #40 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #41 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #42 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #43 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #44 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #45 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #46 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #47 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #48 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #49 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #50 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #51 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #52 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #53 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #54 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #55 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #56 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #57 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #58 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #59 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #60 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #61 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #62 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #63 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #64 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #65 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #66 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #67 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #68 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #69 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #70 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #71 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #72 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #73 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #74 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #75 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #76 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #77 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #78 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #79 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #80 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #81 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #82 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #83 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #84 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #85 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #86 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #87 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #88 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #89 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #90 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #91 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #92 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #93 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #94 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #95 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #96 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #97 0x3ffa2c8ab9b in PyVectorcall_Call Objects/call.c:267
    #98 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #99 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #100 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #101 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #102 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #103 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #104 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #105 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #106 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #107 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #108 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #109 0x3ffa2df0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #110 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #111 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #112 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #113 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #114 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #115 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #116 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #117 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #118 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #119 0x3ffa2dff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #120 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #121 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #122 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #123 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #124 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #125 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #126 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #127 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #128 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #129 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #130 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #131 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #132 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #133 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #134 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #135 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #136 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #137 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #138 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #139 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #140 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #141 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #142 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #143 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #144 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #145 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #146 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #147 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #148 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #149 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #150 0x3ffa2c8ad17 in _PyObject_Call Objects/call.c:305
    #151 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #152 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #153 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #154 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #155 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #156 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #157 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #158 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #159 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #160 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #161 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #162 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #163 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #164 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #165 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #166 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #167 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #168 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #169 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #170 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #171 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #172 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #173 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #174 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #175 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #176 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #177 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #178 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #179 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #180 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #181 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #182 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #183 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #184 0x3ffa2dff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #185 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #186 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #187 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #188 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #189 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #190 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #191 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #192 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #193 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #194 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #195 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #196 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #197 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #198 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #199 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #200 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #201 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #202 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #203 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #204 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #205 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #206 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #207 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #208 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #209 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #210 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #211 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #212 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #213 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #214 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #215 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #216 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #217 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #218 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #219 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #220 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #221 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #222 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #223 0x3ffa2df0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #224 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #225 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #226 0x3ffa2dffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #227 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #228 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #229 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #230 0x3ffa2c8ab15 in PyVectorcall_Call Objects/call.c:255
    #231 0x3ffa2c8ac65 in _PyObject_Call Objects/call.c:290
    #232 0x3ffa2c8ada9 in PyObject_Call Objects/call.c:317
    #233 0x3ffa2e059c7 in do_call_core Python/ceval.c:5943
    #234 0x3ffa2dffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #235 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #236 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #237 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #238 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #239 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #240 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #241 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #242 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #243 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #244 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #245 0x3ffa2c8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #246 0x3ffa2c8eddd in method_vectorcall Objects/classobject.c:53
    #247 0x3ffa2df00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #248 0x3ffa2df013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #249 0x3ffa2e05447 in call_function Python/ceval.c:5891
    #250 0x3ffa2dff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #251 0x3ffa2df052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #252 0x3ffa2e02b67 in _PyEval_Vector Python/ceval.c:5065
    #253 0x3ffa2c8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #254 0x3ffa2c8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #255 0x3ffa2c8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #256 0x3ffa2d3f307 in slot_tp_call Objects/typeobject.c:7494
    #257 0x3ffa2c8a933 in _PyObject_MakeTpCall Objects/call.c:215

0x61000013d790 is located 80 bytes inside of 192-byte region [0x61000013d740,0x61000013d800)
freed by thread T0 here:
    #0 0x3ffa3237de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
    #1 0x3ff8e7e3221 in c10::TensorImpl::~TensorImpl() /home/user/pytorch/c10/core/TensorImpl.cpp:75

previously allocated by thread T0 here:
    #0 0x3ffa323734f in operator new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x3ff4aeeb3d1 in c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_null_type<c10::TensorImpl> > c10::intrusive_ptr<c10::TensorImpl, c10::detail::intrusive_target_default_nul
l_type<c10::TensorImpl> >::make<c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKeySet&, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::S
torageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKeySet&, caffe2::TypeMeta&) /home/user/pytorch/c10/util/intrusive_ptr.h:498
    #2 0x3ff76f79e17  (/home/user/pytorch/build/lib.linux-s390x-cpython-310/torch/lib/libtorch_cpu.so+0x2fb79e17)

SUMMARY: AddressSanitizer: heap-use-after-free /home/user/pytorch/c10/core/SymInt.h:154 in c10::SymInt::is_heap_allocated() const
Shadow bytes around the buggy address:
  0x100c2000027aa0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x100c2000027ab0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027ac0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x100c2000027ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x100c2000027af0: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c2000027b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x100c2000027b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100c2000027b20: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x100c2000027b30: 00 00 00 00 04 fa fa fa fa fa fa fa fa fa fa fa
  0x100c2000027b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1115867==ABORTING
```
</details>
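
The addresses in the report can be cross-checked against the shadow dump. The following is a minimal sketch, assuming the usual ASan mapping `shadow = (addr >> 3) + offset`; the offset is inferred from the dump itself rather than read from the ASan sources, and all names (`FAULT_ADDR`, `shadow_address`, and so on) are hypothetical.

```python
# Values copied from the ASan report above.
FAULT_ADDR = 0x61000013D790      # address of the faulting access
REGION_START = 0x61000013D740    # start of the freed heap region
REGION_END = 0x61000013D800      # end of the region (exclusive)
SHADOW_ROW = 0x100C2000027AF0    # shadow row marked with "=>"
MARKED_INDEX = 2                 # position of "[fd]" within that row

SHADOW_SCALE = 3  # one shadow byte covers 2**3 = 8 application bytes

# Assumption: the offset is derived from this dump, not from ASan itself.
SHADOW_OFFSET = (SHADOW_ROW + MARKED_INDEX) - (FAULT_ADDR >> SHADOW_SCALE)


def shadow_address(addr: int) -> int:
    """Map an application address to its shadow byte address."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


offset_in_region = FAULT_ADDR - REGION_START
region_size = REGION_END - REGION_START
shadow_bytes = shadow_address(REGION_END) - shadow_address(REGION_START)

print(f"access is {offset_in_region} bytes inside a {region_size}-byte region")
print(f"inferred shadow offset: {SHADOW_OFFSET:#x}")
print(f"region shadow: [{shadow_address(REGION_START):#x}, "
      f"{shadow_address(REGION_END):#x}), {shadow_bytes} shadow bytes")
```

Under that assumption the region maps to 24 shadow bytes, which lines up with the run of `fd` (freed heap region) bytes surrounding the marked byte in the dump.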

<details>
<summary>Additional backtraces (not full)</summary>

Memory deallocation:
```
#0  operator delete (ptr=0x61000013d740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
#1  0x000003ffa77e3222 in c10::TensorImpl::~TensorImpl (this=0x61000013d740) at /home/user/pytorch/c10/core/TensorImpl.cpp:75
#2  0x000003ff63e76e8c in c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_ (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:291
#3  0x000003ff63e76910 in c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr (this=0x3ffd7ec8230) at /home/user/pytorch/c10/util/intrusive_ptr.h:370
#4  0x000003ff63e67240 in at::TensorBase::~TensorBase (this=0x3ffd7ec8230) at /home/user/pytorch/aten/src/ATen/core/TensorBase.h:80
#5  0x000003ff63e85ee0 in at::Tensor::~Tensor (this=0x3ffd7ec8230) at aten/src/ATen/core/TensorBody.h:90
#6  0x000003ff63f67304 in resize__functionalization (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:173
#7  0x000003ff63f89258 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) (
    this=0x6030000390a0, args=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#8  c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>) (functor=0x6030000390a0, dispatchKeySet=..., args=..., args=...,
    args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
#9  0x000003ff6aca560a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > (
    unboxed_kernel_func=0x3ff63f88a80 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>), &(resize__functionalization(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>))>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<long>, c10::optional<c10::MemoryFormat>)>, functor=0x6030000390a0,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#10 0x000003ff6aca715c in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1b28, opHandle=...,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:96
#11 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff919400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#12 0x000003ff6a82006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff919a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=...,
    args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#13 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144
#14 0x000003ff861d5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) at aten/src/ATen/RedispatchFunctions.h:2847
#15 0x000003ff861b579e in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:401
```

Memory access:
```
#0  c10::SymInt::maybe_as_int (this=0x61000013d790) at /home/user/pytorch/c10/core/SymInt.h:215
#1  0x000003ff734d0a6e in c10::SymInt::sym_eq (this=0x61000013d790, sci=...) at /home/user/pytorch/c10/core/SymInt.cpp:69
#2  0x000003ff5f6ab0be in c10::SymInt::operator== (this=0x61000013d790, o=...) at /home/user/pytorch/c10/core/SymInt.h:177
#3  0x000003ff5f6aaede in std::__equal<false>::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1162
#4  0x000003ff5f6aae4c in std::__equal_aux1<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1211
#5  0x000003ff5f6aae06 in std::__equal_aux<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1219
#6  0x000003ff5f6aad98 in std::equal<c10::SymInt const*, c10::SymInt const*> (__first1=0x61000013d790, __last1=0x61000013d7a0, __first2=0x602000015c30)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_algobase.h:1556
#7  0x000003ff2ff3c772 in c10::ArrayRef<c10::SymInt>::equals (this=0x3ffed7c9900, RHS=...) at /home/user/pytorch/c10/util/ArrayRef.h:188
#8  0x000003ff31891bc2 in c10::operator!=<c10::SymInt> (a1=..., a2=...) at /home/user/pytorch/c10/util/ArrayRef.h:341
#9  0x000003ff51eb5800 in torch::ADInplaceOrView::resize_ (ks=..., self=..., size=..., optional_memory_format=...) at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:408
#10 0x000003ff51ee59c8 in c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c
10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>
 > >::operator()(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (this=0x6030007dca40, args=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
#11 c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt
>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<
c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) (functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:480
#12 0x000003ff369a512a in c10::callUnboxedKernelFunction<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (
    unboxed_kernel_func=0x3ff51ee51f0 <c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor const& (c10::DispatchKeySet, at::Tenso
r const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>), &torch::ADInplaceOrView::resize_>, at::Tensor const&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::Ar
rayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > >, at::Tensor const& (c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKern
el*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>, functor=0x6030007dca40, dispatchKeySet=..., args=..., args=..., args=...)
    at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
#13 0x000003ff369a6e90 in c10::KernelFunction::call<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> > (this=0x6210005e1bc8, opHandle=...,
    dispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:90
#14 c10::Dispatcher::redispatch<at::Tensor const&, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat> >(c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::Arr
ayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)> const&, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff5d6400e0 <c10::Dispatcher::realSingleton()::_singleton>, op=..., currentDispatchKeySet=..., args=..., args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:656
#15 0x000003ff3652006c in c10::TypedOperatorHandle<at::Tensor const& (at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)>::redispatch(c10::DispatchKeySet, at::Tensor const&,
c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>) const (
    this=0x3ff5d6a07e0 <at::_ops::resize_::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<c10::SymInt>, c10::optional<c10::MemoryFormat>)::op>, currentDispatchKeySet=..., args=...,
    args=..., args=...) at /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:492
#16 at::_ops::resize_::redispatch (dispatchKeySet=..., self=..., size=..., memory_format=...) at /home/user/pytorch/build/aten/src/ATen/Operators_4.cpp:2144
#17 0x000003ff51ed5e08 in at::redispatch::resize__symint (dispatchKeySet=..., self=..., size=..., memory_format=...) at aten/src/ATen/RedispatchFunctions.h:2847
#18 0x000003ff51ebbb68 in torch::autograd::VariableType::(anonymous namespace)::resize_ (ks=..., self=..., size=..., optional_memory_format=...)
    at /home/user/pytorch/torch/csrc/autograd/VariableTypeManual.cpp:243
```
</details>
Pull Request resolved: pytorch#101064
Approved by: https://github.com/Skylion007, https://github.com/albanD
alugorey pushed a commit to alugorey/pytorch that referenced this pull request May 17, 2023
`arguments()` returns a reference to the vector member of the object returned by the `schema()` call.
When the object returned by `schema()` is destroyed, the vector is deallocated as well;
its lifetime is not extended.

This issue was detected while running `pytest -v test/mobile/test_lite_script_type.py -k test_nest_typing_namedtuple_custom_classtype` with ASAN.
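
The lifetime problem described above can be illustrated with a minimal sketch. The `Argument`, `Schema`, `makeSchema` and `argumentNames` names below are hypothetical stand-ins, not the actual PyTorch types or the code changed by this commit. The sketch assumes only what the description states: an accessor hands out a member vector of an object that is itself returned by value.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an argument description.
struct Argument {
  std::string name;
};

// Hypothetical stand-in for a schema object that owns its arguments.
class Schema {
 public:
  explicit Schema(std::vector<Argument> args) : args_(std::move(args)) {}

  // Returns a reference to a member, so the result is valid only while
  // this Schema object is alive.
  const std::vector<Argument>& arguments() const { return args_; }

 private:
  std::vector<Argument> args_;
};

// Returns the schema by value, so every call produces a temporary.
Schema makeSchema() {
  std::vector<Argument> args{Argument{"self"}, Argument{"size"}};
  return Schema(std::move(args));
}

std::vector<std::string> argumentNames() {
  // Hazard (deliberately not executed here):
  //   const auto& args = makeSchema().arguments();
  // The temporary Schema is destroyed at the end of that statement.
  // Binding the returned reference does not extend the temporary's
  // lifetime, so any later use of `args` reads freed memory.

  // Safer: keep the owning object alive for as long as the vector is used.
  const Schema schema = makeSchema();
  const auto& args = schema.arguments();

  std::vector<std::string> names;
  for (const auto& arg : args) {
    names.push_back(arg.name);
  }
  return names;
}
```

Holding the owner in a named local is one way to avoid the dangling reference; copying the vector out of the temporary would be another, at the cost of an extra allocation.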

<details>
<summary>ASAN output</summary>

```
==1134126==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0005a5790 at pc 0x03ff844488d8 bp 0x03fff584afe8 sp 0x03fff584afd8
READ of size 8 at 0x60d0005a5790 thread T0
    #0 0x3ff844488d7 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&) /usr/lib/gcc/s390x-i
bm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028
    #1 0x3ff8444293f in std::vector<c10::Argument, std::allocator<c10::Argument> >::begin() const /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_vector.h:821
    #2 0x3ff84d807d1 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:617
    #3 0x3ff84d80305 in torch::jit::toPyObject(c10::IValue) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
    #4 0x3ff84856871 in pybind11::detail::type_caster<c10::IValue, void>::cast(c10::IValue, pybind11::return_value_policy, pybind11::handle) /home/user/pytorch/torch/csrc/jit/python/pybind.h:138
    #5 0x3ff85318191 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is
_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me
thod const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const /home/user/pytorch/cmake/../third_party/pybin
d11/include/pybind11/pybind11.h:249
    #6 0x3ff85317cfd in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is
_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_me
thod const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) /home/user/pytorch/cmake/../third_party/pybind11/incl
ude/pybind11/pybind11.h:224
    #7 0x3ff82ee52e9 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929
    #8 0x3ffab002903 in cfunction_call Objects/methodobject.c:543
    #9 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #10 0x3ffaaf8e919 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #11 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #12 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #13 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #14 0x3ffab105447 in call_function Python/ceval.c:5891
    #15 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #16 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #17 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #18 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #19 0x3ffaaf8a615 in _PyObject_FastCallDictTstate Objects/call.c:142
    #20 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #21 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #22 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #23 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #24 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #25 0x3ffab105447 in call_function Python/ceval.c:5891
    #26 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #27 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #28 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #29 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #30 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #31 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #32 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #33 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #34 0x3ffab105447 in call_function Python/ceval.c:5891
    #35 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #36 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #37 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #38 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #39 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #40 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #41 0x3ffab105447 in call_function Python/ceval.c:5891
    #42 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #43 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #44 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #45 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #46 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #47 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #48 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #49 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #50 0x3ffab105447 in call_function Python/ceval.c:5891
    #51 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #52 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #53 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #54 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #55 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #56 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #57 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #58 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #59 0x3ffab105447 in call_function Python/ceval.c:5891
    #60 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #61 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #62 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #63 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #64 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #65 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #66 0x3ffaaf8ab9b in PyVectorcall_Call Objects/call.c:267
    #67 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #68 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #69 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #70 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #71 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #72 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #73 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #74 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #75 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #76 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #77 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #78 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #79 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #80 0x3ffab105447 in call_function Python/ceval.c:5891
    #81 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #82 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #83 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #84 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #85 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #86 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #87 0x3ffab105447 in call_function Python/ceval.c:5891
    #88 0x3ffab0ff7d7 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    #89 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #90 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #91 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #92 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #93 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #94 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #95 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #96 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #97 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #98 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #99 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #100 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #101 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #102 0x3ffab105447 in call_function Python/ceval.c:5891
    #103 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #104 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #105 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #106 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #107 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #108 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #109 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #110 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #111 0x3ffab105447 in call_function Python/ceval.c:5891
    #112 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #113 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #114 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #115 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #116 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #117 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #118 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #119 0x3ffaaf8ad17 in _PyObject_Call Objects/call.c:305
    #120 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #121 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #122 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #123 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #124 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #125 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #126 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #127 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #128 0x3ffab105447 in call_function Python/ceval.c:5891
    #129 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #130 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #131 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #132 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #133 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #134 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #135 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #136 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #137 0x3ffab105447 in call_function Python/ceval.c:5891
    #138 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #139 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #140 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #141 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #142 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #143 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #144 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #145 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #146 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #147 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #148 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #149 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #150 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #151 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #152 0x3ffab105447 in call_function Python/ceval.c:5891
    #153 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #154 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #155 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #156 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #157 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #158 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #159 0x3ffab105447 in call_function Python/ceval.c:5891
    #160 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #161 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #162 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #163 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #164 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #165 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #166 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #167 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #168 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #169 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #170 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #171 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #172 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #173 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #174 0x3ffab105447 in call_function Python/ceval.c:5891
    #175 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #176 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #177 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #178 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #179 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #180 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #181 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #182 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #183 0x3ffab105447 in call_function Python/ceval.c:5891
    #184 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #185 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #186 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #187 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #188 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #189 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #190 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #191 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #192 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #193 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #194 0x3ffab105447 in call_function Python/ceval.c:5891
    #195 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #196 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #197 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #198 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #199 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #200 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290
    #201 0x3ffaaf8ada9 in PyObject_Call Objects/call.c:317
    #202 0x3ffab1059c7 in do_call_core Python/ceval.c:5943
    #203 0x3ffab0ffd39 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #204 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #205 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #206 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #207 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #208 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #209 0x3ffab105447 in call_function Python/ceval.c:5891
    #210 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #211 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #212 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #213 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #214 0x3ffaaf8e941 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #215 0x3ffaaf8eddd in method_vectorcall Objects/classobject.c:53
    #216 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #217 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #218 0x3ffab105447 in call_function Python/ceval.c:5891
    #219 0x3ffab0ff779 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    #220 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #221 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #222 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #223 0x3ffaaf8a695 in _PyObject_FastCallDictTstate Objects/call.c:153
    #224 0x3ffaaf8b271 in _PyObject_Call_Prepend Objects/call.c:431
    #225 0x3ffab03f307 in slot_tp_call Objects/typeobject.c:7494
    #226 0x3ffaaf8a933 in _PyObject_MakeTpCall Objects/call.c:215
    #227 0x3ffab0f0081 in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    #228 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #229 0x3ffab105447 in call_function Python/ceval.c:5891
    #230 0x3ffab0ffa57 in _PyEval_EvalFrameDefault Python/ceval.c:4231
    #231 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #232 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #233 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #234 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #235 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #236 0x3ffab105447 in call_function Python/ceval.c:5891
    #237 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #238 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #239 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #240 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #241 0x3ffab0f00a9 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #242 0x3ffab0f013d in PyObject_Vectorcall Include/cpython/abstract.h:123
    #243 0x3ffab105447 in call_function Python/ceval.c:5891
    #244 0x3ffab0ff905 in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #245 0x3ffab0f052b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #246 0x3ffab102b67 in _PyEval_Vector Python/ceval.c:5065
    #247 0x3ffaaf8aec1 in _PyFunction_Vectorcall Objects/call.c:342
    #248 0x3ffaaf8ab15 in PyVectorcall_Call Objects/call.c:255
    #249 0x3ffaaf8ac65 in _PyObject_Call Objects/call.c:290

0x60d0005a5790 is located 80 bytes inside of 136-byte region [0x60d0005a5740,0x60d0005a57c8)
freed by thread T0 here:
    #0 0x3ffab537de5 in operator delete(void*) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
    #1 0x3ff55984fdb in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate(std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>*, unsigned long) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145

previously allocated by thread T0 here:
    #0 0x3ffab53734f in operator new(unsigned long) /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x3ff5598443f in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127
    #2 0x3fff5849ecf  ([stack]+0xb2ecf)

SUMMARY: AddressSanitizer: heap-use-after-free /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/stl_iterator.h:1028 in __gnu_cxx::__normal_iterator<c10::Argument const*, std::vector<c10::Argument, std::allocator<c10::Argument> > >::__normal_iterator(c10::Argument const* const&)
Shadow bytes around the buggy address:
  0x100c1a000b4aa0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x100c1a000b4ab0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
  0x100c1a000b4ac0: fd fd fd fd fd fa fa fa fa fa fa fa fa fa fd fd
  0x100c1a000b4ad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
  0x100c1a000b4ae0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x100c1a000b4af0: fd fd[fd]fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x100c1a000b4b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100c1a000b4b40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1134126==ABORTING
```

Additional backtraces (not full):
Allocation:
```
#0  __memset_z196 () at ../sysdeps/s390/memset-z900.S:144
#1  0x000003ff96f3072a in __asan::Allocator::Allocate (this=this@entry=0x3ff97041eb8 <__asan::instance>, size=size@entry=136, alignment=8, alignment@entry=0, stack=<optimized out>,
    stack@entry=0x3ffdbb45d78, alloc_type=<optimized out>, can_fill=true) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:599
#2  0x000003ff96f2c088 in __asan::asan_memalign (alignment=alignment@entry=0, size=size@entry=136, stack=stack@entry=0x3ffdbb45d78, alloc_type=alloc_type@entry=__asan::FROM_NEW)
    at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_allocator.cpp:1039
#3  0x000003ff96fb73b0 in operator new (size=136) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:99
#4  0x000003ff41404440 in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::allocate (this=0x3ffdbb468c0,
    __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:127
#5  0x000003ff414042a0 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::allocate (__a=...,
    __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:464
#6  0x000003ff41403b66 in std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > > (__a=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:98
#7  0x000003ff4140372a in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47888, __p=@0x3ffdbb47880: 0x0, __a=..., __args=..., __args=..., __args=..., __args=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:648
ROCm#8  0x000003ff41403328 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1342
ROCm#9  0x000003ff41402f06 in std::shared_ptr<c10::FunctionSchema>::shared_ptr<std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (
    this=0x3ffdbb47880, __tag=..., __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:409
ROCm#10 0x000003ff41402b6e in std::allocate_shared<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__a=...,
    __args=..., __args=..., __args=..., __args=...) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:862
ROCm#11 0x000003ff4140215c in std::make_shared<c10::FunctionSchema, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<c10::Argument, std::allocator<c10::Argument> >, std::vector<c10::Argument, std::allocator<c10::Argument> > > (__args=..., __args=..., __args=..., __args=...)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:878
#12 0x000003ff413d180c in c10::TupleType::createWithSpec<c10::basic_string_view<char> > (qualName=..., field_names=std::vector of length 1, capacity 1 = {...},
    field_types=std::vector of length 1, capacity 1 = {...}, field_defaults=std::vector of length 0, capacity 0) at /home/user/pytorch/aten/src/ATen/core/type.cpp:769
#13 0x000003ff413b9ca6 in c10::TupleType::createNamed (qualName=..., field_names=std::vector of length 1, capacity 1 = {...}, field_types=std::vector of length 1, capacity 1 = {...})
    at /home/user/pytorch/aten/src/ATen/core/type.cpp:725
#14 0x000003ff4115fbac in c10::ivalue::TupleTypeFactory<c10::TupleType>::fallback (type=...) at /home/user/pytorch/aten/src/ATen/core/dynamic_type.cpp:383
#15 0x000003ff708217fe in c10::ivalue::Tuple::type<c10::TupleType> (this=0x6080004b8520) at /home/user/pytorch/aten/src/ATen/core/ivalue_inl.h:781
#16 0x000003ff70800740 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613
#17 0x000003ff70800306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
#18 0x000003ff702d6872 in pybind11::detail::type_caster<c10::IValue, void>::cast (src=...) at /home/user/pytorch/torch/csrc/jit/python/pybind.h:138
#19 0x000003ff70d98192 in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::operator()(pybind11::detail::function_call&) const (this=0x3ffdbb4ca20, call=...)
    at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:249
#20 0x000003ff70d97cfe in pybind11::cpp_function::initialize<torch::jit::initJitScriptBindings(_object*)::$_45, c10::IValue, torch::jit::mobile::Module&, pybind11::tuple const&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg>(torch::jit::initJitScriptBindings(_object*)::$_45&&, c10::IValue (*)(torch::jit::mobile::Module&, pybind11::tuple const&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&)::{lambda(pybind11::detail::function_call&)#1}::__invoke(pybind11::detail::function_call&) (call=...)
    at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:224
#21 0x000003ff6e9652ea in pybind11::cpp_function::dispatcher (self=<PyCapsule at remote 0x3ff83e27720>,
    args_in=(<torch._C.LiteScriptModule at remote 0x3ff811844b0>, (<Tensor at remote 0x3ff814efb00>,)), kwargs_in=0x0) at /home/user/pytorch/cmake/../third_party/pybind11/include/pybind11/pybind11.h:929
```

Deallocation:
```
#0  operator delete (ptr=0x60d0005a5740) at /var/tmp/portage/sys-devel/gcc-11.3.1_p20230303/work/gcc-11-20230303/libsanitizer/asan/asan_new_delete.cpp:160
#1  0x000003ff44904fdc in __gnu_cxx::new_allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> >::deallocate (this=0x3ffc5dc8020,
    __p=0x60d0005a5740, __t=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/ext/new_allocator.h:145
#2  0x000003ff44904fa8 in std::allocator_traits<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::deallocate (
    __a=..., __p=0x60d0005a5740, __n=1) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/alloc_traits.h:496
#3  0x000003ff449041f2 in std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2> > >::~__allocated_ptr (
    this=0x3ffc5dc8030) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/allocated_ptr.h:74
#4  0x000003ff44904888 in std::_Sp_counted_ptr_inplace<c10::FunctionSchema, std::allocator<c10::FunctionSchema>, (__gnu_cxx::_Lock_policy)2>::_M_destroy (this=0x60d0005a5740)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:538
#5  0x000003ff43895a62 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x60d0005a5740) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:184
#6  0x000003ff43895420 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x611000c40648) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705
#7  0x000003ff4466e7f4 in std::__shared_ptr<c10::FunctionSchema, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x611000c40640)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154
#8  0x000003ff4466d820 in std::shared_ptr<c10::FunctionSchema>::~shared_ptr (this=0x611000c40640) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122
#9  0x000003ff448d82f6 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142
#10 0x000003ff448d8346 in c10::TupleType::~TupleType (this=0x611000c40580) at /home/user/pytorch/aten/src/ATen/core/jit_type.h:1142
#11 0x000003ff731296a4 in std::_Sp_counted_ptr<c10::TupleType*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x603000c43ae0)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:348
#12 0x000003ff71eaf666 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x603000c43ae0) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:168
#13 0x000003ff71eaf330 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x3ffc5dc9368) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:705
#14 0x000003ff73129ee4 in std::__shared_ptr<c10::TupleType, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x3ffc5dc9360)
    at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr_base.h:1154
#15 0x000003ff73122390 in std::shared_ptr<c10::TupleType>::~shared_ptr (this=0x3ffc5dc9360) at /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/shared_ptr.h:122
#16 0x000003ff73d00788 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:613
#17 0x000003ff73d00306 in torch::jit::toPyObject (ivalue=...) at /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:604
```
</details>
Pull Request resolved: pytorch#101400
Approved by: https://github.com/zou3519
lcskrishna pushed a commit to lcskrishna/pytorch that referenced this pull request May 29, 2023
The 3 functions disabled here attempt out-of-bounds reads. Disable them until the sleef library is fixed.
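
For illustration only, one common way to take a vectorized math routine out of service is to route the operation through a per-lane scalar fallback. The sketch below assumes a hypothetical 4-lane `Vec4f` type and a hypothetical `map_scalar` helper; neither name comes from this commit, and this is not necessarily how the functions were disabled here.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane float vector; a stand-in for a real SIMD wrapper type.
struct Vec4f {
  std::array<float, 4> lanes;

  // Hypothetical helper: apply a scalar function to every lane.
  // No SIMD library routine (and no table gather) is involved.
  template <typename F>
  Vec4f map_scalar(F f) const {
    Vec4f out{};
    for (std::size_t i = 0; i < lanes.size(); ++i) {
      out.lanes[i] = f(lanes[i]);
    }
    return out;
  }

  // cos() computed lane by lane with the standard library,
  // instead of calling a vectorized implementation.
  Vec4f cos() const {
    return map_scalar([](float x) { return std::cos(x); });
  }
};
```

The trade-off in such a fallback is speed for safety: each lane goes through `std::cos`, so the vectorized code path is never reached.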

<details>
<summary>ASAN report</summary>

```
=================================================================
==2030580==ERROR: AddressSanitizer: global-buffer-overflow on address 0x03ff70f54570 at pc 0x03ff6704e960 bp 0x03ffce128940 sp 0x03ffce128930
READ of size 4 at 0x03ff70f54570 thread T0
    #0 0x3ff6704e95f in vgather_vf_p_vi2 /home/user/pytorch/third_party/sleef/src/arch/helpers390x_128.h:129
    #1 0x3ff6704e95f in rempif /home/user/pytorch/third_party/sleef/src/libm/sleefsimdsp.c:550
    #2 0x3ff6704e95f in Sleef_cosf4_u10vxe2 /home/user/pytorch/third_party/sleef/src/libm/sleefsimdsp.c:1021
    #3 0x3ff67029cfb in Sleef_cosf4_u10 /home/user/pytorch/build/sleef/src/libm/disps390x_128.c:182
    #4 0x3ff55d21941 in at::vec::ZVECTOR::Vectorized<float, void> at::vec::ZVECTOR::Vectorized<float, void>::mapSleef<float __vector(4) const (*)(float __vector(4)), double __vector(2) const (*)(double __vector(2)), float, 0>(float __vector(4) const (*)(float __vector(4)), double __vector(2) const (*)(double __vector(2))) const /home/user/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h:991
    #5 0x3ff5689ad01 in at::vec::ZVECTOR::Vectorized<float, void>::cos() const /home/user/pytorch/aten/src/ATen/cpu/vec/vec256/zarch/vec256_zarch.h:1074
    #6 0x3ff5685df97 in at::vml::ZVECTOR::vcos<float>(float*, float const*, long)::{lambda(at::vec::ZVECTOR::Vectorized<float, void>)#1}::operator()(at::vec::ZVECTOR::Vectorized<float, void>) const /home/user/pytorch/aten/src/ATen/cpu/vml.h:71
    #7 0x3ff5689b691 in void at::vec::map<float, at::vml::ZVECTOR::vcos<float>(float*, float const*, long)::{lambda(at::vec::ZVECTOR::Vectorized<float, void>)#1}, 0>(at::vml::ZVECTOR::vcos<float>(float*, float const*, long)::{lambda(at::vec::ZVECTOR::Vectorized<float, void>)#1} const&, float*, float const*, long) /home/user/pytorch/aten/src/ATen/cpu/vec/functional_base.h:239
    #8 0x3ff5685e0df in void at::vml::ZVECTOR::vcos<float>(float*, float const*, long) /home/user/pytorch/aten/src/ATen/cpu/vml.h:71
    #9 0x3ff563fdde3 in operator() /home/user/pytorch/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp:770
    #10 0x3ff5648e4a3 in operator() /home/user/pytorch/aten/src/ATen/TensorIterator.h:406
    #11 0x3ff5663cae1 in callback_fn<at::TensorIteratorBase::loop_2d_from_1d<at::native::ZVECTOR::cos_kernel(at::TensorIteratorBase&)::<lambda()>::<lambda()>::<lambda(char**, const int64_t*, int64_t)> >(const at::native::ZVECTOR::cos_kernel(at::TensorIteratorBase&)::<lambda()>::<lambda()>::<lambda(char**, const int64_t*, int64_t)>&)::<lambda(char**, const int64_t*, int64_t, int64_t)> > /home/user/pytorch/c10/util/FunctionRef.h:43
    #12 0x3ff4d45a933 in c10::function_ref<void (char**, long const*, long, long)>::operator()(char**, long const*, long, long) const /home/user/pytorch/c10/util/FunctionRef.h:64
    #13 0x3ff4d455133 in at::internal::serial_for_each(c10::ArrayRef<long>, c10::ArrayRef<long>, char**, unsigned long, c10::function_ref<void (char**, long const*, long, long)>, at::Range) /home/user/pytorch/aten/src/ATen/TensorIteratorInternal.h:52
    #14 0x3ff4d43b703 in at::TensorIteratorBase::serial_for_each(c10::function_ref<void (char**, long const*, long, long)>, at::Range) const /home/user/pytorch/aten/src/ATen/TensorIterator.cpp:777
    #15 0x3ff4d43ab59 in at::TensorIteratorBase::for_each(c10::function_ref<void (char**, long const*, long, long)>, long) /home/user/pytorch/aten/src/ATen/TensorIterator.cpp:749
    #16 0x3ff5648e851 in for_each<at::native::ZVECTOR::cos_kernel(at::TensorIteratorBase&)::<lambda()>::<lambda()>::<lambda(char**, const int64_t*, int64_t)> > /home/user/pytorch/aten/src/ATen/TensorIterator.h:421
    #17 0x3ff563fe5f9 in operator() /home/user/pytorch/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp:770
    #18 0x3ff56400915 in operator() /home/user/pytorch/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp:770
    #19 0x3ff56400f1d in at::native::ZVECTOR::cos_kernel(at::TensorIteratorBase&) /home/user/pytorch/aten/src/ATen/native/cpu/UnaryOpsKernel.cpp:770
    #20 0x3ff4f303007 in void at::native::DispatchStub<void (*)(at::TensorIteratorBase&), at::native::cos_stub>::operator()<at::native::structured_cos_out&>(c10::DeviceType, at::native::structured_cos_out&) /home/user/pytorch/aten/src/ATen/native/DispatchStub.h:158
    #21 0x3ff4f2edb3f in at::native::structured_cos_out::impl(at::Tensor const&, at::Tensor const&) /home/user/pytorch/aten/src/ATen/native/UnaryOps.cpp:330
    #22 0x3ff526ef739 in wrapper_CPU_cos /home/user/pytorch/build/aten/src/ATen/RegisterCPU.cpp:4307
    #23 0x3ff52c651d9 in operator() /home/user/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13
    #24 0x3ff52c651d9 in call /home/user/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:463
    #25 0x3ff5076df2f in at::Tensor c10::callUnboxedKernelFunction<at::Tensor, at::Tensor const&>(void*, c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&) /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:50
    #26 0x3ff5009a93f in at::Tensor c10::KernelFunction::call<at::Tensor, at::Tensor const&>(c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:103
    #27 0x3ff5009a93f in at::Tensor c10::Dispatcher::call<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, at::Tensor const&) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:639
    #28 0x3ff5009a93f in c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)>::call(at::Tensor const&) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:487
    #29 0x3ff5009a93f in at::_ops::cos::call(at::Tensor const&) /home/user/pytorch/build/aten/src/ATen/Operators_0.cpp:2215
    #30 0x3ff7d813741 in at::Tensor::cos() const /home/user/pytorch/build/aten/src/ATen/core/TensorBody.h:2107
    #31 0x3ff7dc0f2b7 in operator() /home/user/pytorch/torch/csrc/autograd/generated/python_torch_functions_2.cpp:2953
    #32 0x3ff7dc0faf7 in THPVariable_cos /home/user/pytorch/torch/csrc/autograd/generated/python_torch_functions_2.cpp:2955
    #33 0x3ffa5ef5ae1 in cfunction_call Objects/methodobject.c:543
    #34 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    #35 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #36 0x3ffa5feb50d in do_call_core Python/ceval.c:5915
    #37 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #38 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #39 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #40 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #41 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    #42 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    #43 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #44 0x3ff7f87a393 in torch::impl::dispatch::PythonKernelHolder::operator()(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /home/user/pytorch/torch/csrc/utils/python_dispatch.cpp:175
    #45 0x3ff7f8871a7 in c10::BoxedKernel::makeFromFunctor<torch::impl::dispatch::PythonKernelHolder>(std::unique_ptr<torch::impl::dispatch::PythonKernelHolder, std::default_delete<torch::impl::dispatch::PythonKernelHolder> >)::{lambda(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::operator()(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:87
    #46 0x3ff7f887261 in c10::BoxedKernel::makeFromFunctor<torch::impl::dispatch::PythonKernelHolder>(std::unique_ptr<torch::impl::dispatch::PythonKernelHolder, std::default_delete<torch::impl::dispatch::PythonKernelHolder> >)::{lambda(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::_FUN(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:86
    #47 0x3ff7e0d10ab in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41
    #48 0x3ff7e0d1459 in c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:43
    #49 0x3ff7f876421 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:691
    #50 0x3ff4d22bcdd in c10::OperatorHandle::callBoxed(std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:417
    #51 0x3ff65a092d5 in c10::OperatorHandle::callBoxed(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:421
    #52 0x3ff65a05641 in operator() /home/user/pytorch/torch/csrc/jit/runtime/register_c10_ops.cpp:15
    #53 0x3ff65a08cb5 in __invoke_impl<void, torch::jit::(anonymous namespace)::createOperatorFromC10(const c10::OperatorHandle&)::<lambda(torch::jit::Stack&)>&, std::vector<c10::IValue, std::allocator<c10::IValue> >&> /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/invoke.h:61
    #54 0x3ff65a0897b in __invoke_r<void, torch::jit::(anonymous namespace)::createOperatorFromC10(const c10::OperatorHandle&)::<lambda(torch::jit::Stack&)>&, std::vector<c10::IValue, std::allocator<c10::IValue> >&> /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/invoke.h:111
    #55 0x3ff65a084e1 in _M_invoke /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/std_function.h:290
    #56 0x3ff7eb2cb21 in std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/std_function.h:590
    #57 0x3ff7eb1b659 in torch::jit::Operation::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /home/user/pytorch/aten/src/ATen/core/stack.h:41
    #58 0x3ff7eb08449 in torch::jit::invokeOperatorFromPython(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, pybind11::args, pybind11::kwargs const&, c10::optional<c10::DispatchKey>) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:764
    #59 0x3ff7eb09d85 in torch::jit::_get_operation_for_overload_or_packet(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, c10::Symbol, pybind11::args, pybind11::kwargs const&, bool, c10::optional<c10::DispatchKey>) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:829
    #60 0x3ff7e573eb9 in operator() /home/user/pytorch/torch/csrc/jit/python/init.cpp:1549
    #61 0x3ff7e6728dd in call_impl<pybind11::object, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&, 0, 1, pybind11::detail::void_type> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1439
    #62 0x3ff7e64312f in call<pybind11::object, pybind11::detail::void_type, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1408
    #63 0x3ff7e5da259 in operator() /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:249
    #64 0x3ff7e5da441 in _FUN /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:224
    #65 0x3ff7d317a1f in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:929
    #66 0x3ffa5ef5ae1 in cfunction_call Objects/methodobject.c:543
    #67 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    #68 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #69 0x3ffa5feb50d in do_call_core Python/ceval.c:5915
    #70 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #71 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #72 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #73 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #74 0x3ffa5e83d1f in _PyObject_FastCallDictTstate Objects/call.c:142
    #75 0x3ffa5e84937 in _PyObject_Call_Prepend Objects/call.c:431
    #76 0x3ffa5f2f577 in slot_tp_call Objects/typeobject.c:7494
    #77 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    #78 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #79 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    #80 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #81 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #82 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #83 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #84 0x3ffa5fd76a3 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #85 0x3ffa5fd772f in PyObject_Vectorcall Include/cpython/abstract.h:123
    #86 0x3ffa5feb289 in call_function Python/ceval.c:5891
    #87 0x3ffa5fe5c3b in _PyEval_EvalFrameDefault Python/ceval.c:4213
    #88 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #89 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #90 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #91 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    #92 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    #93 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #94 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    #95 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #96 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #97 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #98 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #99 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    #100 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    #101 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #102 0x3ff7f87a393 in torch::impl::dispatch::PythonKernelHolder::operator()(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /home/user/pytorch/torch/csrc/utils/python_dispatch.cpp:175
    #103 0x3ff7f8871a7 in c10::BoxedKernel::makeFromFunctor<torch::impl::dispatch::PythonKernelHolder>(std::unique_ptr<torch::impl::dispatch::PythonKernelHolder, std::default_delete<torch::impl::dispatch::PythonKernelHolder> >)::{lambda(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::operator()(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:87
    #104 0x3ff7f887261 in c10::BoxedKernel::makeFromFunctor<torch::impl::dispatch::PythonKernelHolder>(std::unique_ptr<torch::impl::dispatch::PythonKernelHolder, std::default_delete<torch::impl::dispatch::PythonKernelHolder> >)::{lambda(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::_FUN(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:86
    #105 0x3ff7e0d10ab in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41
    #106 0x3ff7e0d1459 in c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:43
    #107 0x3ff7f876421 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:691
    #108 0x3ff4d22bcdd in c10::OperatorHandle::callBoxed(std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:417
    #109 0x3ff65a092d5 in c10::OperatorHandle::callBoxed(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const /home/user/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:421
    #110 0x3ff65a05641 in operator() /home/user/pytorch/torch/csrc/jit/runtime/register_c10_ops.cpp:15
    #111 0x3ff65a08cb5 in __invoke_impl<void, torch::jit::(anonymous namespace)::createOperatorFromC10(const c10::OperatorHandle&)::<lambda(torch::jit::Stack&)>&, std::vector<c10::IValue, std::allocator<c10::IValue> >&> /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/invoke.h:61
    #112 0x3ff65a0897b in __invoke_r<void, torch::jit::(anonymous namespace)::createOperatorFromC10(const c10::OperatorHandle&)::<lambda(torch::jit::Stack&)>&, std::vector<c10::IValue, std::allocator<c10::IValue> >&> /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/invoke.h:111
    #113 0x3ff65a084e1 in _M_invoke /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/std_function.h:290
    #114 0x3ff7eb2cb21 in std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const /usr/lib/gcc/s390x-ibm-linux-gnu/11/include/g++-v11/bits/std_function.h:590
    #115 0x3ff7eb1b659 in torch::jit::Operation::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /home/user/pytorch/aten/src/ATen/core/stack.h:41
    #116 0x3ff7eb08449 in torch::jit::invokeOperatorFromPython(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, pybind11::args, pybind11::kwargs const&, c10::optional<c10::DispatchKey>) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:764
    #117 0x3ff7eb09d85 in torch::jit::_get_operation_for_overload_or_packet(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, c10::Symbol, pybind11::args, pybind11::kwargs const&, bool, c10::optional<c10::DispatchKey>) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:829
    #118 0x3ff7e573eb9 in operator() /home/user/pytorch/torch/csrc/jit/python/init.cpp:1549
    #119 0x3ff7e6728dd in call_impl<pybind11::object, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&, 0, 1, pybind11::detail::void_type> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1439
    #120 0x3ff7e64312f in call<pybind11::object, pybind11::detail::void_type, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1408
    #121 0x3ff7e5da259 in operator() /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:249
    #122 0x3ff7e5da441 in _FUN /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:224
    #123 0x3ff7d317a1f in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:929
    #124 0x3ffa5ef5ae1 in cfunction_call Objects/methodobject.c:543
    #125 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    #126 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #127 0x3ffa5feb50d in do_call_core Python/ceval.c:5915
    #128 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #129 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #130 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #131 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #132 0x3ffa5e83d1f in _PyObject_FastCallDictTstate Objects/call.c:142
    #133 0x3ffa5e84937 in _PyObject_Call_Prepend Objects/call.c:431
    #134 0x3ffa5f2f577 in slot_tp_call Objects/typeobject.c:7494
    #135 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    #136 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    #137 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    #138 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    #139 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #140 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    #141 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    #142 0x3ffa5e87d2b in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #143 0x3ffa5e882dd in method_vectorcall Objects/classobject.c:83
    #144 0x3ffa5e836d3 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #145 0x3ffa5e84b6f in _PyObject_CallFunctionVa Objects/call.c:485
    #146 0x3ffa5e84f2d in callmethod Objects/call.c:557
    #147 0x3ffa5e85039 in PyObject_CallMethod Objects/call.c:577
    ROCm#148 0x3ff7f7efa05 in torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName) /home/user/pytorch/torch/csrc/utils/python_arg_parser.cpp:338
    ROCm#149 0x3ff7eb09b67 in torch::jit::_get_operation_for_overload_or_packet(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, c10::Symbol, pybind11::args, pybind11::kwargs const&, bool, c10::optional<c10::DispatchKey>) /home/user/pytorch/torch/csrc/jit/python/pybind_utils.cpp:827
    ROCm#150 0x3ff7e573eb9 in operator() /home/user/pytorch/torch/csrc/jit/python/init.cpp:1549
    ROCm#151 0x3ff7e6728dd in call_impl<pybind11::object, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&, 0, 1, pybind11::detail::void_type> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1439
    ROCm#152 0x3ff7e64312f in call<pybind11::object, pybind11::detail::void_type, torch::jit::initJITBindings(PyObject*)::<lambda(const string&, const string&)>::<lambda(pybind11::args, pybind11::kwargs)>&> /home/user/pytorch/third_party/pybind11/include/pybind11/cast.h:1408
    ROCm#153 0x3ff7e5da259 in operator() /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:249
    ROCm#154 0x3ff7e5da441 in _FUN /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:224
    ROCm#155 0x3ff7d317a1f in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /home/user/pytorch/third_party/pybind11/include/pybind11/pybind11.h:929
    ROCm#156 0x3ffa5ef5ae1 in cfunction_call Objects/methodobject.c:543
    ROCm#157 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    ROCm#158 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#159 0x3ffa5feb50d in do_call_core Python/ceval.c:5915
    ROCm#160 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#161 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#162 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#163 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#164 0x3ffa5e83d1f in _PyObject_FastCallDictTstate Objects/call.c:142
    ROCm#165 0x3ffa5e84937 in _PyObject_Call_Prepend Objects/call.c:431
    ROCm#166 0x3ffa5f2f577 in slot_tp_call Objects/typeobject.c:7494
    ROCm#167 0x3ffa5e84027 in _PyObject_MakeTpCall Objects/call.c:215
    ROCm#168 0x3ffa5fd767b in _PyObject_VectorcallTstate Include/cpython/abstract.h:112
    ROCm#169 0x3ffa5fd772f in PyObject_Vectorcall Include/cpython/abstract.h:123
    ROCm#170 0x3ffa5feb289 in call_function Python/ceval.c:5891
    ROCm#171 0x3ffa5fe5ad1 in _PyEval_EvalFrameDefault Python/ceval.c:4181
    ROCm#172 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#173 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#174 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#175 0x3ffa5fd76a3 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    ROCm#176 0x3ffa5fd772f in PyObject_Vectorcall Include/cpython/abstract.h:123
    ROCm#177 0x3ffa5feb289 in call_function Python/ceval.c:5891
    ROCm#178 0x3ffa5fe5c3b in _PyEval_EvalFrameDefault Python/ceval.c:4213
    ROCm#179 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#180 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#181 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#182 0x3ffa5e8427f in PyVectorcall_Call Objects/call.c:267
    ROCm#183 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#184 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#185 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#186 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#187 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#188 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#189 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#190 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    ROCm#191 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#192 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#193 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#194 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#195 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#196 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#197 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#198 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    ROCm#199 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#200 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#201 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#202 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#203 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#204 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#205 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#206 0x3ffa5e841fb in PyVectorcall_Call Objects/call.c:255
    ROCm#207 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#208 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#209 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#210 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#211 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#212 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#213 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#214 0x3ffa5e83d1f in _PyObject_FastCallDictTstate Objects/call.c:142
    ROCm#215 0x3ffa5e84937 in _PyObject_Call_Prepend Objects/call.c:431
    ROCm#216 0x3ffa5f2f577 in slot_tp_call Objects/typeobject.c:7494
    ROCm#217 0x3ffa5e843f3 in _PyObject_Call Objects/call.c:305
    ROCm#218 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#219 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#220 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#221 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#222 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#223 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#224 0x3ffa5fd76a3 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    ROCm#225 0x3ffa5fd772f in PyObject_Vectorcall Include/cpython/abstract.h:123
    ROCm#226 0x3ffa5feb289 in call_function Python/ceval.c:5891
    ROCm#227 0x3ffa5fe5b21 in _PyEval_EvalFrameDefault Python/ceval.c:4198
    ROCm#228 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#229 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#230 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#231 0x3ffa5e8427f in PyVectorcall_Call Objects/call.c:267
    ROCm#232 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#233 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#234 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#235 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#236 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#237 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#238 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#239 0x3ffa5e8427f in PyVectorcall_Call Objects/call.c:267
    ROCm#240 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#241 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#242 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#243 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#244 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#245 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#246 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#247 0x3ffa5e8427f in PyVectorcall_Call Objects/call.c:267
    ROCm#248 0x3ffa5e84347 in _PyObject_Call Objects/call.c:290
    ROCm#249 0x3ffa5e84483 in PyObject_Call Objects/call.c:317
    ROCm#250 0x3ffa5feb7cf in do_call_core Python/ceval.c:5943
    ROCm#251 0x3ffa5fe6019 in _PyEval_EvalFrameDefault Python/ceval.c:4277
    ROCm#252 0x3ffa5fd7aed in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    ROCm#253 0x3ffa5fe8ba9 in _PyEval_Vector Python/ceval.c:5065
    ROCm#254 0x3ffa5e8459b in _PyFunction_Vectorcall Objects/call.c:342
    ROCm#255 0x3ffa5e8427f in PyVectorcall_Call Objects/call.c:267

0x03ff70f54570 is located 0 bytes to the right of global variable 'Sleef_rempitabsp' defined in '/home/user/pytorch/third_party/sleef/src/libm/rempitab.c:986:34' (0x3ff70f53f00) of size 1648
SUMMARY: AddressSanitizer: global-buffer-overflow /home/user/pytorch/third_party/sleef/src/arch/helpers390x_128.h:129 in vgather_vf_p_vi2
Shadow bytes around the buggy address:
  0x10007fee1ea850: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10007fee1ea8a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9
  0x10007fee1ea8b0: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea8e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fee1ea8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==2030580==ABORTING
```
</details>

The overflow reproduces when running `pytest -v test/test_ops.py -k test_python_ref__refs_cos_cpu_bfloat16` under AddressSanitizer on s390x.
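
The numbers in the report can be cross-checked with a little arithmetic. The sketch below uses only values copied from the trace above. It assumes that `Sleef_rempitabsp` holds 4-byte single-precision entries (suggested by the `sp` suffix, not stated in the report) and that ASan maps eight application bytes to one shadow byte, as the shadow legend says. The variable names are illustrative.

```python
# Values copied verbatim from the AddressSanitizer report.
fault_addr = 0x03FF70F54570   # address flagged as global-buffer-overflow
table_base = 0x3FF70F53F00    # start of Sleef_rempitabsp
table_size = 1648             # size of the global in bytes

# Distance of the faulting access from the start of the table.
byte_offset = fault_addr - table_base
bytes_past_end = byte_offset - table_size
print(byte_offset, bytes_past_end)

# Assumption: entries are 4-byte floats, so the table has this many elements
# and the faulting access targets the element index printed next to it.
ENTRY_SIZE = 4
num_entries = table_size // ENTRY_SIZE
fault_index = byte_offset // ENTRY_SIZE
print(num_entries, fault_index)

# The "=>" row of the shadow dump starts at this address, and the bracketed
# byte [f9] is preceded by 14 bytes in that row.
shadow_row = 0x10007FEE1EA8A0
marked_column = 14
shadow_addr = shadow_row + marked_column

# One shadow byte covers 8 application bytes, so the shadow address is
# (fault_addr >> 3) plus a fixed offset; recover that offset from the report.
shadow_offset = shadow_addr - (fault_addr >> 3)
print(hex(shadow_offset))
```

Under these assumptions the access starts exactly at the end of the table (byte offset 1648, zero bytes past the last valid byte), which matches the "0 bytes to the right" wording, and the bracketed shadow byte is the first global-redzone byte (`f9`) after the table.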

See also: shibatch/sleef#464

Pull Request resolved: pytorch#102266
Approved by: https://github.com/malfet