Possible use-after-free of Tensor in JIT generated code #112383
Labels
module: cpp-extensions
Related to torch.utils.cpp_extension
module: crash
Problem manifests as a hard crash, as opposed to a RuntimeError
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🐛 Describe the bug
I got occasional crashes in `test_cpp_extensions_jit`, which I could easily trigger with `python test_cpp_extensions_jit.py -k test_warning`. Digging deeper, I found the cause to be a potential use-after-free leading to heap corruption and a later crash in a malloc call (seen in GDB).
Using Valgrind I got the following trace:
Note how the use-after-free is detected although it doesn't lead to a crash here, likely because nothing else runs or allocates after it.
I reduced the test code to the following which still reproduces the bug:
The issue seems to be triggered by the warning being converted to an error in combination with the PyTorch error handling. That is, without any one of `TORCH_WARN`, `warnings.simplefilter("error")`, or `with_pytorch_error_handling=True`, the bug isn't triggered.
Versions
PyTorch version: 2.0.1
GCC version: (GCC) 12.2.0
Clang version: Could not collect
CMake version: version 3.24.3
Libc version: glibc-2.17
Python version: 3.10.8 (main, Jul 25 2023, 10:52:38) [GCC 12.2.0] (64-bit runtime)
Python platform: Linux-4.14.0-115.19.1.el7a.ppc64le-ppc64le-with-glibc2.17
CPU:
Architecture: ppc64le
Byte Order: Little Endian
cc @malfet @zou3519