Possible use-after-free of Tensor in JIT generated code #112383

Closed
Flamefire opened this issue Oct 30, 2023 · 1 comment
Labels
module: cpp-extensions Related to torch.utils.cpp_extension module: crash Problem manifests as a hard crash, as opposed to a RuntimeError triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

Flamefire commented Oct 30, 2023

🐛 Describe the bug

I got occasional crashes in test_cpp_extensions_jit, which I could easily trigger with python test_cpp_extensions_jit.py -k test_warning. Digging deeper, I found the cause to be a potential use-after-free that leads to heap corruption and a later crash in a malloc call (seen in GDB).

Using Valgrind I got the following trace:

==113540== Invalid read of size 8
==113540==    at 0xB2C51E25C: c10::detail::atomic_refcount_decrement(std::atomic<unsigned long>&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53F547: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C535CF7: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E6C7: at::TensorBase::~TensorBase() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E913: at::Tensor::~Tensor() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5346CF: torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}::operator()(at::Tensor, int) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5621C3: at::Tensor pybind11::detail::argument_loader<at::Tensor, int>::call_impl<at::Tensor, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, 0ul, 1ul, pybind11::detail::void_type>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55AFB7: std::enable_if<!std::is_void<at::Tensor>::value, at::Tensor>::type pybind11::detail::argument_loader<at::Tensor, int>::call<at::Tensor, pybind11::detail::void_type, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C551E7B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55236B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53191F: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0x420315F: cfunction_call (methodobject.c:543)
==113540==  Address 0xbab4968 is 8 bytes inside a block of size 192 free'd
==113540==    at 0x408A5D8: operator delete(void*, unsigned long) (vg_replace_malloc.c:1072)
==113540==    by 0xF64858F: c10::TensorImpl::~TensorImpl() (in /torchinstall/lib/python3.10/site-packages/torch/lib/libc10.so)
==113540==    by 0xB2C53F69F: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C535CF7: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E6C7: at::TensorBase::~TensorBase() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E913: at::Tensor::~Tensor() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C534527: torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}::operator()(at::Tensor, int) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5621C3: at::Tensor pybind11::detail::argument_loader<at::Tensor, int>::call_impl<at::Tensor, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, 0ul, 1ul, pybind11::detail::void_type>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55AFB7: std::enable_if<!std::is_void<at::Tensor>::value, at::Tensor>::type pybind11::detail::argument_loader<at::Tensor, int>::call<at::Tensor, pybind11::detail::void_type, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C551E7B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55236B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53191F: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==  Block was alloc'd at
==113540==    at 0x40866D0: operator new(unsigned long) (vg_replace_malloc.c:472)
==113540==    by 0x11386383: at::TensorBase at::detail::make_tensor_base<c10::TensorImpl, c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKeySet&, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKeySet&, caffe2::TypeMeta&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x113867D3: at::TensorBase at::detail::_empty_generic<long>(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380A9F: at::detail::empty_generic(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380B57: at::detail::empty_cpu(c10::ArrayRef<long>, c10::ScalarType, bool, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380C07: at::detail::empty_cpu(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380D87: at::detail::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x128DC06F: at::(anonymous namespace)::create_out(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::TensorOptions const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x129F94BB: at::(anonymous namespace)::structured_cos_out_functional::set_output_strided(long, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::TensorOptions, c10::ArrayRef<at::Dimname>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x1143C29B: at::TensorIteratorBase::fast_set_up(at::TensorIteratorConfig const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11443363: at::TensorIteratorBase::build(at::TensorIteratorConfig&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11445B17: at::TensorIteratorBase::build_borrowing_unary_float_op(at::TensorBase const&, at::TensorBase const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540== 
==113540== Invalid write of size 8
==113540==    at 0xB2C51E264: c10::detail::atomic_refcount_decrement(std::atomic<unsigned long>&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53F547: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C535CF7: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E6C7: at::TensorBase::~TensorBase() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E913: at::Tensor::~Tensor() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5346CF: torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}::operator()(at::Tensor, int) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5621C3: at::Tensor pybind11::detail::argument_loader<at::Tensor, int>::call_impl<at::Tensor, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, 0ul, 1ul, pybind11::detail::void_type>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55AFB7: std::enable_if<!std::is_void<at::Tensor>::value, at::Tensor>::type pybind11::detail::argument_loader<at::Tensor, int>::call<at::Tensor, pybind11::detail::void_type, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C551E7B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55236B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53191F: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0x420315F: cfunction_call (methodobject.c:543)
==113540==  Address 0xbab4968 is 8 bytes inside a block of size 192 free'd
==113540==    at 0x408A5D8: operator delete(void*, unsigned long) (vg_replace_malloc.c:1072)
==113540==    by 0xF64858F: c10::TensorImpl::~TensorImpl() (in /torchinstall/lib/python3.10/site-packages/torch/lib/libc10.so)
==113540==    by 0xB2C53F69F: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::reset_() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C535CF7: c10::intrusive_ptr<c10::TensorImpl, c10::UndefinedTensorImpl>::~intrusive_ptr() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E6C7: at::TensorBase::~TensorBase() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C51E913: at::Tensor::~Tensor() (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C534527: torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}::operator()(at::Tensor, int) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C5621C3: at::Tensor pybind11::detail::argument_loader<at::Tensor, int>::call_impl<at::Tensor, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, 0ul, 1ul, pybind11::detail::void_type>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55AFB7: std::enable_if<!std::is_void<at::Tensor>::value, at::Tensor>::type pybind11::detail::argument_loader<at::Tensor, int>::call<at::Tensor, pybind11::detail::void_type, torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&>(torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}&) && (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C551E7B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C55236B: pybind11::cpp_function::initialize<torch::detail::wrap_pybind_function_impl_<at::Tensor (&)(at::Tensor, int), 0ul, 1ul>(at::Tensor (&)(at::Tensor, int), std::integer_sequence<unsigned long, 0ul, 1ul>, bool)::{lambda(at::Tensor, int)#1}, at::Tensor, at::Tensor, int, pybind11::name, pybind11::scope, pybind11::sibling, char [4]>(at::Tensor (&)(at::Tensor, int), at::Tensor (*)(at::Tensor, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [4])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==    by 0xB2C53191F: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /torchcache/torch_extensions/py310_cpu/warn_mod/warn_mod_v1.so)
==113540==  Block was alloc'd at
==113540==    at 0x40866D0: operator new(unsigned long) (vg_replace_malloc.c:472)
==113540==    by 0x11386383: at::TensorBase at::detail::make_tensor_base<c10::TensorImpl, c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >, c10::DispatchKeySet&, caffe2::TypeMeta&>(c10::intrusive_ptr<c10::StorageImpl, c10::detail::intrusive_target_default_null_type<c10::StorageImpl> >&&, c10::DispatchKeySet&, caffe2::TypeMeta&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x113867D3: at::TensorBase at::detail::_empty_generic<long>(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380A9F: at::detail::empty_generic(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380B57: at::detail::empty_cpu(c10::ArrayRef<long>, c10::ScalarType, bool, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380C07: at::detail::empty_cpu(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11380D87: at::detail::empty_cpu(c10::ArrayRef<long>, c10::TensorOptions const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x128DC06F: at::(anonymous namespace)::create_out(c10::ArrayRef<long>, c10::ArrayRef<long>, c10::TensorOptions const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x129F94BB: at::(anonymous namespace)::structured_cos_out_functional::set_output_strided(long, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::TensorOptions, c10::ArrayRef<at::Dimname>) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x1143C29B: at::TensorIteratorBase::fast_set_up(at::TensorIteratorConfig const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11443363: at::TensorIteratorBase::build(at::TensorIteratorConfig&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540==    by 0x11445B17: at::TensorIteratorBase::build_borrowing_unary_float_op(at::TensorBase const&, at::TensorBase const&) (in /torchinstall/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
==113540== 
UserWarning: Error with torch.DoubleTensor (Triggered internally at /torchcache/torch_extensions/py310_cpu/warn_mod/main.cpp:12.)
.
----------------------------------------------------------------------
Ran 1 test in 53.951s

OK

Note that Valgrind detects the use-after-free even though this run does not crash, most likely because nothing else runs or allocates afterwards.

I reduced the test code to the following which still reproduces the bug:

import warnings
import torch
import torch.utils.cpp_extension

source = '''
at::Tensor foo(at::Tensor x, int error_type) {
    std::ostringstream err_stream;
    err_stream << "Error with "  << x.type();

    TORCH_WARN(err_stream.str());
    return x.cos();
}
'''

t = torch.rand(2).double()

warn_mod = torch.utils.cpp_extension.load_inline(name='warnmod',
                                                    cpp_sources=[source],
                                                    functions=['foo'],
                                                    with_pytorch_error_handling=True)

with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("error")
    warn_mod.foo(t, 0)

The issue seems to be triggered by the warning being converted to an error in combination with with_pytorch_error_handling. That is, if any one of TORCH_WARN, warnings.simplefilter("error"), or with_pytorch_error_handling=True is removed, the bug is not triggered.
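To make the role of the warnings filter concrete, here is a minimal pure-Python sketch that needs neither torch nor a compiled extension. The `emit` function is a hypothetical stand-in for `warn_mod.foo`; it only shows that the "error" filter turns the same warning into an exception instead of recording it.

```python
import warnings


def emit():
    # Hypothetical stand-in for warn_mod.foo(t, 0); the real repro warns
    # from C++ via TORCH_WARN, here we warn from Python instead.
    warnings.warn("Error with torch.DoubleTensor", UserWarning)
    return "result"


def run(filter_action):
    # Mirrors the structure of the repro: record warnings, set a filter, call.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(filter_action)
        try:
            value = emit()
        except UserWarning as exc:
            return ("raised", str(exc), len(caught))
        return ("returned", value, len(caught))


print(run("always"))  # ('returned', 'result', 1)
print(run("error"))   # ('raised', 'Error with torch.DoubleTensor', 0)
```

With the "error" filter nothing is recorded, because the warning is raised as an exception before it reaches the recording list.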

Versions

PyTorch version: 2.0.1
GCC version: (GCC) 12.2.0
Clang version: Could not collect
CMake version: version 3.24.3
Libc version: glibc-2.17

Python version: 3.10.8 (main, Jul 25 2023, 10:52:38) [GCC 12.2.0] (64-bit runtime)
Python platform: Linux-4.14.0-115.19.1.el7a.ppc64le-ppc64le-with-glibc2.17
CPU:
Architecture: ppc64le
Byte Order: Little Endian

cc @malfet @zou3519

zou3519 added the module: crash, module: cpp-extensions, and triaged labels Oct 30, 2023
@Flamefire

I traced this to a compiler bug in GCC 12 and later, which I filed as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112301

It is triggered by the destructor of PyWarningHandler throwing an exception, and hence applies to any use of HANDLE_TH_ERRORS, such as the wrap_pybind_function instantiated when with_pytorch_error_handling=True is used.
warnings.simplefilter("error") causes PyErr_WarnEx to return an error code, and the destructor then throws python_error().
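The control flow described here can be modelled in pure Python. This is only a rough analogy, not the C++ code: `WarningHandlerModel` is a hypothetical context manager standing in for PyWarningHandler, its `__exit__` plays the role of the throwing destructor, and the buffering of messages is an assumption made for illustration. The point is that the return value has already been computed when the scope-exit handler raises.

```python
import warnings


class WarningHandlerModel:
    """Hypothetical stand-in for PyWarningHandler (illustration only)."""

    def __init__(self):
        self.buffered = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Plays the role of the destructor: emit buffered warnings on scope
        # exit. Under the "error" filter, warnings.warn raises right here.
        if exc_type is None:
            for message in self.buffered:
                warnings.warn(message, UserWarning)
        return False


def wrapped_call(x):
    with WarningHandlerModel() as handler:
        handler.buffered.append("Error with torch.DoubleTensor")
        # The return value is computed before __exit__ runs.
        return x * 2


def call_with_filter(action):
    with warnings.catch_warnings():
        warnings.simplefilter(action)
        try:
            return ("returned", wrapped_call(21))
        except UserWarning as exc:
            return ("raised", str(exc))


print(call_with_filter("ignore"))  # ('returned', 42)
print(call_with_filter("error"))   # ('raised', 'Error with torch.DoubleTensor')
```

In Python the already-computed return value is simply discarded when `__exit__` raises. In the C++ wrapper the returned at::Tensor has to be destroyed during unwinding, and the Valgrind trace above shows at::Tensor::~Tensor() being reached twice inside the wrapper lambda.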

Flamefire added a commit to Flamefire/pytorch that referenced this issue Nov 1, 2023
Avoid a use-after-free caused by a bug introduced in GCC 12.
In `wrap_pybind_function` the destructor of `PyWarningHandler` may throw
triggering the bug in GCC.
Avoid this by storing the result in a temporary, handling possible
warnings and returning the temporary (which will likely be elided by NRVO).
To avoid duplicating the code replace `gil_scoped_release` by the new
`conditional_gil_scoped_release` as the `void`-return case must be
handled separately.

See pytorch#112383 for a description of the bug.
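
The ordering that the commit message describes (store the result, handle warnings, then return) can be sketched in Python as follows. This is a model of the idea only, not the actual C++ change: `WarningBufferModel`, `call_with_temporary`, and `doubled_with_warning` are hypothetical names, and the `events` list exists purely to make the order of steps visible.

```python
import warnings


class WarningBufferModel:
    """Hypothetical warning buffer with an explicit flush step."""

    def __init__(self):
        self.pending = []

    def flush(self):
        pending, self.pending = self.pending, []
        for message in pending:
            # Raises under the "error" filter.
            warnings.warn(message, UserWarning)


def call_with_temporary(fn, arg, events):
    buffer = WarningBufferModel()
    result = fn(arg, buffer)           # 1. store the result in a temporary
    events.append("result stored")
    buffer.flush()                     # 2. handle warnings explicitly
    events.append("warnings handled")
    return result                      # 3. return the temporary


def doubled_with_warning(x, buffer):
    buffer.pending.append("Error with torch.DoubleTensor")
    return x * 2


def demo(action):
    events = []
    with warnings.catch_warnings():
        warnings.simplefilter(action)
        try:
            outcome = ("returned", call_with_temporary(doubled_with_warning, 21, events))
        except UserWarning as exc:
            outcome = ("raised", str(exc))
    return outcome, events


print(demo("ignore"))  # (('returned', 42), ['result stored', 'warnings handled'])
print(demo("error"))   # (('raised', 'Error with torch.DoubleTensor'), ['result stored'])
```

In this model the failure happens at an explicit statement between storing and returning the result, rather than at scope exit while a return is in progress.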