Figure out what mojo FB common/process/StackTrace.h has that we don't #56399
Labels
better-engineering
Relatively self-contained tasks for better engineering contributors
module: internals
Related to internal abstractions in c10 and ATen
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Comments
ezyang added a commit that referenced this issue Apr 19, 2021
See #56399. I don't have time to fix this properly, so this is just to stem the bleeding. Someone should go and figure out what it is that common/process is doing better. Differential Revision: [D27861908](https://our.internmc.facebook.com/intern/diff/D27861908/) **NOTE FOR REVIEWERS**: This PR has internal Facebook-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D27861908/)! [ghstack-poisoned]
ezyang added a commit that referenced this issue Apr 19, 2021
See #56399. I don't have time to fix this properly, so this is just to stem the bleeding. Someone should go and figure out what it is that common/process is doing better. Differential Revision: [D27861908](https://our.internmc.facebook.com/intern/diff/D27861908/) **NOTE FOR REVIEWERS**: This PR has internal Facebook-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D27861908/)! ghstack-source-id: 126868405 Pull Request resolved: #56400
bdhirsh added the better-engineering, module: internals, and triaged labels Apr 19, 2021
facebook-github-bot pushed a commit that referenced this issue Apr 20, 2021
Summary: Pull Request resolved: #56400

See #56399. I don't have time to fix this properly, so this is just to stem the bleeding. Someone should go and figure out what it is that common/process is doing better.

ghstack-source-id: 126868405

Test Plan: I manually patched this into D27765125, triggered an exception, and observed that every frame was symbolized correctly:

```
[9] what(): new_refcount != 1INTERNAL ASSERT FAILED at "caffe2/c10/util/intrusive_ptr.h":234, please report a bug to PyTorch. intrusive_ptr: Cannot increase refcount after it reached zero.
Exception raised from retain_ at caffe2/c10/util/intrusive_ptr.h:234 (most recent call first):
# 0 c10::get_backtrace[abi:cxx11](unsigned long, unsigned long, bool)
# 1 c10::(anonymous namespace)::GetFetchStackTrace[abi:cxx11]()::$_0::operator()[abi:cxx11]() const
# 2 std::_Function_handler<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (), c10::(anonymous namespace)::GetFetchStackTrace()::$_0>::_M_invoke(std::_Any_data const&)
# 3 std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const
# 4 c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
# 5 c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 6 c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, char const*)
# 7 c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> >::retain_()
# 8 c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> >::intrusive_ptr(c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> > const&)
# 9 c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> >& c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> >::operator=<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> >(c10::intrusive_ptr<c10d::ProcessGroup, c10::detail::intrusive_target_default_null_type<c10d::ProcessGroup> > const&) &
```

Reviewed By: driazati

Differential Revision: D27861908

fbshipit-source-id: 84c1dfb1ef28c460b020646f836c153562ad5c44
krshrimali pushed a commit to krshrimali/pytorch that referenced this issue May 19, 2021
In FB-internal builds, the stack trace printing implemented in common/process/StackTrace.h is superior to the native implementation in c10/util/Backtrace.cpp: it symbolizes more frames than ours does.
For now, I'm going to stem the bleeding by making sure fbcode builds use the better implementation, but ultimately we should figure out what it does differently (it's probably related to DWARF) and apply that to our implementation.
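
For anyone picking this up, here is a minimal sketch of the general `backtrace()` / `backtrace_symbols()` / `abi::__cxa_demangle` approach, assuming (not confirmed in this issue) that this is roughly how the native implementation gets its names. The helpers `demangle` and `capture_backtrace` are hypothetical names for illustration, not functions from c10, and the parsing assumes the glibc line format `module(mangled+0xoffset) [0xaddress]`. One plausible source of the gap is that `backtrace_symbols()` only consults the dynamic symbol table, so static functions and symbols not exported (for example, when linking without `-rdynamic`) come back as bare addresses, whereas a DWARF-aware symbolizer can resolve them from debug info.

```cpp
#include <cxxabi.h>    // abi::__cxa_demangle
#include <execinfo.h>  // backtrace, backtrace_symbols (glibc)

#include <cstddef>
#include <cstdlib>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: demangle an Itanium-ABI name, or return the input
// unchanged when it is not a valid mangled name.
std::string demangle(const std::string& mangled) {
  int status = 0;
  std::unique_ptr<char, void (*)(void*)> out(
      abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
      std::free);
  return (status == 0 && out) ? std::string(out.get()) : mangled;
}

// Hypothetical helper: capture up to max_frames frames of the current
// stack and return one human-readable line per frame.
std::vector<std::string> capture_backtrace(int max_frames) {
  std::vector<std::string> frames;
  if (max_frames <= 0) return frames;

  std::vector<void*> addrs(static_cast<std::size_t>(max_frames));
  const int n = ::backtrace(addrs.data(), max_frames);

  // backtrace_symbols returns a single malloc'd array; free it once.
  std::unique_ptr<char*, void (*)(void*)> syms(
      ::backtrace_symbols(addrs.data(), n), std::free);
  if (!syms) return frames;

  for (int i = 0; i < n; ++i) {
    std::string line = syms.get()[i];
    // Assumed glibc format: "module(mangled+0xoffset) [0xaddress]".
    // Frames with no dynamic symbol look like "module(+0xoffset) [...]"
    // and are left as-is: these are the frames that stay unsymbolized.
    const auto open = line.find('(');
    const auto plus =
        (open == std::string::npos) ? open : line.find('+', open);
    if (plus != std::string::npos && plus > open + 1) {
      const std::string mangled = line.substr(open + 1, plus - open - 1);
      line.replace(open + 1, plus - open - 1, demangle(mangled));
    }
    frames.push_back(line);
  }
  return frames;
}
```

Printing the result of `capture_backtrace(64)` from a binary built with and without `-rdynamic` is a quick way to see which frames depend on the dynamic symbol table and would need DWARF or `.symtab` lookups instead.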
cc @ezyang @bhosmer @smessmer @ljk53 @bdhirsh @ailzhang @driazati