Seg fault for std_string in message pack #465

Closed
phyy-nx opened this issue Jan 18, 2022 · 13 comments

phyy-nx commented Jan 18, 2022

Hi, this came up while looking at message pack instead of pickle for cctbx.xfel.merge, which currently outputs pickle files. Here's a minimal reproducer:

from dials.array_family import flex
refls = flex.reflection_table()
strs = flex.std_string(['a', 'b'])
refls['strings'] = strs
refls.as_file('seggy.refl')

test = flex.reflection_table.from_file('seggy.refl')
assert len(test) == 2 # passes

from libtbx import easy_run
result = easy_run.fully_buffered("""libtbx.python -c "from dials.array_family import flex; print(len(flex.reflection_table.from_file('seggy.refl')))" """)
result.show_stdout()
result.show_stderr()

assert result.return_code == 0, result.return_code # indicates seg fault, code -11

Output indicates a segfault:

Traceback (most recent call last):
  File "demo.py", line 15, in <module>
    assert result.return_code == 0, result.return_code
AssertionError: -11

Further, running dials.show seggy.refl will also seg fault. It is also super weird that the first assert passes and the second fails.

Note that the same file dumped using pickle will work just fine.
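The -11 above is consistent with how a POSIX parent process reports a child killed by signal 11 (SIGSEGV), which also explains why only the subprocess call shows it. A minimal sketch using only the standard library; `describe_return_code` is a hypothetical helper for illustration and is not part of libtbx:

```python
import signal


def describe_return_code(return_code):
    """Translate a child process return code into a readable description."""
    if return_code < 0:
        # On POSIX, a negative return code -N means the child was killed by signal N.
        return "killed by " + signal.Signals(-return_code).name
    return "exited with status %d" % return_code


print(describe_return_code(-11))
```

Something like this could be applied to `result.return_code` from the reproducer to make the failure mode explicit in the assertion message.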

@dwpaley

dwpaley commented Jan 19, 2022

Here is a typical gdb stack trace for the segfault, but I haven't made any real progress understanding the problem:

Program received signal SIGSEGV, Segmentation fault.
0x00007fb5b83137d1 in std::char_traits<char>::copy(char*, char const*, unsigned long) [clone .isra.0] () from /dev/shm/dwpaley/20220105/conda_base/lib/libstdc++.so.6
(gdb) bt
#0  0x00007fb5b83137d1 in std::char_traits<char>::copy(char*, char const*, unsigned long) [clone .isra.0] () from /dev/shm/dwpaley/20220105/conda_base/lib/libstdc++.so.6
#1  0x00007fb5b8313aa1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /dev/shm/dwpaley/20220105/conda_base/lib/libstdc++.so.6
#2  0x00007fb5b8313cde in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /dev/shm/dwpaley/20220105/conda_base/lib/libstdc++.so.6
#3  0x00007fb5b6cdf550 in std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*> (__first=0x7fb5ac018a63, __last=0x7fb5ac01f163,
    __result=0x55738ab09530)
    at /dev/shm/dwpaley/20220105/conda_base/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_algobase.h:342
#4  0x00007fb5b6cdda77 in std::__copy_move_a<false, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*> (__first=0x7fb5ac018a63,
    __last=0x7fb5ac01f163, __result=0x55738ab09530)
    at /dev/shm/dwpaley/20220105/conda_base/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_algobase.h:404
#5  0x00007fb5b6cdb369 in std::__copy_move_a2<false, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*> (__first=0x7fb5ac018a63,
    __last=0x7fb5ac01f163, __result=0x55738ab09530)
    at /dev/shm/dwpaley/20220105/conda_base/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_algobase.h:440
#6  0x00007fb5b6cd702a in std::copy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*> (__first=0x7fb5ac018a63, __last=0x7fb5ac01f163,
    __result=0x55738ab09530)
    at /dev/shm/dwpaley/20220105/conda_base/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_algobase.h:474
#7  0x00007fb5ad4bbe36 in msgpack::v3::adaptor::convert<scitbx::af::ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, scitbx::af::trivial_accessor>, void>::operator() (this=0x7ffddf5b56cf, o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/reflection_table_msgpack_adapter.h:421
#8  0x00007fb5ad4ad543 in msgpack::v1::operator>><scitbx::af::ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, scitbx::af::trivial_accessor> > (o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/adaptor/adaptor_base.hpp:58
#9  0x00007fb5ad4a1b2c in msgpack::v1::object::convert<scitbx::af::ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, scitbx::af::trivial_accessor> > (this=0x55738ab045d8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/object.hpp:1073
#10 0x00007fb5ad4989b3 in msgpack::v2::object::convert<scitbx::af::ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, scitbx::af::trivial_accessor> > (this=0x55738ab045d8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v2/object_fwd.hpp:60
#11 0x00007fb5ad4907e0 in msgpack::v3::adaptor::convert<scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, void>::operator() (this=0x7ffddf5b583f, o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/reflection_table_msgpack_adapter.h:539
#12 0x00007fb5ad484ab0 in msgpack::v1::operator>><scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/adaptor/adaptor_base.hpp:58
#13 0x00007fb5ad4760d2 in msgpack::v1::object::convert<scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (
    this=0x55738ab045a8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/object.hpp:1073
#14 0x00007fb5ad462c0d in msgpack::v2::object::convert<scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (
    this=0x55738ab045a8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v2/object_fwd.hpp:60
#15 0x00007fb5ad45aeb6 in msgpack::v3::adaptor::convert<boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >>, void>::extract<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(msgpack::v2::object const&) const (
    this=0x7ffddf5b5aaf, o=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/reflection_table_msgpack_adapter.h:689
#16 0x00007fb5ad4546d2 in msgpack::v3::adaptor::convert<boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >>, void>::operator()(msgpack::v2::object const&, boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, 
scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >>&) const (this=0x7ffddf5b5aaf, o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/reflection_table_msgpack_adapter.h:666
#17 0x00007fb5ad478797 in msgpack::v1::operator>><boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >> >(msgpack::v2::object const&, boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, 
scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >>&) (o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/adaptor/adaptor_base.hpp:58
#18 0x00007fb5ad463d1a in msgpack::v1::object::convert<boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >> >(boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, 
boost::mpl::l_end> > > > > > > > > > > >>&) const (
    this=0x55738ab042a8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/object.hpp:1073
#19 0x00007fb5ad45b7c9 in msgpack::v2::object::convert<boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, boost::mpl::l_end> > > > > > > > > > > >> >(boost::variant<boost::detail::variant::over_sequence<boost::mpl::l_item<mpl_::long_<11l>, scitbx::af::shared<bool>, boost::mpl::l_item<mpl_::long_<10l>, scitbx::af::shared<int>, boost::mpl::l_item<mpl_::long_<9l>, scitbx::af::shared<unsigned long>, boost::mpl::l_item<mpl_::long_<8l>, scitbx::af::shared<double>, boost::mpl::l_item<mpl_::long_<7l>, scitbx::af::shared<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::mpl::l_item<mpl_::long_<6l>, scitbx::af::shared<scitbx::vec2<double> >, boost::mpl::l_item<mpl_::long_<5l>, scitbx::af::shared<scitbx::vec3<double> >, boost::mpl::l_item<mpl_::long_<4l>, scitbx::af::shared<scitbx::mat3<double> >, boost::mpl::l_item<mpl_::long_<3l>, scitbx::af::shared<scitbx::af::tiny<int, 6ul> >, boost::mpl::l_item<mpl_::long_<2l>, scitbx::af::shared<cctbx::miller::index<int> >, boost::mpl::l_item<mpl_::long_<1l>, scitbx::af::shared<dials::model::Shoebox<float> >, 
boost::mpl::l_end> > > > > > > > > > > >>&) const (
    this=0x55738ab042a8, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v2/object_fwd.hpp:60
#20 0x00007fb5ad45551a in msgpack::v3::adaptor::convert<dials::af::reflection_table, void>::operator() (this=0x7ffddf5b5e3f, o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/reflection_table_msgpack_adapter.h:832
#21 0x00007fb5ad478c27 in msgpack::v1::operator>><dials::af::reflection_table> (
    o=..., v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/adaptor/adaptor_base.hpp:58
#22 0x00007fb5ad464e40 in msgpack::v1::object::convert<dials::af::reflection_table>
    (this=0x7ffddf5b5ee0, v=...)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/object.hpp:1073
#23 0x00007fb5ad45cea8 in msgpack::v1::object::as<dials::af::reflection_table> (
    this=0x7ffddf5b5ee0)
    at /dev/shm/dwpaley/20220105/modules/msgpack-3.1.1/include/msgpack/v1/object.hpp:1128
#24 0x00007fb5ad44ef88 in dials::af::boost_python::reflection_table_from_msgpack (
    packed=...)
    at /dev/shm/dwpaley/20220105/modules/dials/array_family/boost_python/flex_reflection_table.cc:860
#25 0x00007fb5ad520f1b in boost::python::detail::invoke<boost::python::to_python_value<dials::af::reflection_table const&>, dials::af::reflection_table (*)(boost::python::api::object), boost::python::arg_from_python<boost::python::api::object> > (
    rc=...,
    f=@0x55738a9b7a78: 0x7fb5ad44ee95 <dials::af::boost_python::reflection_table_from_msgpack(boost::python::api::object)>, ac0=...)
at /dev/shm/dwpaley/20220105/build_debug/../conda_base/include/boost/python/detail/invoke.hpp:73
#26 0x00007fb5ad5161bf in boost::python::detail::caller_arity<1u>::impl<dials::af::reflection_table (*)(boost::python::api::object), boost::python::default_call_policies, boost::mpl::vector2<dials::af::reflection_table, boost::python::api::object> >::operator() (this=0x55738a9b7a78, args_=0x7fb5bfc0de90)
    at /dev/shm/dwpaley/20220105/build_debug/../conda_base/include/boost/python/detail/caller.hpp:233
#27 0x00007fb5ad50f617 in boost::python::objects::caller_py_function_impl<boost::python::detail::caller<dials::af::reflection_table (*)(boost::python::api::object), boost::python::default_call_policies, boost::mpl::vector2<dials::af::reflection_table, boost::python::api::object> > >::operator() (this=0x55738a9b7a70, args=0x7fb5bfc0de90,
    kw=0x0)
    at /dev/shm/dwpaley/20220105/build_debug/../conda_base/include/boost/python/object/py_function.hpp:38
#28 0x00007fb5b83df4c5 in boost::python::objects::function::call(_object*, _object*) const () from /dev/shm/dwpaley/20220105/conda_base/lib/libboost_python37.so.1.74.0
#29 0x00007fb5b83df669 in boost::detail::function::void_function_ref_invoker0<boost::python::objects::(anonymous namespace)::bind_return, void>::invoke(boost::detail::function::function_buffer&) ()
   from /dev/shm/dwpaley/20220105/conda_base/lib/libboost_python37.so.1.74.0
#30 0x00007fb5b83e4bb3 in boost::python::handle_exception_impl(boost::function0<void>) () from /dev/shm/dwpaley/20220105/conda_base/lib/libboost_python37.so.1.74.0
#31 0x00007fb5b83dd3c3 in function_call ()
   from /dev/shm/dwpaley/20220105/conda_base/lib/libboost_python37.so.1.74.0
#32 0x00005573891f572b in _PyObject_FastCallKeywords ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/errors.c:176
#33 0x00005573891f6269 in call_function ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:4619
#34 0x000055738926ccba in _PyEval_EvalFrameDefault ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3093
#35 0x00005573891f31d4 in PyEval_EvalFrameEx (throwflag=0, f=0x7fb5ad931240)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:544
#36 function_code_fastcall (globals=0x7fb5b7326a50, nargs=140418440508880,
    args=<optimized out>, co=<optimized out>)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Objects/call.c:283
#37 _PyFunction_FastCallKeywords ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Objects/call.c:408
#38 0x00005573891f60d8 in call_function ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:4616
#39 0x000055738926ccba in _PyEval_EvalFrameDefault ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3093
#40 0x00005573891f31d4 in PyEval_EvalFrameEx (throwflag=0, f=0x7fb5ac959050)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:544
#41 function_code_fastcall (globals=0x7fb5b7326a50, nargs=140418440510176,
    args=<optimized out>, co=<optimized out>)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Objects/call.c:283
#42 _PyFunction_FastCallKeywords ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Objects/call.c:408
#43 0x00005573891f60d8 in call_function ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:4616
#44 0x000055738926ccba in _PyEval_EvalFrameDefault ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3093
#45 0x00005573891acea2 in PyEval_EvalFrameEx (throwflag=0, f=0x7fb5bfc5f450)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3930
#46 _PyEval_EvalCodeWithName ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3930
#47 0x00005573891ae0b9 in PyEval_EvalCodeEx ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:3959
#48 0x000055738928d15b in PyEval_EvalCode (co=co@entry=0x7fb5bfc8b4b0,
    globals=globals@entry=0x7fb5bfcbc050, locals=locals@entry=0x7fb5bfcbc050)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/ceval.c:524
#49 0x00005573892f8d53 in run_mod ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/pythonrun.c:1037
#50 0x0000557389303307 in PyRun_FileExFlags ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/pythonrun.c:990
#51 0x00005573893034dc in PyRun_SimpleFileExFlags ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/pythonrun.c:429
#52 0x00005573893036eb in PyRun_AnyFileExFlags ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Python/pythonrun.c:84
#53 0x0000557389303a39 in pymain_run_file (p_cf=0x7ffddf5b69e0,
    filename=<optimized out>, fp=0x55738a0f86a0)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Modules/main.c:456
#54 pymain_run_filename (cf=0x7ffddf5b69e0, pymain=0x7ffddf5b6af0)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Modules/main.c:1646
#55 pymain_run_python (pymain=0x7ffddf5b6af0)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Modules/main.c:2907
#56 pymain_main ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Modules/main.c:3068
#57 0x0000557389303b8c in _Py_UnixMain (argc=<optimized out>, argv=<optimized out>)
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Modules/main.c:3103
#58 0x00007fb5bec41c05 in __libc_start_main () from /lib64/libc.so.6
#59 0x000055738927951d in _start ()
    at /home/conda/feedstock_root/build_artifacts/python_1635226054942/work/Parser/parser.c:325

and the Python stack trace as shared by @nksauter is here:

Fatal Python error: Segmentation fault

Current thread 0x00007f8feb1d2740 (most recent call first):
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/array_family/flex_ext.py", line 210 in from_msgpack_file
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/array_family/flex_ext.py", line 229 in from_file
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/util/options.py", line 332 in try_read_reflections
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/util/options.py", line 216 in __init__
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/util/options.py", line 547 in parse_args
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/modules/dials/util/options.py", line 856 in parse_args
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/build/../modules/dials/command_line/show.py", line 206 in run
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/opt/mamba/envs/psana_env/lib/python3.7/contextlib.py", line 74 in inner
  File "/pscratch/sd/n/nksauter/adse13_249/alcc-recipes/cctbx/build/../modules/dials/command_line/show.py", line 693 in <module>
Segmentation fault

@dwpaley

dwpaley commented Jan 19, 2022

Can also trigger a segfault by the following steps, where iobs_000000.refl was previously written by cctbx.xfel.merge:

$ libtbx.python
>>> from dials.array_family import flex
>>> a = flex.reflection_table.from_file('iobs_000000.refl')
>>> a.as_file('out.refl')
>>> quit()

$ libtbx.python
>>> from dials.array_family import flex
>>> a = flex.reflection_table.from_file('out.refl')
Segmentation fault (core dumped)

@graeme-winter
Collaborator

@phyy-nx your code as written works fine for me - macOS / Python 3.9

silver-surfer-2 berkel :) $ cat segv.py 
from dials.array_family import flex
refls = flex.reflection_table()
strs = flex.std_string(['a', 'b'])
refls['strings'] = strs
refls.as_file('seggy.refl')

test = flex.reflection_table.from_file('seggy.refl')
assert len(test) == 2 # passes

from libtbx import easy_run
result = easy_run.fully_buffered("""libtbx.python -c "from dials.array_family import flex; print(len(flex.reflection_table.from_file('seggy.refl')))" """)
result.show_stdout()
result.show_stderr()

assert result.return_code == 0, result.return_code # indicates seg fault, code -11

silver-surfer-2 berkel :) $ libtbx.python segv.py 
2

By "works fine" I mean it fails to fail: it succeeds. I can also manually inspect the file and verify that it does load correctly:

silver-surfer-2 berkel :) $ libtbx.python
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 20:33:18) 
[Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dials.array_family import flex
>>> data = flex.reflection_table.from_file("seggy.refl")
>>> len(data)
2
>>> data[0]
{'strings': 'a'}
>>> data[1]
{'strings': 'b'}
>>> list(data["strings"])
['a', 'b']

Seems to do the things I would have expected it to do...

@graeme-winter
Collaborator

I can however report that on RHEL7 with DIALS 3.8.0 release I do reproduce your error:

cs03r-sc-serv-36 seg :) $ libtbx.python segv.py 
Traceback (most recent call last):
  File "/tmp/seg/segv.py", line 15, in <module>
    assert result.return_code == 0, result.return_code # indicates seg fault, code -11
AssertionError: -11

@graeme-winter
Collaborator

OK, additional - the seggy.refl file from macOS was 130 bytes; from RHEL7 was 146

When I try to load the macOS derived file on RHEL7 (which should work) I get a big badda boom

>>> data = flex.reflection_table.from_file("seggy.refl")
Traceback (most recent call last):
  File "/dls_sw/apps/dials/dials-v3-8-0/modules/dials/array_family/flex_ext.py", line 228, in from_file
    return dials_array_family_flex_ext.reflection_table.from_msgpack_file(
  File "/dls_sw/apps/dials/dials-v3-8-0/modules/dials/array_family/flex_ext.py", line 209, in from_msgpack_file
    return dials_array_family_flex_ext.reflection_table.from_msgpack(
RuntimeError: dials Error: /scratch/jenkins_agent/workspace/dials_minor_release/build_dials/modules/dials/array_family/reflection_table_msgpack_adapter.h(408): scitbx::af::ref: msgpack bin data does not have correct size

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/dls_sw/apps/dials/dials-v3-8-0/modules/dials/array_family/flex_ext.py", line 232, in from_file
    return dials_array_family_flex_ext.reflection_table.from_pickle(filename)
  File "/dls_sw/apps/dials/dials-v3-8-0/modules/dials/array_family/flex_ext.py", line 188, in from_pickle
    result = pickle.load(infile, encoding="bytes")
_pickle.UnpicklingError: STACK_GLOBAL requires str

The pickling nonsense there is because of our graceless fallback.
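One way to make such a fallback less confusing is to collect the error from every reader and report them together, so the original msgpack failure is not buried under the pickle one. A rough sketch under stated assumptions: `load_with_fallback` and the loader callables are hypothetical names, not the actual `flex_ext.py` code.

```python
def load_with_fallback(filename, loaders):
    """Try each (name, loader) pair in turn; report every failure if none succeeds."""
    errors = []
    for name, loader in loaders:
        try:
            return loader(filename)
        except Exception as exc:  # broad on purpose: this is only a sketch
            errors.append("%s: %s" % (name, exc))
    # No loader succeeded, so surface all of the reasons in one message.
    raise RuntimeError("could not read %s (%s)" % (filename, "; ".join(errors)))
```

With readers registered as `[("msgpack", read_msgpack), ("pickle", read_pickle)]` (both hypothetical), the message would lead with the msgpack size error instead of ending on `STACK_GLOBAL requires str`.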

@graeme-winter
Collaborator

Default string encoding on both platforms is UTF8:

Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 20:33:18) 
[Clang 11.1.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'

-> 🤔

I don't see why the saved messagepack files would be different -> this is probably a pointer to the underlying fault

@graeme-winter
Collaborator

Adding

cs03r-sc-serv-36 array_family :) [main] $ git diff
diff --git a/array_family/reflection_table_msgpack_adapter.h b/array_family/reflection_table_msgpack_adapter.h
index 812f898fa..2bc8ad76a 100644
--- a/array_family/reflection_table_msgpack_adapter.h
+++ b/array_family/reflection_table_msgpack_adapter.h
@@ -168,6 +168,7 @@ MSGPACK_API_VERSION_NAMESPACE(MSGPACK_DEFAULT_API_NS) {
                                           const scitbx::af::const_ref<T>& v) const {
         std::size_t num_elements = v.size();
         std::size_t element_size = element_size_helper<T>::size();
+        std::cout << "Element size is " << element_size << std::endl;
         std::size_t binary_size = num_elements * element_size;
         o.pack_bin(binary_size);
         o.pack_bin_body(reinterpret_cast<const char*>(&v[0]), binary_size);

(because I am old and debug with print) I see that the element size is 24 bytes on macOS and 32 bytes on RHEL7. With two strings in the column, that 8-byte difference per element accounts for the difference in file size across the two platforms (16 bytes).
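The arithmetic can be checked directly from the numbers reported in this thread; nothing here touches DIALS itself:

```python
# Values reported in this thread.
element_size = {"macOS": 24, "RHEL7": 32}  # bytes per std::string element
file_size = {"macOS": 130, "RHEL7": 146}   # bytes in seggy.refl
num_strings = 2                            # flex.std_string(['a', 'b'])

# Extra payload written on RHEL7 if each element is dumped as a raw struct.
payload_difference = num_strings * (element_size["RHEL7"] - element_size["macOS"])
file_difference = file_size["RHEL7"] - file_size["macOS"]

print(payload_difference, file_difference)  # prints "16 16"
```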

@graeme-winter
Collaborator

Ah, OK - this probably works on macOS by happy accident: short strings are stored inline in the std::string struct itself, so their data get packed correctly.

If I write

strs = flex.std_string(['its the end of the world as we know it', 'and I feel fine'])

I get the expected segmentation fault on macOS -> this can never have worked for strings in general (and yes, it is probably time for an xfailing test in DIALS)
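To illustrate why the longer string changes the outcome, here is a small sketch of the inline-storage check. The capacities are assumptions based on commonly cited values for 64-bit builds of libc++ and libstdc++; they were not measured in this thread. It is also only an assumption, not something established here, that the RHEL7 failure with short strings comes from libstdc++ keeping a pointer to its own inline buffer inside the struct, which would be meaningless in a fresh process.

```python
# Assumed inline (small-string) capacities in bytes for 64-bit builds.
# These are commonly cited values, not taken from this thread.
ASSUMED_INLINE_CAPACITY = {"libc++": 22, "libstdc++": 15}


def stored_inline(text, stdlib):
    """Return True if the UTF-8 bytes of text would fit in the inline buffer."""
    return len(text.encode("utf-8")) <= ASSUMED_INLINE_CAPACITY[stdlib]


examples = ["a", "b", "its the end of the world as we know it", "and I feel fine"]
for text in examples:
    # Strings that do not fit inline keep their characters on the heap,
    # so dumping the raw struct only saves a pointer, not the data.
    print(repr(text), len(text), stored_inline(text, "libc++"))
```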

@graeme-winter
Collaborator

OK; after a little investigation of this we have the following issue:

  • this packs strings as if they are constant-sized objects; they are not
  • correct solution is to pack as if ragged object e.g. make an IFF array
  • quick attempt at doing this immediately got me knotted in C++ template hell

Packing to:

size_t N_str;                      // number of strings
for str in strs:
  size_t len = str.size();         // byte length of this string
  const char* data = str.c_str();  // len bytes of character data

as a stream of bytes, then unpacking the same way, would make sense here - overloading in the same manner as the shoebox - but my time to look at this ran out, so I backed out the changes. It can be done, but is probably a 3 coffee problem.
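The layout described above (a count, then a length and the raw bytes for each string) can be sketched in plain Python. This only illustrates the byte layout and is not the DIALS msgpack adapter; the function names, the 8-byte little-endian length fields and the UTF-8 encoding are all assumptions made for the example.

```python
import struct


def pack_strings(strings):
    """Pack strings as: count, then (byte length, UTF-8 bytes) for each string."""
    chunks = [struct.pack("<Q", len(strings))]
    for text in strings:
        data = text.encode("utf-8")
        chunks.append(struct.pack("<Q", len(data)))
        chunks.append(data)
    return b"".join(chunks)


def unpack_strings(buffer):
    """Inverse of pack_strings; raises ValueError on malformed input."""
    def read_u64(offset):
        if offset + 8 > len(buffer):
            raise ValueError("buffer truncated while reading a length")
        return struct.unpack_from("<Q", buffer, offset)[0], offset + 8

    count, offset = read_u64(0)
    strings = []
    for _ in range(count):
        length, offset = read_u64(offset)
        if offset + length > len(buffer):
            raise ValueError("buffer truncated while reading string data")
        # Strict decoding: invalid UTF-8 raises UnicodeDecodeError (a ValueError).
        strings.append(buffer[offset:offset + length].decode("utf-8"))
        offset += length
    if offset != len(buffer):
        raise ValueError("trailing bytes after the last string")
    return strings
```

Because every string carries its own length, the packed form does not depend on `sizeof(std::string)` or on where the characters live in memory, so a file written on one platform should read back on another.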

@graeme-winter
Collaborator

Obviously there are also some rather embedded assumptions here, for example that the strings are utf8 etc. -> probably should be more defensive than it is at the moment. However in the interim I think we can safely assert that having strings in reflection tables is going to end in pain until this is addressed.

graeme-winter added a commit to graeme-winter/dials that referenced this issue Jan 19, 2022
Working on cctbx/dxtbx#465

This could never have worked. Now trying to make something which does work.
This does not work. This is at least some stubs / breadcrumbs on how it
could work
@ndevenish
Collaborator

I'm going to close this here - not because it isn't an issue - but because it's explicitly a dials issue, and there is an existing ticket to track this: dials/dials#1858

@phyy-nx
Author

phyy-nx commented Jan 20, 2022

Thanks @graeme-winter for looking at this. Good set of clues. At the moment I'm inclined to look into whether we can do away with the exp_id column in the cctbx.xfel.merge reflection tables, and instead use the usual size_t id column alongside the experiment id<->experiment identifier map. The exp_id column was developed more or less at the same time as the map, and both solved similar problems for tracking experiment identifiers. I'll look into how big of a change set that would be. Wouldn't fix this or dials/dials#1858, but it would solve our merging problem.
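As a rough illustration of that idea, a per-reflection string column can be replaced by an integer column plus a small id-to-identifier map. This is a plain-Python sketch with hypothetical function names; it does not use the reflection table API or the actual cctbx.xfel.merge code.

```python
def encode_identifiers(exp_ids):
    """Replace per-row string identifiers with integer ids plus an id -> identifier map."""
    id_of = {}
    ids = []
    for identifier in exp_ids:
        if identifier not in id_of:
            # Assign the next unused integer to each new identifier.
            id_of[identifier] = len(id_of)
        ids.append(id_of[identifier])
    id_map = {index: identifier for identifier, index in id_of.items()}
    return ids, id_map


def decode_identifiers(ids, id_map):
    """Recover the per-row string identifiers from the integer ids and the map."""
    return [id_map[index] for index in ids]
```

The integer column is fixed-size per row, so it avoids the string packing problem entirely; only the short map needs to carry strings.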
