crimson/net: support connections in multiple shards #51916
Conversation
Force-pushed 481fbe3 to 53d3bcc
make check looks good, retry more times and check teuthology
The failed ceph_assert() doesn't look correct and is actually possible in the test case.
Force-pushed 53d3bcc to 922f8e7
Looks like the crimson build is always 404: https://shaman.ceph.com/builds/ceph/wip-yingxin-msgr-multi-core-crimson-only//crimson/345576/
This is an issue with all crimson builds from the last few days; I have notified the sepia team.
jenkins test make check
unrelated failure:
You can try scheduling a build without
Force-pushed 922f8e7 to 4e4dcd2
Saw an unrelated issue in OSD.2, https://pulpito.ceph.com/yingxin-2023-06-09_02:09:26-crimson-rados-wip-yingxin-msgr-multi-core-crimson-3-distro-default-smithi/7299712/ OSD.2 hits a heap-use-after-free issue when printing a PG. The AddressSanitizer report is interleaved with debug logs; an attempt to recover it here:
The connection is switching between STANDBY/CONNECTING too frequently; it seems the reconnect wait is missing in this case.
Force-pushed 4e4dcd2 to 25de6df
I have not encountered that issue before. Are you sure it's unrelated? If so, can you please open a tracker w/ the logs?
@athanatos @cyx1231st,
This PR implements the multi-core messenger, but it doesn't yet enable the multi-core feature for crimson OSD. So Crimson OSD is still running with the single-core-mode messenger, and the interactions between OSD and msgr are not (supposed to be) modified. Please also refer to how this PR changes the OSD-part code; there aren't many such changes. It seems to me that printing a PG after destructing it is unrelated to the messenger-internal refactorings. It is more likely a side effect of unexpected messenger behavior. For example, with #51916 (comment) the connection will immediately retry without backoff, which is wrong but should not cause the PG AddressSanitizer issue above.
I'm not sure if we can reproduce the issue after correcting the msgr behavior. Also, the log for OSD.2 is 530MB. I'll paste the link to the log and open the tracker tomorrow.
Sure, I'll schedule the follow-up tests based on the 2 PRs. There are still TODOs #51916 (comment), so it's fine to take some time.
Thank you. They are still not ready, so you can rebase after they are merged.
This prevents the previous shard-states from racing on the shared data structures after switching to new shard-states. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Users may need to know the new connection shard prior to message dispatching. Otherwise, there will be no chance for the user to do any related preparations. This is still a placeholder before the multi-core messenger is enabled. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Note that it is inevitable that the user can mark down the connection while the protocol is trying to move the connection to another core. In that case, the implementation should tolerate the racing, which ultimately must clean up resources correctly and dispatch reasonable events. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
…-handler interfaces. Otherwise, calling io-handler interfaces may result in the wrong core/order. This needs special care to handle preemptive cases such as closing and replacing. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
… to be cross-core Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
It's meaningless to dispatch the initial acceptance always from the msgr core. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Force-pushed 1be8b3f to 1e8a39f
rebased to catch up with main
I think we should merge at 1. given the size of this PR.
LGTM other than my above nit.
OK, for 1, teuthology still sees inconsistent failures; it takes time to investigate and fix. I'll work on 2 at the same time as the tool is ready.
Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Changeset: crimson/net: cleanup, rename is_fixed_cpu
Changeset: crimson/osd/heartbeat: relax the order of replacement reset and accept. Identified and fixed an issue that caused a regression during heartbeat racing, which caused some failures in the teuthology test.
With the new implementation in the messenger, the order of the replacement reset and accept events cannot be determined, because they come from different connections. Modify the heartbeat logic to tolerate both cases. Signed-off-by: Yingxin Cheng <yingxin.cheng@intel.com>
Force-pushed 9bd4d85 to a870946
Test results: https://pulpito.ceph.com/matan-2023-06-29_09:53:04-crimson-rados-wip-yingxin-msgr-multi-core-crimson-10-distro-default-smithi/
3 out of 60 failed:
7321111-osd.1.log
7321112-osd.0.log
7321120-osd.2.log
@athanatos @Matan-B They look like the same issue and are not related to this PR.
I'm inclined to go ahead and merge -- I'll do it in the morning if no one disagrees (or beats me to it :) ).
This PR implements multi-shard support in Crimson Messenger, following up #49420 and #50835.
Generally, all handshakes in protocol v2 are done in a dedicated shard per Crimson Messenger, as before. Hopefully this will make further integrations with Crimson OSD simpler, because the dependent logic in OSD (such as authentication) is not sharded, and the internal connection registration as well as replacements can remain atomic. Apart from that, the events and message dispatching are distributed to all the available shards per connection, determined by the shard where the seastar::connected_socket lives. Due to the limitation that a connected_socket cannot be moved to a different shard once established, the working connection shard only becomes available with the connected and accepted events. For a lossless connection, this means that the working connection shard can change during connection recovery, upon reconnect and reaccept. Also see #49420 (comment).
TODOs:
Integrate multi-shard Crimson Messenger with perf-crimson-msgr: crimson/tools/perf_crimson_msgr: integrate multi-core msgr with various improvements #52091
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.