
crimson/net: miscellaneous fixes to seastar-msgr #23816

Merged
merged 8 commits into from
Sep 5, 2018

Conversation

cyx1231st (Member)

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

@cyx1231st (Member Author)

@tchaikov @cbodley

@@ -61,7 +61,11 @@ const std::error_category& net_category()
}

bool equivalent(int code, const std::error_condition& cond) const noexcept override {
switch (static_cast<error>(code)) {
const auto &err = static_cast<error>(code);
if (cond == err) {
Contributor

can you clarify why this is needed? what comparison was failing without this part?

Member Author

The direct reason is that, e.g., e.code() == error::read_eof evaluates to false even though the two values look identical in a debugger.

Under the hood (IMO), the right-hand error::read_eof is implicitly converted to std::error_condition because of struct is_error_condition_enum<ceph::net::error> : public true_type {};. When comparing a left-hand std::error_code with a right-hand std::error_condition, the legacy equivalent() returns false even when the code matches the cond.

Fix: return true if cond equals static_cast<error>(code) in the new equivalent(). Note that the right-hand static_cast<error>(code) is also implicitly converted to std::error_condition, so the comparison goes through error_condition's own operator==, and there is no need to worry about infinite recursion.
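For illustration, the failure mode and this fix can be reduced to a self-contained sketch (the enum values and category names below are hypothetical placeholders, not the actual crimson definitions):

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical reduction of ceph::net's error category.
enum class error { none = 0, read_eof = 1, connection_reset = 2 };

// This specialization is what makes `error` implicitly convertible to
// std::error_condition, turning the right-hand side of
// `e.code() == error::read_eof` into an error_condition.
namespace std {
template <> struct is_error_condition_enum<error> : public true_type {};
}

const std::error_category& net_category();

// Found via ADL when the implicit conversion happens.
std::error_condition make_error_condition(error e) {
  return {static_cast<int>(e), net_category()};
}

class net_category_t : public std::error_category {
 public:
  const char* name() const noexcept override { return "net"; }
  std::string message(int) const override { return "net error"; }
  bool equivalent(int code,
                  const std::error_condition& cond) const noexcept override {
    // The fix: compare cond against the condition built from this code.
    // static_cast<error>(code) is itself implicitly converted to
    // std::error_condition, and error_condition's operator== compares
    // category and value directly, so this cannot recurse back into
    // equivalent().
    return cond == static_cast<error>(code);
  }
};

const std::error_category& net_category() {
  static net_category_t instance;
  return instance;
}
```

With this override, error_code{read_eof, net_category()} == error::read_eof holds; without it, the default cross-type comparison in the legacy setup could report false.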

Member Author

That change surfaced the multiple-close() issue: previously, close() was never called from read_tags_until_next_message() under any condition.

Contributor

thanks! sorry for the delays, i wanted to do some extra testing here. it looks like the default implementations of std::error_category::equivalent() both do the right thing here, so i'd rather call them for this part

i opened #23844 as an example. if that works for you, you're welcome to pull those commits into this pr, or we can merge that one separately - up to you
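A reduced sketch of what delegating to the base class looks like (hypothetical names; the real change is in #23844). The default equivalent() matches conditions within the same category via default_error_condition(), so only cross-category mappings need explicit handling:

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical reduction of cbodley's approach, not the actual crimson code.
enum class error { none = 0, read_eof = 1, connection_reset = 2 };

namespace std {
template <> struct is_error_condition_enum<error> : public true_type {};
}

const std::error_category& net_category();

std::error_condition make_error_condition(error e) {
  return {static_cast<int>(e), net_category()};
}

class net_category_t : public std::error_category {
 public:
  const char* name() const noexcept override { return "net"; }
  std::string message(int) const override { return "net error"; }
  bool equivalent(int code,
                  const std::error_condition& cond) const noexcept override {
    // The default implementation matches conditions within this category
    // (it compares cond against default_error_condition(code)).
    if (std::error_category::equivalent(code, cond)) {
      return true;
    }
    // Extra mappings onto generic error conditions.
    switch (static_cast<error>(code)) {
    case error::connection_reset:
      return cond == std::errc::connection_reset;
    default:
      return false;
    }
  }
};

const std::error_category& net_category() {
  static net_category_t instance;
  return instance;
}
```

This keeps same-category comparisons on the standard library's well-tested path while preserving the extra code-to-condition mappings.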

Member Author

thanks for the fix! I'll pull #23844 into this pr, because the multiple-call of SocketConnection::close() depends on this fix.

Member Author

done.

// already closing
assert(close_ready.valid());
return close_ready.get_future();
} else {
Contributor

nit: drop the else block to save indentation

Contributor

edit: looking closer at read_tags_until_next_message(), it doesn't wait on the future returned by close(). maybe we should just change it to only call close() if state != state_t::closed?

@cyx1231st (Member Author), Aug 30, 2018

maybe we should just change it to only call close() if state != state_t::closed?

I think not: close() in read_tags_until_next_message() can be called first, which means we would also have to add the condition state != state_t::closed in shutdown(). But then shutdown() could not wait for that close().

Another possible case is that multiple continuations may need to wait for shutdown().

As implementations grow more complex, IMO we need to provide an interface that allows multiple continuations to wait for close(), or multiple calls to functions that internally wait for close().
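For illustration only, the shape of such an interface can be sketched in standard C++ (seastar provides seastar::shared_future for this purpose; the class and member names below are hypothetical):

```cpp
#include <cassert>
#include <future>

// Hypothetical connection with an idempotent close(): the first call
// starts the teardown and every later call waits on the same shared
// state, mirroring the close_ready pattern discussed in this PR.
class Connection {
  std::promise<void> close_promise_;
  std::shared_future<void> close_ready_;  // valid once close() starts
  bool closing_ = false;

 public:
  std::shared_future<void> close() {
    if (closing_) {
      // already closing: every caller waits on the same shared state
      assert(close_ready_.valid());
      return close_ready_;
    }
    closing_ = true;
    close_ready_ = close_promise_.get_future().share();
    // ... tear down in/out streams here, then signal completion:
    close_promise_.set_value();
    return close_ready_;
  }
};
```

A plain seastar::future supports only a single continuation, which is why multiple waiters on close() need a shared future or a promise-based wrapper like close_ready.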

Contributor

multiple continuations may need to wait for shutdown().

could you share with us a use case? i think the connection will only be closed by a single party. hence it will be the one closing the connection who is waiting for this future.

Member Author

No, there is no use case yet (I just want to explain the limitations).

The real issue happening now is that the connection can be closed in multiple places (read_tags_until_next_message() and shutdown()) and in arbitrary order. Just check out the "error_code fix", run unittest_seastar_messenger, and this assertion failure appears:

/root/ceph/src/crimson/net/SocketMessenger.cc: In function 'virtual void ceph::net::SocketMessenger::unregister_conn(ceph::net::ConnectionRef)' thread 7f8e90099e40
/root/ceph/src/crimson/net/SocketMessenger.cc: 190: FAILED ceph_assert(found != connections.end())
 ceph version 14.0.0-2679-g2001145 (20011453a24fde224328b352e91643a279396d04) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f8e8744f8b3]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f8e8744fa35]
 3: ./unittest_seastar_messenger() [0x4d9b33]
 4: (ceph::net::SocketConnection::close()+0x5e) [0x51351e]
 5: ./unittest_seastar_messenger() [0x4b9243]
 6: ./unittest_seastar_messenger() [0x503558]
 7: (seastar::reactor::run_tasks(seastar::reactor::task_queue&)+0x85) [0x5620f5]
 8: (seastar::reactor::run_some_tasks()+0xe7) [0x562477]
 9: (seastar::reactor::run()+0xcf0) [0x5bdfa0]
 10: (seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&)+0xa54) [0x53ff64]
 11: (seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&)+0xd0) [0x540dd0]
 12: (seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&)+0xd0) [0x540f50]
 13: (main()+0x602) [0x4bb4f2]
 14: (__libc_start_main()+0xf0) [0x7f8e858cf830]
 15: (_start()+0x29) [0x4bce79]
Aborting on shard 0.

return out.write(reinterpret_cast<const char*>(&h.reply), sizeof(h.reply))
seastar::net::packet msg{reinterpret_cast<const char*>(&h.reply),
sizeof(h.reply)};
return out.write(std::move(msg))
Contributor

this code gets duplicated a bit - can we make a helper function for it?

template <typename T>
seastar::net::packet make_static_packet(const T& value) {
  return { reinterpret_cast<const char*>(&value), sizeof(value) };
}

then these calls could just be return out.write(make_static_packet(h.reply));

Member Author

repushed and fixed

@@ -801,8 +801,7 @@ seastar::future<> SocketConnection::client_handshake(entity_type_t peer_type,
validate_peer_addr(saddr, peer_addr);

if (my_addr != caddr) {
// take peer's address for me, but preserve my port/nonce
caddr.set_port(my_addr.get_port());
// take peer's address for me, but preserve my nonce
Contributor

why would you want to learn the port from the peer? i understand that the client side's port is always 0 here, but it does not hurt, right? also, SimpleMessenger::learned_addr() preserves both the nonce and the port.

i am fine either way as long as it does not break the messenger.

Member Author

It will be clearer for debugging/logging purposes if the client doesn't drop this port information. And I'm testing the multiple-connection scenario now...

Currently AsyncMessenger::learned_addr() may learn the port from the peer under certain conditions. If this breaks seastar-msgr, we can store the port inside SocketConnection separately.

@@ -680,20 +698,22 @@ seastar::future<> SocketConnection::handle_connect_reply(msgr_tag_t tag)
missing) {
return fault();
}
if (h.reply.tag == CEPH_MSGR_TAG_SEQ) {
if (tag == CEPH_MSGR_TAG_SEQ) {
Contributor

ahh, i just stumbled on this bug. thanks for fixing it!

// already closing
assert(close_ready.valid());
return close_ready.get_future();
}
@cyx1231st (Member Author), Sep 2, 2018

@tchaikov

could you share with us a use case? i think the connection will only be closed by a single party. hence it will be the one closing the connection who is waiting for this future.

No, there is no use case yet (just want to explain the limitations).

The real issue happening now is that SocketConnection::close() can be called in multiple places (inside read_tags_until_next_message() and shutdown()) and in arbitrary order. To reproduce, check out the "error_code fix" and run unittest_seastar_messenger; the same assertion failure and backtrace shown above will appear.

Also please refer to the make check failure in #23844

@cyx1231st (Member Author)

https://github.com/ceph/ceph/blob/master/src/crimson/net/SocketMessenger.cc#L80-L90

  return conn->server_handshake()
    .handle_exception([conn] (std::exception_ptr eptr) {
      // close the connection before returning errors
      return seastar::make_exception_future<>(eptr)
1)      .finally([conn] { return conn->close(); });
    }).then([this, conn] {
      dispatcher->ms_handle_accept(conn);
      // dispatch messages until the connection closes or the dispatch
      // queue shuts down
2)    return dispatch(std::move(conn));
    });

I think there is a logic defect here: 1) closes the connection before connections.emplace(conn->get_peer_addr(), conn) has been executed in 2).
Maybe SocketConnection::close() still needs improvement, e.g. only unregister the conn after it has been pushed into SocketMessenger::connections?

@tchaikov (Contributor)

tchaikov commented Sep 2, 2018

probably we can move the get_messenger()->unregister_conn(this); out of SocketConnection::close(), and add a finally() block at the end of SocketMessenger::dispatch(), so we can call it in that finally() block?
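A plain-C++ sketch of that shape (hypothetical names; try/catch stands in for seastar's finally()): the connection is registered before dispatching and unregistered in exactly one place when dispatch finishes, so unregister_conn() can never run before the emplace:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct Connection { std::string peer_addr; };
using ConnectionRef = std::shared_ptr<Connection>;

class Messenger {
  std::map<std::string, ConnectionRef> connections_;
 public:
  void register_conn(ConnectionRef c) {
    connections_.emplace(c->peer_addr, c);
  }
  void unregister_conn(const ConnectionRef& c) {
    auto found = connections_.find(c->peer_addr);
    assert(found != connections_.end());  // the assertion that was firing
    connections_.erase(found);
  }
  std::size_t size() const { return connections_.size(); }

  void dispatch(ConnectionRef conn) {
    register_conn(conn);
    try {
      // ... read and dispatch messages until the connection closes ...
    } catch (...) {
      unregister_conn(conn);  // seastar version: done in a finally() block
      throw;
    }
    unregister_conn(conn);  // single, well-ordered unregistration point
  }
};
```

Because registration and unregistration now bracket dispatch() rather than hiding inside close(), a close() triggered during the handshake can no longer unregister a connection that was never registered.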

@cyx1231st (Member Author)

cyx1231st commented Sep 3, 2018

Agreed @tchaikov .

But there was a discussion about the ordering: first unregister the connection, then close the in/out_streams. Since a closing connection is marked state_t::close, maybe we can check the state in SocketConnection::send() and choose not to send if it is state_t::close?

Should we also change the interface of SocketConnection::send() to return an error or throw an exception (which is preferred?) when the send fails?

@tchaikov (Contributor)

tchaikov commented Sep 3, 2018

about the order to firstly unregister the connection, then close the in/out_streams.

ahh, that's true.

Should we also change the interface of SocketConnection::send() to return error or throw exception (which is prefered?) if the send is failed?

i like this idea, as it is simpler. but i think the users of SocketConnection won't agree on this. they use the messenger in this way:

  1. connect() to peer, and
  2. use the connection returned by connect() to send a message
  3. wait for the reply, and handle it.

if we cannot ensure that the caller gets a usable connection, we will need to complicate this flow in a strange way: repeatedly calling connect() and send() until the latter does not throw.

maybe, another option is to

  1. change the state of failed SocketConnection to state_t::closed under circumstances that we consider it as a failure, for instance, connection_reset or read_eof. and
  2. change SocketMessenger::lookup_conn() to let it check for the found connection's state before returning it, so it also returns nullptr if the connection is already closed.

and, yes. we need to be careful about the "replacing" logic where the new connection takes the place of the existing failed one.

if we recreate a new connection when the existing one is found to be closed, then i guess we can unregister the connection in SocketMessenger::dispatch(). am i right?
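Option 2 above might be sketched like this (hypothetical names, not the actual SocketMessenger interface): lookup_conn() treats an already-closed connection as absent, so the caller falls back to creating and registering a fresh one.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

enum class state_t { open, closing, closed };

struct SocketConnection {
  state_t state = state_t::open;
};
using ConnectionRef = std::shared_ptr<SocketConnection>;

class SocketMessenger {
  std::map<std::string, ConnectionRef> connections_;
 public:
  void register_conn(const std::string& addr, ConnectionRef c) {
    connections_[addr] = std::move(c);
  }
  // Returns nullptr for unknown *and* already-closed connections, so the
  // caller never receives a connection it cannot use.
  ConnectionRef lookup_conn(const std::string& addr) const {
    auto found = connections_.find(addr);
    if (found == connections_.end() ||
        found->second->state == state_t::closed) {
      return nullptr;
    }
    return found->second;
  }
};
```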

cbodley and others added 3 commits September 3, 2018 16:29
Signed-off-by: Casey Bodley <cbodley@redhat.com>
consult base class for equivalence (which will match errors within
this category) before applying the extra error code mappings

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
@cyx1231st (Member Author)

cyx1231st commented Sep 3, 2018

  1. change the state of failed SocketConnection to state_t::closed under circumstances that we consider it as a failure, for instance, connection_reset or read_eof. and

Right, I see there is already some code that handles failures with SocketConnection::close().

  1. change SocketMessenger::lookup_conn() to let it check for the found connection's state before returning it, so it also returns nullptr if the connection is already closed.

The caller will then get an immediately usable connection ref. But IMO that doesn't guarantee the caller's connection ref will stay usable, since it can be remotely closed at any time. So SocketConnection::send() still needs to check the connection state, right?

Ahh, I see send_ready is meant to handle the remote close ...

if we recreate a new connection if the existing one is found or closed, then i guess we can unreigster the connection in SocketMessenger::dispatch(). am i right?

I think we should be careful when the existing connection is still in the state_t::open state. We might need to cross-check how the async-msgr deals with this...

@cyx1231st cyx1231st force-pushed the wip-seastar-msgr-fix branch 2 times, most recently from 749182b to b9061bb Compare September 4, 2018 05:16
@cyx1231st (Member Author)

added commit
crimson/net: remove unused and dup global_seq

It is possible that, while closing a connection during a msgr shutdown,
a second close() is called in `read_tags_until_next_message()`

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
seastar doesn't support mixed buffered writes and zero-copy writes.

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
@tchaikov (Contributor) left a comment

LGTM in general. albeit not very comfortable with 2d61065 . probably we can fix it in another PR though. @cbodley what do you think?

@tchaikov (Contributor)

tchaikov commented Sep 4, 2018

Caller will get an immediate usable connection ref then. But it won't guarantee that caller's connection ref can always be usable IMO..., since it can be remotely closed at anytime.

right. but i think that's what the lossy policy is for. we try our best to return a usable connection to connect()'s caller; in the worst case, the msg will be resent on the connection replacing the closed one.

@cbodley (Contributor)

cbodley commented Sep 4, 2018

not very comfortable with 2d61065 . probably we can fix it in another PR though. @cbodley what do you think?

agreed. i think we should take this to fix the unit test, then put some more thought into a design that makes sense for both lossy/lossless

@tchaikov tchaikov merged commit b7b3ce8 into ceph:master Sep 5, 2018
@cyx1231st cyx1231st deleted the wip-seastar-msgr-fix branch September 5, 2018 08:43