
crimson/net: miscellaneous fixes to seastar-msgr #23816

Merged
merged 8 commits into from
Sep 5, 2018

Conversation

cyx1231st (Member)

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

@cyx1231st (Member Author)

@tchaikov @cbodley

@@ -61,7 +61,11 @@ const std::error_category& net_category()
}

bool equivalent(int code, const std::error_condition& cond) const noexcept override {
switch (static_cast<error>(code)) {
const auto &err = static_cast<error>(code);
if (cond == err) {
Contributor

can you clarify why this is needed? what comparison was failing without this part?

Member Author

The direct reason is that, e.g., e.code() == error::read_eof evaluates to false even though the two values look identical in a debugger.

Under the hood (IMO), the right-hand error::read_eof is implicitly converted to std::error_condition because of struct is_error_condition_enum<ceph::net::error> : public true_type {};. When comparing a left-hand std::error_code with a right-hand std::error_condition, the legacy equivalent() returns false even when the code matches the cond.

Fix: return true if cond equals static_cast<error>(code) in the new equivalent(). Note that the right-hand static_cast<error>(code) is also implicitly converted to std::error_condition, so the comparison goes through error_condition's own operator==, and there is no need to worry about infinite recursion.
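For illustration, the failure mode and this fix can be reduced to a self-contained sketch (the enum values and category names below are hypothetical placeholders, not the actual crimson definitions):

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical reduction of ceph::net's error category.
enum class error { none = 0, read_eof = 1, connection_reset = 2 };

// This specialization is what makes `error` implicitly convertible to
// std::error_condition, turning the right-hand side of
// `e.code() == error::read_eof` into an error_condition.
namespace std {
template <> struct is_error_condition_enum<error> : public true_type {};
}

const std::error_category& net_category();

// Found via ADL when the implicit conversion happens.
std::error_condition make_error_condition(error e) {
  return {static_cast<int>(e), net_category()};
}

class net_category_t : public std::error_category {
 public:
  const char* name() const noexcept override { return "net"; }
  std::string message(int) const override { return "net error"; }
  bool equivalent(int code,
                  const std::error_condition& cond) const noexcept override {
    // The fix: compare cond against the condition built from this code.
    // static_cast<error>(code) is itself implicitly converted to
    // std::error_condition, and error_condition's operator== compares
    // category and value directly, so this cannot recurse back into
    // equivalent().
    return cond == static_cast<error>(code);
  }
};

const std::error_category& net_category() {
  static net_category_t instance;
  return instance;
}
```

With this override, error_code{read_eof, net_category()} == error::read_eof holds; without it, the default cross-type comparison in the legacy setup could report false.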

Member Author

That change surfaced the multiple-close() issue: previously, close() was never called from read_tags_until_next_message() under any condition.

Contributor

thanks! sorry for the delays, i wanted to do some extra testing here. it looks like the default implementations of std::error_category::equivalent() both do the right thing here, so i'd rather call them for this part

i opened #23844 as an example. if that works for you, you're welcome to pull those commits into this pr, or we can merge that one separately - up to you
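A reduced sketch of what delegating to the base class looks like (hypothetical names; the real change is in #23844). The default equivalent() matches conditions within the same category via default_error_condition(), so only cross-category mappings need explicit handling:

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical reduction of cbodley's approach, not the actual crimson code.
enum class error { none = 0, read_eof = 1, connection_reset = 2 };

namespace std {
template <> struct is_error_condition_enum<error> : public true_type {};
}

const std::error_category& net_category();

std::error_condition make_error_condition(error e) {
  return {static_cast<int>(e), net_category()};
}

class net_category_t : public std::error_category {
 public:
  const char* name() const noexcept override { return "net"; }
  std::string message(int) const override { return "net error"; }
  bool equivalent(int code,
                  const std::error_condition& cond) const noexcept override {
    // The default implementation matches conditions within this category
    // (it compares cond against default_error_condition(code)).
    if (std::error_category::equivalent(code, cond)) {
      return true;
    }
    // Extra mappings onto generic error conditions.
    switch (static_cast<error>(code)) {
    case error::connection_reset:
      return cond == std::errc::connection_reset;
    default:
      return false;
    }
  }
};

const std::error_category& net_category() {
  static net_category_t instance;
  return instance;
}
```

This keeps same-category comparisons on the standard library's well-tested path while preserving the extra code-to-condition mappings.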

Member Author

thanks for the fix! I'll pull #23844 into this pr, because the multiple-call of SocketConnection::close() depends on this fix.

Member Author

done.

// already closing
assert(close_ready.valid());
return close_ready.get_future();
} else {
Contributor

nit: drop the else block to save indentation

Contributor

edit: looking closer at read_tags_until_next_message(), it doesn't wait on the future returned by close(). maybe we should just change it to only call close() if state != state_t::closed?

@cyx1231st (Member Author), Aug 30, 2018

maybe we should just change it to only call close() if state != state_t::closed?

I think not: close() in read_tags_until_next_message() can be called first, which means we would also have to add the condition state != state_t::closed in shutdown(). But then shutdown() could not wait for that close().

Another possible case is that multiple continuations may need to wait for shutdown().

As implementations grow more complex, IMO we need to provide an interface that allows multiple continuations to wait for close(), or multiple calls to functions that internally wait for close().
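For illustration only, the shape of such an interface can be sketched in standard C++ (seastar provides seastar::shared_future for this purpose; the class and member names below are hypothetical):

```cpp
#include <cassert>
#include <future>

// Hypothetical connection with an idempotent close(): the first call
// starts the teardown and every later call waits on the same shared
// state, mirroring the close_ready pattern discussed in this PR.
class Connection {
  std::promise<void> close_promise_;
  std::shared_future<void> close_ready_;  // valid once close() starts
  bool closing_ = false;

 public:
  std::shared_future<void> close() {
    if (closing_) {
      // already closing: every caller waits on the same shared state
      assert(close_ready_.valid());
      return close_ready_;
    }
    closing_ = true;
    close_ready_ = close_promise_.get_future().share();
    // ... tear down in/out streams here, then signal completion:
    close_promise_.set_value();
    return close_ready_;
  }
};
```

A plain seastar::future supports only a single continuation, which is why multiple waiters on close() need a shared future or a promise-based wrapper like close_ready.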

Contributor

multiple continuations may need to wait for shutdown().

could you share with us a use case? i think the connection will only be closed by a single party. hence it will be the one closing the connection who is waiting for this future.

Member Author

No, there is no use case yet (I just want to explain the limitations).

The real issue happening now is that the connection can be closed in multiple places (read_tags_until_next_message() and shutdown()) and in arbitrary order. Just check out the "error_code fix", run unittest_seastar_messenger, and this assertion failure appears:

/root/ceph/src/crimson/net/SocketMessenger.cc: In function 'virtual void ceph::net::SocketMessenger::unregister_conn(ceph::net::ConnectionRef)' thread 7f8e90099e40
/root/ceph/src/crimson/net/SocketMessenger.cc: 190: FAILED ceph_assert(found != connections.end())
 ceph version 14.0.0-2679-g2001145 (20011453a24fde224328b352e91643a279396d04) nautilus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f8e8744f8b3]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f8e8744fa35]
 3: ./unittest_seastar_messenger() [0x4d9b33]
 4: (ceph::net::SocketConnection::close()+0x5e) [0x51351e]
 5: ./unittest_seastar_messenger() [0x4b9243]
 6: ./unittest_seastar_messenger() [0x503558]
 7: (seastar::reactor::run_tasks(seastar::reactor::task_queue&)+0x85) [0x5620f5]
 8: (seastar::reactor::run_some_tasks()+0xe7) [0x562477]
 9: (seastar::reactor::run()+0xcf0) [0x5bdfa0]
 10: (seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&)+0xa54) [0x53ff64]
 11: (seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&)+0xd0) [0x540dd0]
 12: (seastar::app_template::run(int, char**, std::function<seastar::future<> ()>&&)+0xd0) [0x540f50]
 13: (main()+0x602) [0x4bb4f2]
 14: (__libc_start_main()+0xf0) [0x7f8e858cf830]
 15: (_start()+0x29) [0x4bce79]
Aborting on shard 0.

return out.write(reinterpret_cast<const char*>(&h.reply), sizeof(h.reply))
seastar::net::packet msg{reinterpret_cast<const char*>(&h.reply),
sizeof(h.reply)};
return out.write(std::move(msg))
Contributor

this code gets duplicated a bit - can we make a helper function for it?

template <typename T>
seastar::net::packet make_static_packet(const T& value) {
  return { reinterpret_cast<const char*>(&value), sizeof(value) };
}

then these calls could just be return out.write(make_static_packet(h.reply));

Member Author

repushed and fixed

@@ -801,8 +801,7 @@ seastar::future<> SocketConnection::client_handshake(entity_type_t peer_type,
validate_peer_addr(saddr, peer_addr);

if (my_addr != caddr) {
// take peer's address for me, but preserve my port/nonce
caddr.set_port(my_addr.get_port());
// take peer's address for me, but preserve my nonce
Contributor

why would you want to learn the port from the peer? i understand that the client side's port is always 0 here, but it does not hurt, right? also, SimpleMessenger::learned_addr() preserves both the nonce and the port.

i am fine either way as long as it does not break the messenger.

Member Author

It will be clearer for debugging/logging purposes if the client doesn't drop this port information. And I'm testing the multiple-connection scenario now...

Currently AsyncMessenger::learned_addr() may learn the port from the peer under certain conditions. If this breaks seastar-msgr, we can store the port inside SocketConnection separately.

@@ -680,20 +698,22 @@ seastar::future<> SocketConnection::handle_connect_reply(msgr_tag_t tag)
missing) {
return fault();
}
if (h.reply.tag == CEPH_MSGR_TAG_SEQ) {
if (tag == CEPH_MSGR_TAG_SEQ) {
Contributor

ahh, i just stumbled on this bug. thanks for fixing it!

// already closing
assert(close_ready.valid());
return close_ready.get_future();
}
@cyx1231st (Member Author), Sep 2, 2018

@tchaikov

could you share with us a use case? i think the connection will only be closed by a single party. hence it will be the one closing the connection who is waiting for this future.

No, there is no use case yet (just want to explain the limitations).

The real issue happening now is that SocketConnection::close() can be called in multiple places (inside read_tags_until_next_message() and shutdown()) and in arbitrary order. To reproduce, check out the "error_code fix" and run unittest_seastar_messenger; the same assertion failure and backtrace shown above will appear.

Also please refer to the make check failure in #23844

@cyx1231st (Member Author)

https://github.com/ceph/ceph/blob/master/src/crimson/net/SocketMessenger.cc#L80-L90

  return conn->server_handshake()
    .handle_exception([conn] (std::exception_ptr eptr) {
      // close the connection before returning errors
      return seastar::make_exception_future<>(eptr)
1)      .finally([conn] { return conn->close(); });
    }).then([this, conn] {
      dispatcher->ms_handle_accept(conn);
      // dispatch messages until the connection closes or the dispatch
      // queue shuts down
2)    return dispatch(std::move(conn));
    });

I think there is a logic defect here: 1) closes the connection before connections.emplace(conn->get_peer_addr(), conn) has been executed in 2).
Maybe SocketConnection::close() still needs improvement, e.g. only unregister the conn after it has been pushed into SocketMessenger::connections?

@tchaikov (Contributor)

tchaikov commented Sep 2, 2018

probably we can move the get_messenger()->unregister_conn(this); out of SocketConnection::close(), and add a finally() block at the end of SocketMessenger::dispatch(), so we can call it in that finally() block?
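A plain-C++ sketch of that shape (hypothetical names; try/catch stands in for seastar's finally()): the connection is registered before dispatching and unregistered in exactly one place when dispatch finishes, so unregister_conn() can never run before the emplace:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct Connection { std::string peer_addr; };
using ConnectionRef = std::shared_ptr<Connection>;

class Messenger {
  std::map<std::string, ConnectionRef> connections_;
 public:
  void register_conn(ConnectionRef c) {
    connections_.emplace(c->peer_addr, c);
  }
  void unregister_conn(const ConnectionRef& c) {
    auto found = connections_.find(c->peer_addr);
    assert(found != connections_.end());  // the assertion that was firing
    connections_.erase(found);
  }
  std::size_t size() const { return connections_.size(); }

  void dispatch(ConnectionRef conn) {
    register_conn(conn);
    try {
      // ... read and dispatch messages until the connection closes ...
    } catch (...) {
      unregister_conn(conn);  // seastar version: done in a finally() block
      throw;
    }
    unregister_conn(conn);  // single, well-ordered unregistration point
  }
};
```

Because registration and unregistration now bracket dispatch() rather than hiding inside close(), a close() triggered during the handshake can no longer unregister a connection that was never registered.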

@cyx1231st (Member Author)

cyx1231st commented Sep 3, 2018

Agreed @tchaikov .

But there was a discussion about the ordering: first unregister the connection, then close the in/out_streams. Since a closing connection is marked state_t::close, maybe we can check the state in SocketConnection::send() and choose not to send if it is state_t::close?

Should we also change the interface of SocketConnection::send() to return an error or throw an exception (which is preferred?) when the send fails?

@tchaikov (Contributor)

tchaikov commented Sep 3, 2018

about the order to firstly unregister the connection, then close the in/out_streams.

ahh, that's true.

Should we also change the interface of SocketConnection::send() to return error or throw exception (which is prefered?) if the send is failed?

i like this idea, as it is simpler. but i think the users of SocketConnection won't agree on this. they use the messenger in this way:

  1. connect() to peer, and
  2. use the connection returned by connect() to send a message
  3. wait for the reply, and handle it.

if we cannot ensure that the caller gets a usable connection, we will need to complicate this flow in a strange way: repeatedly calling connect() and send() until the latter does not throw.

maybe, another option is to

  1. change the state of failed SocketConnection to state_t::closed under circumstances that we consider it as a failure, for instance, connection_reset or read_eof. and
  2. change SocketMessenger::lookup_conn() to let it check for the found connection's state before returning it, so it also returns nullptr if the connection is already closed.

and, yes. we need to be careful about the "replacing" logic where the new connection takes the place of the existing failed one.

if we recreate a new connection when the existing one is found to be closed, then i guess we can unregister the connection in SocketMessenger::dispatch(). am i right?
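Option 2 above might be sketched like this (hypothetical names, not the actual SocketMessenger interface): lookup_conn() treats an already-closed connection as absent, so the caller falls back to creating and registering a fresh one.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

enum class state_t { open, closing, closed };

struct SocketConnection {
  state_t state = state_t::open;
};
using ConnectionRef = std::shared_ptr<SocketConnection>;

class SocketMessenger {
  std::map<std::string, ConnectionRef> connections_;
 public:
  void register_conn(const std::string& addr, ConnectionRef c) {
    connections_[addr] = std::move(c);
  }
  // Returns nullptr for unknown *and* already-closed connections, so the
  // caller never receives a connection it cannot use.
  ConnectionRef lookup_conn(const std::string& addr) const {
    auto found = connections_.find(addr);
    if (found == connections_.end() ||
        found->second->state == state_t::closed) {
      return nullptr;
    }
    return found->second;
  }
};
```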

cbodley and others added 3 commits September 3, 2018 16:29
Signed-off-by: Casey Bodley <cbodley@redhat.com>
consult base class for equivalence (which will match errors within
this category) before applying the extra error code mappings

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
@cyx1231st (Member Author)

cyx1231st commented Sep 3, 2018

  1. change the state of failed SocketConnection to state_t::closed under circumstances that we consider it as a failure, for instance, connection_reset or read_eof. and

Right, I see there is already some code that handles failures with SocketConnection::close().

  1. change SocketMessenger::lookup_conn() to let it check for the found connection's state before returning it, so it also returns nullptr if the connection is already closed.

The caller will then get an immediately usable connection ref. But IMO that doesn't guarantee the caller's connection ref will stay usable, since it can be remotely closed at any time. So SocketConnection::send() still needs to check the connection state, right?

Ahh, I see send_ready is meant to handle the remote close ...

if we recreate a new connection if the existing one is found or closed, then i guess we can unreigster the connection in SocketMessenger::dispatch(). am i right?

I think we should be careful when the existing connection is still in the state_t::open state. We might need to cross-check how the async-msgr deals with this...

@cyx1231st cyx1231st force-pushed the wip-seastar-msgr-fix branch 2 times, most recently from 749182b to b9061bb Compare September 4, 2018 05:16
@cyx1231st (Member Author)

added commit
crimson/net: remove unused and dup global_seq

It is possible that, while closing a connection during a msgr shutdown,
a second close() is called in `read_tags_until_next_message()`

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
Signed-off-by: Yingxin <yingxin.cheng@intel.com>
seastar doesn't support mixed buffered writes and zero-copy writes.

Signed-off-by: Yingxin <yingxin.cheng@intel.com>
@tchaikov (Contributor) left a comment

LGTM in general. albeit not very comfortable with 2d61065 . probably we can fix it in another PR though. @cbodley what do you think?

@tchaikov (Contributor)

tchaikov commented Sep 4, 2018

Caller will get an immediate usable connection ref then. But it won't guarantee that caller's connection ref can always be usable IMO..., since it can be remotely closed at anytime.

right. but i think that's what the lossy policy is for. we try our best to return a usable connection to connect()'s caller; in the worst case, the msg will be resent on the connection replacing the closed one.

@cbodley (Contributor)

cbodley commented Sep 4, 2018

not very comfortable with 2d61065 . probably we can fix it in another PR though. @cbodley what do you think?

agreed. i think we should take this to fix the unit test, then put some more thought into a design that makes sense for both lossy/lossless

@tchaikov tchaikov merged commit b7b3ce8 into ceph:master Sep 5, 2018
@cyx1231st cyx1231st deleted the wip-seastar-msgr-fix branch September 5, 2018 08:43