msg/async: support of non-block connect in async messenger #5681

zuiwanyuan · 2015-08-27T09:23:24Z

The async messenger should avoid every potential blocking point, but it currently uses a blocking connect.
This change adds support for a non-blocking connect.

loic-bot · 2015-08-27T12:09:47Z

SUCCESS: http://jenkins.ceph.dachary.org/job/ceph/7272/

SUCCESS http://jenkins.ceph.dachary.org/job/ceph/LABELS=centos-7&&x86_64/7272/

yuyuyu101 · 2015-08-27T13:13:22Z

src/msg/async/net_handler.cc

+{
+  int ret = ::connect(sd, (sockaddr*)&addr.addr, addr.addr_size());
+
+  if (ret < 0 && errno != EISCONN) {


Couldn't it still be connecting (EINPROGRESS)?

In reconnect, the connection must have been woken by poll (epoll) with a read event. That means either the connection has been established or the peer has sent some data to me. We can call reconnect to test whether it is connected. If it is connected, errno will be EISCONN. If it is not connected, we go to fail and then start connecting again.

Actually, it may not only be woken up by epoll. AsyncConnection has time events which can also wake it up, and such an event may be a legacy (stale) one, so the connection can be woken while it is still connecting.
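The check discussed in this thread can be sketched as a small classifier. This is only an illustration, not the code from the patch: `ConnectStatus` and `classify_reconnect` are hypothetical names, and the errno meanings are the ones POSIX documents for `connect()` on a non-blocking socket. Because a timer event can wake the connection before the handshake finishes, the sketch assumes "still in progress" has to be told apart from a real failure.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical outcome type, used only for this illustration.
enum class ConnectStatus { Connected, Pending, Failed };

// Interpret the result of calling ::connect() again on a non-blocking socket
// after the first connect() attempt has already been issued.
//   ret: return value of ::connect()
//   err: errno captured immediately after the call
ConnectStatus classify_reconnect(int ret, int err) {
  assert(ret == 0 || ret == -1);     // ::connect() returns only 0 or -1
  if (ret == 0 || err == EISCONN)
    return ConnectStatus::Connected; // handshake has completed
  if (err == EINPROGRESS || err == EALREADY)
    return ConnectStatus::Pending;   // handshake still running, wait for an event
  return ConnectStatus::Failed;      // e.g. ECONNREFUSED or ETIMEDOUT
}
```

A caller would pass the return value and errno of its own `::connect()` call, for example `classify_reconnect(ret, errno)`, and only treat `Failed` as a reason to tear the socket down.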

loic-bot · 2015-08-28T05:57:23Z

SUCCESS: http://jenkins.ceph.dachary.org/job/ceph/7297/

SUCCESS http://jenkins.ceph.dachary.org/job/ceph/LABELS=centos-7&&x86_64/7297/

zuiwanyuan · 2015-08-28T07:52:51Z

In my test, the second commit has a serious problem: I close the fd in fault(). Let me correct it and test again.

loic-bot · 2015-08-28T19:34:11Z

SUCCESS: http://jenkins.ceph.dachary.org/job/ceph/7314/

loic-bot · 2015-08-29T15:04:57Z

SUCCESS: http://jenkins.ceph.dachary.org/job/ceph/7331/

yuyuyu101 · 2015-09-03T10:00:59Z

src/msg/async/AsyncConnection.cc

        if (r < 0) {
+          if (first_connect == true) {


What is "first_connect" used for?

On the first call to nonblock_connect, if the reconnect call fails, we don't need to go to fail and retry; we just break. In other cases, we must retry.
When many peers are down, retrying can keep the msg thread busy and can also cause other peers to be marked down wrongly.

If r < 0, we should goto fail. Only when r == EINPROGRESS do we need to break and retry later.

Do you mean "errno == EINPROGRESS"? Oh, at this point the read event still exists. Yes, you are right.

“2015-09-06 13:31:51.563489 7fcde1ffb700 20 -- :/1551015 >> 172.16.50.12:6789/0 conn(0x7fcde4171090 sd=18 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=1).process state is STATE_CONNECTING_RE, prev state is STATE_CONNECTING
2015-09-06 13:31:51.563497 7fcde1ffb700 10 NetHandler reconnect reconnect: Operation already in progress”
Subsequent calls to reconnect may return EALREADY, so should we also check for it and let it break?
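The retry policy agreed on in this thread can be sketched as a tiny state machine. This is a simplified model under stated assumptions, not the AsyncConnection code: `STATE_CONNECTING` and `STATE_CONNECTING_RE` are the names seen in the log above, while `CONNECTED`, `FAULT`, `step`, and `run` are hypothetical stand-ins introduced for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <vector>

// STATE_CONNECTING and STATE_CONNECTING_RE come from the log above;
// CONNECTED and FAULT are hypothetical placeholders for what follows them.
enum class State { STATE_CONNECTING, STATE_CONNECTING_RE, CONNECTED, FAULT };

// Model one wake-up of the connection (poll event or timer). `err` stands for
// the errno left by the ::connect() call made in this wake-up, 0 on success.
State step(State s, int err) {
  assert(err >= 0);                      // errno values are never negative
  if (s == State::CONNECTED || s == State::FAULT)
    return s;                            // terminal states in this sketch
  if (err == 0 || err == EISCONN)
    return State::CONNECTED;             // handshake finished
  if (err == EINPROGRESS || err == EALREADY)
    return State::STATE_CONNECTING_RE;   // break and retry on the next wake-up
  return State::FAULT;                   // real error: go to fail
}

// Feed a sequence of wake-ups through the state machine.
State run(const std::vector<int>& errs) {
  State s = State::STATE_CONNECTING;
  for (int err : errs)
    s = step(s, err);
  return s;
}
```

For example, `run({EINPROGRESS, EALREADY, EISCONN})` models a connection that is woken twice before the handshake completes. It stays in `STATE_CONNECTING_RE` without going through the fail path, so the msg thread does no retry work in between.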

loic-bot · 2015-09-06T06:54:05Z

SUCCESS: http://jenkins.ceph.dachary.org/job/ceph/7541/

SUCCESS http://jenkins.ceph.dachary.org/job/ceph/LABELS=ubuntu-14.04&&x86_64/7541/

tchaikov · 2015-09-09T04:57:53Z

@zuiwanyuan you need to rebase your pull request against master to resolve the conflicts.

And could you please:

- prefix the title of your commit message with something like "msg/async:"
- add a "Fixes: #12802" line at the end of the commit message of the related commits
- add a Signed-off-by line to all your commits, per https://github.com/ceph/ceph/blob/master/SubmittingPatches#L62

zuiwanyuan · 2015-09-09T06:32:20Z

@tchaikov I have closed this PR and created a new one: "msg/async: support of non-block connect in async messenger #5848"

tchaikov · 2015-09-09T06:46:42Z

@zuiwanyuan, you could also git push -f to your remote wip-nonblock-connect branch after fixing your commits. That would update this pull request automatically.

zuiwanyuan · 2015-09-09T06:52:18Z

@tchaikov, thank you. I get it now.

support of non-block connect in async messenger

9331d78

tchaikov assigned yuyuyu101 Aug 27, 2015

yuyuyu101 reviewed Aug 27, 2015

we should remove "bool reconnect", and make loop stay at "STATE_CONNECTING_RE"

27b0da8

zuiwanyuan added 2 commits August 28, 2015 22:06

we can't close fd in fault when state is STATE_CONNECTING_RE

1ac35be

we must avoid no sence connection retry. When a lot of connection retry happen , it may mark osd down wrongly.

2f7eeca

when it is loopback, it first reconnect may success. So at this time, we must set first_connect false

ac49b56

yuyuyu101 added feature core labels Sep 3, 2015

yuyuyu101 reviewed Sep 3, 2015

using EINPROGRESS & EALREADY to prevent msg thread from being busy is the right way.

ccdfebe

zuiwanyuan changed the title ~~support of non-block connect in async messenger~~ msg/async: support of non-block connect in async messenger Sep 9, 2015

zuiwanyuan closed this Sep 9, 2015

zuiwanyuan deleted the wip-nonblock-connect branch September 9, 2015 06:26
