New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msg/async: Postpone bind if network stack is not ready #14414

Merged
merged 3 commits into from Apr 19, 2017

Conversation

Projects
None yet
4 participants
@Adirl

Adirl commented Apr 9, 2017

No description provided.

@Adirl

This comment has been minimized.

@@ -291,7 +291,15 @@ void AsyncMessenger::ready()
{
ldout(cct,10) << __func__ << " " << get_myaddr() << dendl;
stack->start();
stack->ready();
if (pending_bind) {

This comment has been minimized.

@yuyuyu101

yuyuyu101 Apr 11, 2017

Member

no, we don't want to introduce odd bind logic here, we need to maintain bind request in RDMAStack side.

This comment has been minimized.

@amirv

amirv Apr 11, 2017

Contributor

I thought so too, but the logic to find a free port to listen on is below the RDMAStack.

This comment has been minimized.

@amirv

amirv Apr 12, 2017

Contributor

@yuyuyu101 ,

As I said, I also preferred to have all the change inside RDMAStack.

But, there were two major problems I saw, with implementing according to that approach:

  1. In Processor::start() the listener socket needs to supply a fd to to register the event on. In RDMA-CM case, I need to start the librdmacm stack for that - and we don't want to do it before the fork. The only other option I can think of, is to have an intermediate eventfd that will be created before the fork. And after the fork, the real channel->fd event will write to the eventfd.
    This will open the door for other bugs - having 2 events for every listener event is not efficient nor good coding, the less events and contexts, the better we could sleep :)
    here is a quick pseudo code to explain what I meant:
    1. RDMAServerSocket() { my_eventfd = eventfd(); }
    2. int RDMAServerSocket::fd() { return my_eventfd; };
    3. post_fork:
      1. channel = rdma_create_id(); // This will call ibv_open_device()
      2. create_file_event(channel->fd) { listen_handler; }
      3. RDMAServerSocket::listen_handler() { write(listen_handler, ...); }
  2. If we call Processor::bind() before the fork, and postpone it inside RDMAStack(), which port should I return that Processor will save in Processor::listenaddr? Should I clone the code with the logic of scanning for free ports from Processor::bind into RDMAStack?

This is why I implemented it like I did. It should have zero influence on other stack types, only for RDMAStack the logic of pending_listen should be used. And as you can see the change is minimal and very not risky.
Considering the options, I think it the best option we have.

Amir

@Adirl @orendu

@yuyuyu101

This comment has been minimized.

Member

yuyuyu101 commented Apr 12, 2017

retest this please

@Adirl

This comment has been minimized.

Adirl commented Apr 13, 2017

Hi @yuyuyu101

"retest this please" is for Jenkins? for us?

@yuyuyu101

This comment has been minimized.

Member

yuyuyu101 commented Apr 13, 2017

for jenkins

@Adirl

This comment has been minimized.

Adirl commented Apr 13, 2017

yes, thought so...

yuyuyu101 and others added some commits Apr 4, 2017

msg/async: remove duplicate NetworkStack::start which exists in const…
…ructor

Issue: 995322
Change-Id: Ib8f7f25b9b360bea97147ce27c369c6baefcc6f3
Signed-off-by: Haomai Wang <haomai@xsky.com>
msg/async: Postpone bind if network stack is not ready
RDMAStack shouldn't access hardware from the parent process.
The only reason to do so, is because bind is called before the fork.
After this patch the bind is postponed until the NetworkStack reports
that it is ready to bind.
For NetworkStack types will always return true, except the RDMAStack
which will return true only after the fork (after
AsyncMessenger::ready() is called).

This patch is based on a patch by Haomai Wang <haomai@xsky.com>

Issue: 995322
Change-Id: I1d0d0d52db0a339b9319680c18ee05cde87b2b64
Signed-off-by: Amir Vadai <amir@vadai.me>
msg/async/rdma: Use RDMA resources only after fork
Thanks to previous patch [1], no need to access RDMA resources before
the fork. Initialize Infiniband class only before a connection is
established or a listener is created. [1] is making sure that the call
to RDMAWorker::listen() is postponed till after the fork.

[1] - 7393db45644d ("msg/async: Postpone bind if network stack is not ready")

Issue: 995322
Change-Id: I8ea246b2e03c8c9533bc324b2b8d142eb3d1ed4d
Signed-off-by: Amir Vadai <amir@vadai.me>
@Adirl

This comment has been minimized.

Adirl commented Apr 13, 2017

@yuyuyu101 @amirv @DanielBar-On
Added 3rd commit that removes the use of pre/post_fork
Testing is in progress

@Adirl

This comment has been minimized.

Adirl commented Apr 18, 2017

@yuyuyu101 @amirv
Did not pass all tests,
need to fix, please do not merge

@liewegas liewegas changed the title from msg/async: Postpone bind if network stack is not ready to msg/async: Postpone bind if network stack is not ready Apr 19, 2017

@liewegas liewegas merged commit abd52cb into ceph:master Apr 19, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@Adirl Adirl deleted the Adirl:is_ready branch Apr 20, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment