msg/async/rdma: Move resource handling to Device #14088

Merged
merged 3 commits into from Mar 27, 2017

5 participants
@Adirl

Adirl commented Mar 22, 2017

No description provided.

saritz and others added some commits Mar 19, 2017

msg/async/rdma: Fix crash when running traffic
Root cause: the same thread locked the mutex twice.
Solution: add erase_qpn_lockless() and update the caller to use the new function.

issue:1007313

Change-Id: Iacc6b217331381219e269682d3fae4061a1922f9
Signed-off-by: Sarit Zubakov <saritz@mellanox.com>
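
The double-lock fix described in this commit follows a common pattern: split the operation into a locking wrapper and a lockless body, so that code already holding the mutex can call the body directly. Below is a minimal sketch of that pattern. Only the name `erase_qpn_lockless()` comes from the commit message; the `QpTable` class, its members, and `erase_conn()` are hypothetical, and the actual Ceph code may be organized differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical table mapping QP numbers to connection ids.
class QpTable {
 public:
  void register_qpn(uint32_t qpn, int conn_id) {
    std::lock_guard<std::mutex> l(lock_);
    qp_conns_[qpn] = conn_id;
  }

  // Locking wrapper: for callers that do NOT hold lock_.
  void erase_qpn(uint32_t qpn) {
    std::lock_guard<std::mutex> l(lock_);
    erase_qpn_lockless(qpn);
  }

  // Hypothetical caller that holds lock_ for the whole scan.
  // Calling erase_qpn() here would lock the non-recursive mutex
  // a second time on the same thread, which is undefined behavior.
  void erase_conn(int conn_id) {
    std::lock_guard<std::mutex> l(lock_);
    for (auto it = qp_conns_.begin(); it != qp_conns_.end();) {
      const uint32_t qpn = it->first;
      const int owner = it->second;
      ++it;  // advance first: erasing invalidates only the erased node
      if (owner == conn_id) erase_qpn_lockless(qpn);
    }
  }

  std::size_t size() {
    std::lock_guard<std::mutex> l(lock_);
    return qp_conns_.size();
  }

 private:
  // Lockless body: the caller must already hold lock_.
  void erase_qpn_lockless(uint32_t qpn) { qp_conns_.erase(qpn); }

  std::mutex lock_;
  std::map<uint32_t, int> qp_conns_;
};
```

Keeping the lockless variant private is one way to make the "caller must hold the lock" contract harder to violate from outside the class.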
msg/async/rdma: Move resource handling to Device
Move rx/tx completion channel and completion queue from RDMAStack into
Device class
Move SRQ and QP handling from Infiniband into Device class.
Adapt polling() to poll on multiple devices - will be used in a later
commit.

On construction Device will create a completion channel. This will
be done for every HCA on the server.
Only the Device in use will be initialized (init() will be called) and
cq, srq, MemoryManager and the rest of the resources will be allocated.

This patch also introduces RDMADispatcher::poll_{start,stop}.
They are used to stop the polling thread before destroying the
Device resources.

Issue: 995322
Change-Id: I79bfdc687ab690a46c05e271a436b33d8dba0182
Signed-off-by: Amir Vadai <amir@vadai.me>
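
The ordering this commit message describes (stop the polling thread first, then release the Device resources) can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `Dispatcher`, `poll_once()`, and the fields of `Device` are hypothetical stand-ins, and only the names `poll_start`, `poll_stop`, `polling`, `init`, and `uninit` appear in the discussion.

```cpp
#include <atomic>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-HCA device.
struct Device {
  bool initialized = false;
  std::atomic<int> polls{0};
  void init() { initialized = true; }     // real code would allocate cq, srq, ...
  void uninit() { initialized = false; }  // real code would release them
  void poll_once() {
    if (initialized) ++polls;             // uninitialized devices are skipped
  }
};

// Hypothetical dispatcher that polls every device from one thread.
class Dispatcher {
 public:
  explicit Dispatcher(std::vector<Device*> devs) : devices_(std::move(devs)) {}
  ~Dispatcher() { poll_stop(); }

  void poll_start() {
    if (thread_.joinable()) return;  // already running
    done_ = false;
    thread_ = std::thread([this] { polling(); });
  }

  // Returns only after the polling thread has exited, so the caller
  // can safely tear down device resources afterwards.
  void poll_stop() {
    if (!thread_.joinable()) return;
    done_ = true;
    thread_.join();
  }

 private:
  void polling() {
    while (!done_) {
      for (Device* d : devices_) d->poll_once();
      std::this_thread::yield();
    }
  }

  std::vector<Device*> devices_;
  std::atomic<bool> done_{false};
  std::thread thread_;
};
```

In this sketch a caller would run `poll_stop()` and only then `uninit()` on each device, which is the sequence the commit message gives as the reason for introducing the two functions.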
msg/async/rdma: Move async event handling to Device
issue: none

Change-Id: I5e1ee73bb2d0d751774eaeb7dd2950c704894caf
Signed-off-by: Amir Vadai <amir@vadai.me>
@Adirl

Adirl commented Mar 23, 2017

@yuyuyu101
please take a look when you can
thanks

void Infiniband::handle_pre_fork()
{
  device->uninit();

@yuyuyu101

yuyuyu101 Mar 23, 2017

Member

With multiple devices, I think we need to uninit all of them?

@amirv

amirv Mar 24, 2017

Contributor

Correct. This commit is preparation only. It initializes only the selected device (from the config parameter).
I will soon send you a patch that enables using more than one device, and uninit() will be modified accordingly.
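
For illustration, the multi-device behavior raised in this thread (every device gets uninitialized before a fork, while only selected devices are ever initialized) might look roughly like the sketch below. It is not taken from the follow-up patch: the `DeviceList` interface, `uninit_all()`, and the device names are assumptions made for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical device: cheap to construct, expensive resources only after init().
class Device {
 public:
  explicit Device(std::string name) : name_(std::move(name)) {}
  void init() { initialized_ = true; }     // would allocate cq, srq, memory manager
  void uninit() { initialized_ = false; }  // would release them
  bool is_initialized() const { return initialized_; }
  const std::string& name() const { return name_; }

 private:
  std::string name_;
  bool initialized_ = false;
};

// Hypothetical list holding one Device per HCA on the host.
class DeviceList {
 public:
  explicit DeviceList(const std::vector<std::string>& names) {
    for (const auto& n : names)
      devices_.emplace_back(new Device(n));
  }

  // Returns nullptr when no device has the given name.
  Device* get(const std::string& name) {
    for (auto& d : devices_)
      if (d->name() == name) return d.get();
    return nullptr;
  }

  // What a multi-device handle_pre_fork() could call:
  // uninit every device that was actually initialized.
  void uninit_all() {
    for (auto& d : devices_)
      if (d->is_initialized()) d->uninit();
  }

  std::size_t initialized_count() const {
    std::size_t n = 0;
    for (const auto& d : devices_)
      if (d->is_initialized()) ++n;
    return n;
  }

 private:
  std::vector<std::unique_ptr<Device>> devices_;
};
```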

@yuyuyu101

Member

yuyuyu101 commented Mar 23, 2017

Good cleanup. We add multi-device support here, but even after this PR we still only make use of one device, right?

@yuyuyu101

Member

yuyuyu101 commented Mar 23, 2017

With multiple devices, I think Infiniband only needs to operate on DeviceList instead of a single device.

@amirv

Contributor

amirv commented Mar 24, 2017

Yes. It is my mistake that I didn't share the patch plan with you:

This is the first of 3 parts in the series. Each part consists of a few patches to make it easier to review.

In this part there are 2 patches that are only cleanup and code rearrangement, without changing the actual logic. Still only one device is supported.

In part 2 we will send some patches that finish the multi-device support and make init, uninit, and polling multi-device aware. Once this is accepted, everything is ready for the last part.

In the last part we will add basic RDMA-CM support, which is our target for the end of the month.
A config param will let the user select at runtime whether to use RDMA-CM or TCP connection establishment.

Thanks,
Amir

@yuyuyu101

Member

yuyuyu101 commented Mar 25, 2017

@Adirl did we pass the basic tests?

@Adirl

Adirl commented Mar 26, 2017

@yuyuyu101 yes we did

@Adirl

Adirl commented Mar 27, 2017

@yuyuyu101
anything else needed here?

Thanks

@yuyuyu101 yuyuyu101 merged commit 53e0344 into ceph:master Mar 27, 2017

3 checks passed

Signed-off-by: all commits in this PR are signed
Unmodifed Submodules: submodules for project are unmodified
default: Build finished.

@Adirl Adirl deleted the Adirl:rdma-cm-2 branch Apr 18, 2017
