Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msg/async: ibverbs/rdma support #11531

Merged
merged 7 commits into from Nov 4, 2016
Merged

msg/async: ibverbs/rdma support #11531

merged 7 commits into from Nov 4, 2016

Conversation

yuyuyu101
Copy link
Member

No description provided.

@yuyuyu101 yuyuyu101 changed the title wip ss msg/async: ibverbs/rdma support Oct 18, 2016
@yuyuyu101
Copy link
Member Author

a good sharp to merge, pass ceph_test_msgr, basic cluster io tests.

@avnerbh
Copy link
Contributor

avnerbh commented Oct 23, 2016

Hi Haomai,
Looks very comprehensive work
I tried to build this pull request and compilation failed on test applications

[ 89%] Building CXX object src/test/msgr/CMakeFiles/ceph_test_async_networkstack.dir/test_async_networkstack.cc.o
/mnt/data/avnerb/ceph/src/test/msgr/test_async_networkstack.cc: In lambda function:
/mnt/data/avnerb/ceph/src/test/msgr/test_async_networkstack.cc:166:61: error: no matching function for call to 'ServerSocket::accept(ConnectedSocket*, SocketOptions&, entity_addr_t*)'
       r = bind_socket.accept(&srv_socket, options, &cli_addr);

```                                                             ^
Am I missing something?

my build command was: 
git submodule update --init --recursive && ./install-deps.sh && ./do_cmake.sh && cd build && make

@yuyuyu101 yuyuyu101 force-pushed the wip-rdma branch 2 times, most recently from 1d212b0 to 5828f28 Compare October 23, 2016 08:44
@yuyuyu101
Copy link
Member Author

@avnerbh sorry, the later tests failed to adopt new interface, fixed !

@avnerbh
Copy link
Contributor

avnerbh commented Oct 23, 2016

10x - compilation now succeeded!

@Adirl
Copy link

Adirl commented Oct 23, 2016

Looks great!

i see in Infiniband.cc MAX_SHARED_RX_SGE_COUNT = 1;

is scatter gather supported?

Thanks, Adir

@liewegas
Copy link
Member

Can you squash down the 'make sure enough buffert' commit?

@yuyuyu101
Copy link
Member Author

@liewegas done

@yuyuyu101
Copy link
Member Author

hmm, actually we don't need this because sender memory won't be multi sge

On Sun, Oct 23, 2016 at 5:27 PM, Adir lev notifications@github.com wrote:

Looks great!

i see in Infiniband.cc MAX_SHARED_RX_SGE_COUNT = 1;

is scatter gather supported?

Thanks, Adir


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#11531 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AA72YWz01obAaUT8ENwSEfUsvgxLnxRQks5q2yiCgaJpZM4KZyu4
.

Best Regards,

Wheat

@mattbenjamin
Copy link
Contributor

@yuyuyu101 @avnerbh @Adirl independent of the specific merits of this change, what do we all think (esp., Mellanox) the larger concentration of effort should be in the codebase for infiniband/roce support over time? That is, I'm not wedded to every detail of XioMessenger, but it feels like there's a case to be made that libxio's solutions for buffer management and credit management, in particular, are best of class; more broadly, is encapsulation within AsyncMessenger rather than as a distinct Messenger implementation actually the best layering going forward? OTOH, question for the Mellanox folks: is libxio no longer a preferred approach for you all? from an architecture viewpoint, at the moment, I'm tempted to say that the general approaches being taken (esp. buiding on RC) are mostly going the same direction as libxio, but with less maturity. OTOH, areas for improvement in XioMessenger integration involved overall resource management and buffer lifecycle--how much easier, if any, does pulling verbs into a single Messenger make things, and in what ways? many of the problems appear to me to be convergent, like different buffer lifecycle and 0-copy strategies; What are we missing?

@yuyuyu101
Copy link
Member Author

@mattbenjamin from my current view, I have two thoughts:

  1. native ibverbs support is expected to behavior simple and quick to use. Obviously the current ibverbs backend is short of memory management and other features. Since it's direct and simple, we can make it much suitable for osd needs
  2. another idea I think is integrate libxio into AsyncMessenger to become another backend, because asyncmessenger did little in io path but spent a lot in connection lifecycle mananement, most of times is consumed by underlying backend. Since dpdkstack supports zerocopy, there is no barrier to make libxio uses zero-copy under async messenger. This way maybe better for the xio&ceph integration I think.

@yuyuyu101
Copy link
Member Author

Via developing async msgr, I think the most difficult thing for a new msgr needs to go though enough qa rounds. Most magic things are coming from session management not network io path. That's why I thought multi backend under async msgr is better.

@Adirl
Copy link

Adirl commented Oct 26, 2016

@mattbenjamin @yuyuyu101
We are currently evaluating the new code
hope to come back with some insights soon.

Adir

@avnerbh
Copy link
Contributor

avnerbh commented Oct 26, 2016

@yuyuyu101

I get link errors when building with "-DWITH_RDMA=ON" about unresolved symbols for symbols that should come from libibverbs.so. ( The location of libibverbs.so in my installation is /usr/lib/ )
It can be that osd/mon are all built correctly and that the errors are just in compilation of test applications.
I tried to fix the cmake files, but unfortunately, I don't master cmake enough yet.

Below, is an example of the link errors:

Scanning dependencies of target unittest_osd_types
[ 66%] Building CXX object src/test/osd/CMakeFiles/unittest_osd_types.dir/types.cc.o
[ 66%] Linking CXX executable ../../../bin/ceph_test_rados_api_misc
../../../lib/libcommon.a(RDMAStack.cc.o): In function `Infiniband::CompletionChannel::ack_events()':
/mnt/data/avnerb/ceph/src/msg/async/rdma/Infiniband.h:439: undefined reference to `ibv_ack_cq_events'

@mattbenjamin
Copy link
Contributor

@yuyuyu101 there doesn't seem to be a likelihood that Ceph will say "no" to anything a Messenger implementation wants to carefully do, but I do think the Messenger and Dispatcher abstractions are valuable, and it would be worth examining how to strengthen it/remove magical aspects, if we know of them.

@yuyuyu101
Copy link
Member Author

@mattbenjamin sorry, I agree your point. But I'm more focus on that we can uses this shortly. Hmm, I guess you mean Messenger and Dispatcher is valuable to pontential backend? Yes, but I don't have clear mind how to redefine interface to separate session semantics and io path. Maybe I can have a try in msgr2 define. We can move magical things into a common component and make backend can have a more rich semantics.

@yuyuyu101
Copy link
Member Author

@avnerbh ah, I tried locally, it seemed ok to me:
Scanning dependencies of target ceph_test_rados_api_misc
[100%] Building CXX object src/test/librados/CMakeFiles/ceph_test_rados_api_misc.dir/misc.cc.o
[100%] Linking CXX executable ../../../bin/ceph_test_rados_api_misc
[100%] Built target ceph_test_rados_api_misc

@avnerbh
Copy link
Contributor

avnerbh commented Oct 26, 2016

@yuyuyu101
Hi
i am trying to figure out the difference between our environments?

  1. can you tell me the location of libibverbs.so in your environment? is it only in /usr/lib as I have it?
  2. for cmake, I only provide "-DWITH_RDMA=ON", can you tell if you provide additional arguments?
  3. particularly, running make ceph_test_librbd fails for me. Does this succeed for you?

TIA,
Avner

@markhpc
Copy link
Member

markhpc commented Oct 26, 2016

I just wanted to chime in regarding the xio vs async messenger question. I'd say let the people that are motivated develop each approach and let's see where we end up. There's enough complexity here that we might find unexpected benefits/downsides to either and encouraging people to follow their own instincts can only help us in the long run.

@yuyuyu101
Copy link
Member Author

@avnerbh

  1. could you check your /etc/ld.so.conf? I think you can add a new line "/usr/lib" to this file then run ldconfig
    /usr/lib64/libibverbs.so.1
    /usr/lib64/libibverbs.so.1.0.0
    /usr/lib64/libibverbs.so
    /usr/lib64/mlnx_ofed/valgrind/libibverbs.so.1
    /usr/lib64/mlnx_ofed/valgrind/libibverbs.so.1.0.0
    /usr/lib64/mlnx_ofed/valgrind/libibverbs.so
    /usr/lib64/libibverbs.a
  2. yes, only this
  3. passed
    [100%] Linking CXX executable ../../../bin/ceph_test_librbd
    [100%] Built target ceph_test_librbd

@mattbenjamin
Copy link
Contributor

mattbenjamin commented Oct 26, 2016

@markhpc I realize that's where we'll end up. So my constructive thinking is, a) as already written, let's not neglect Messenger, Dispatcher, etc, abstractions; b) from RDMA viewpoint, maybe we haven't done as much as we could to discuss the underlying implementation choices and techniques--if we have some time for that, it may help all efforts.

@avnerbh
Copy link
Contributor

avnerbh commented Oct 26, 2016

@yuyuyu101
Thanks!
I tried the /etc/ld.so.conf and ldconfig things that you wrote, and still compilation fails on the same place.
BTW, I am using Ubuntu 16.04 - can it be related?

anyhow, "make ceph-osd" does succeed in my environment; Hence, the problem is related only to test applications

Also notice, that even on machine in which I didn't add /usr/lib to /etc/ld.so.conf I still succeed to see libibverbs with ldconfig:

avnerb@clx-ssp-055 ceph git:pull_ibverbs$ sudo ldconfig -v 2> /dev/null | grep ibverbs
        libibverbs.so.1 -> libibverbs.so.1.0.0

@yuyuyu101
Copy link
Member Author

yuyuyu101 commented Oct 27, 2016

Oh, big sorry!

Because my coding env and build env is two different machine, I
checked in my build env and found this:

[root@localhost ceph]# git diff
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 4617c70..482e855 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -555,7 +555,7 @@ endif (WITH_MGR)
 set(librados_config_srcs
   librados-config.cc)
 add_executable(librados-config ${librados_config_srcs})
-target_link_libraries(librados-config librados global
${BLKID_LIBRARIES} ${IBVERBS_LIBRARIES}
+target_link_libraries(librados-config librados global
${BLKID_LIBRARIES} ${RDMA_LIBRARIES}
   ${CMAKE_DL_LIBS})

 install(TARGETS librados-config DESTINATION bin)
@@ -683,7 +683,7 @@ set(ceph_osd_srcs
 add_executable(ceph-osd ${ceph_osd_srcs}
   $<TARGET_OBJECTS:common_util_obj>)
 add_dependencies(ceph-osd erasure_code_plugins)
-target_link_libraries(ceph-osd osd os global ${BLKID_LIBRARIES}
${IBVERBS_LIBRARIES})
+target_link_libraries(ceph-osd osd os global ${BLKID_LIBRARIES}
${RDMA_LIBRARIES})
 if(WITH_FUSE)
   target_link_libraries(ceph-osd ${FUSE_LIBRARIES})
 endif()

I forget to submit this change to PR!

Signed-off-by: Zhi Wang <wangzhi@xsky.com>
Signed-off-by: Haomai Wang <haomai@xsky.com>
Signed-off-by: Haomai Wang <haomai@xsky.com>
…uilt

Signed-off-by: Haomai Wang <haomai@xsky.com>
Signed-off-by: Haomai Wang <haomai@xsky.com>
Signed-off-by: Haomai Wang <haomai@xsky.com>
Signed-off-by: Haomai Wang <haomai@xsky.com>
@liewegas liewegas added this to the kraken milestone Nov 1, 2016
@yuriw yuriw merged commit ef087a4 into ceph:master Nov 4, 2016
@yuyuyu101 yuyuyu101 deleted the wip-rdma branch November 4, 2016 14:44
@avnerbh
Copy link
Contributor

avnerbh commented Nov 6, 2016

@yuriw
great for merging. please notice that tests results do not don't really reflect the new code since you probably compiled it without "-DWITH_RDMA=ON". (Otherwise, I believe you should see that compilation fails in Ubuntu)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants