Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msg/async/rdma: add an option for choosing different RoCE protocol #31517

Merged
merged 2 commits into from Nov 16, 2019

Conversation

changchengx
Copy link
Contributor

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard backend
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@changchengx
Copy link
Contributor Author

@rouming Please help review.

@changchengx
Copy link
Contributor Author

@tchaikov Please help review.

@rouming
Copy link
Contributor

rouming commented Nov 11, 2019

Looks good to me.

@@ -39,7 +40,7 @@ Port::Port(CephContext *cct, struct ibv_context* ictxt, uint8_t ipn): ctxt(ictxt
}

lid = port_attr.lid;

assert(gid_idx < port_attr.gid_tbl_len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd suggest use ceph_assert() here instead. because assert() will be optimized out on any Release build. and we do need to fail no matter running Release or Debug build.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Thanks a lot

@@ -1148,6 +1148,10 @@ std::vector<Option> get_global_options() {
.set_default(1000)
.set_description(""),

Option("ms_async_rdma_gid_idx", Option::TYPE_INT, Option::LEVEL_ADVANCED)
.set_default(0)
.set_description(""),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a short description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll refine it.

@@ -153,6 +153,7 @@ OPTION(ms_async_rdma_receive_queue_len, OPT_U32)
OPTION(ms_async_rdma_support_srq, OPT_BOOL)
OPTION(ms_async_rdma_port_num, OPT_U32)
OPTION(ms_async_rdma_polling_us, OPT_U32)
OPTION(ms_async_rdma_gid_idx, OPT_INT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this option is not read on critical path, probably we don't need to add it to legacy_config_opts.h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it from legacy_config_opts.h since rdma is not on critical path.

src/msg/async/rdma/Infiniband.cc Outdated Show resolved Hide resolved
1. use gid_idx as parameter to get gid instead of getting the first(0) gid
by default.
int ibv_query_gid(struct ibv_context *context, uint8_t port_num, int
	index, union ibv_gid *gid)
2. gid is initialized in class type declaration, there's no need to
initialized in the member construction list.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
With Mellanox NIC, different gid_idx maps to different gid combined with
different RoCE protocol e.g. RoCEv1 or RoCEv2
1. Add configuration ms_async_rdma_gid_idx to set gid_idx.
1. Use "0" as default gid_idx to keep compatibility.
2. Initialize gid_idx in class member construction list.

Signed-off-by: Changcheng Liu <changcheng.liu@aliyun.com>
@tchaikov tchaikov changed the title add gid_idx configuration option to select gid for different RoCE protocol msg/async/rdma: add an option for choosing different RoCE protocol Nov 11, 2019
@tchaikov tchaikov added messenger Issues involving one of the Ceph messenger implementations needs-qa wip-kefu-testing labels Nov 11, 2019
@tchaikov tchaikov merged commit 06811fc into ceph:master Nov 16, 2019
@changchengx changchengx deleted the rdma_gid branch December 16, 2019 01:13
@ugiwgh
Copy link

ugiwgh commented Nov 10, 2021

Is there any documents about RoCE protocol of ceph?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
messenger Issues involving one of the Ceph messenger implementations needs-qa wip-kefu-testing
Projects
None yet
4 participants