Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prov/verbs: refactor initialization and fi_getinfo #5259

Merged
merged 6 commits into from Sep 12, 2019

Conversation

a-ilango
Copy link
Contributor

@a-ilango a-ilango commented Sep 9, 2019

This PR removes some technical debt and refactors verbs provider initialization and fi_getinfo to reduce overhead and improve logging.

This patch refactors how the different EP types are handled without changing
the behavior. If hints specified one of the two endpoints fi_ibv_get_matching_info
would have filtered out the unmatching EP type and only one of fi_ibv_handle_sock_addr
or fi_ibv_handle_ib_ud_addr would be run.

Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
@a-ilango a-ilango requested a review from shefty September 9, 2019 22:04
- define a new function which gets rdma info (if available) given a struct
  ifaddrs. This function doesn't have failure logging unlike fi_ibv_get_rai_id.
  Failure are expected when trying out different interfaces and so it's good
  to keep the logging verbosity low.
- also return ENODATA when no valid IPoIB interfaces are found.

Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
@swelch
Copy link
Contributor

swelch commented Sep 9, 2019

@a-ilango - With this PR I'm seeing 2 Node, 1 PPN XRC test fail with the following assert (RC seems fine):

prov/rxm/src/rxm.h:271: rxm_cmap_acquire_handle: Assertion `fi_addr < cmap->num_allocated' failed.

backtrace:
(gdb) bt
#0 0x00007fb3c5a5a160 in raise () from /lib64/libc.so.6
#1 0x00007fb3c5a5b741 in abort () from /lib64/libc.so.6
#2 0x00007fb3c5a5275a in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb3c5a527d2 in __assert_fail () from /lib64/libc.so.6
#4 0x00007fb3c549488e in rxm_cmap_acquire_handle (cmap=0x771180, fi_addr=18446744073709551615) at prov/rxm/src/rxm.h:271
#5 0x00007fb3c54952f9 in rxm_ep_prepare_tx (rxm_ep=0x642100, dest_addr=18446744073709551615, rxm_conn=0x7fff1bbd38f8) at prov/rxm/src/rxm.h:982
#6 0x00007fb3c549a41e in rxm_ep_tinject_fast (ep_fid=0x642100, buf=0x0, len=0, dest_addr=18446744073709551615, tag=17592186044417)
at prov/rxm/src/rxm_ep.c:1819

I don't know why the destination if_addr is crazy with XRC and not RC.

@a-ilango
Copy link
Contributor Author

a-ilango commented Sep 9, 2019

@swelch That's weird. Do you know which commit cause in this series causes the issue?

@swelch
Copy link
Contributor

swelch commented Sep 10, 2019

@swelch That's weird. Do you know which commit cause in this series causes the issue?

I'll check.

@swelch
Copy link
Contributor

swelch commented Sep 10, 2019

@a-ilango - looks like it is commit 8d0b230, the earlier ones were working.

@a-ilango
Copy link
Contributor Author

looks like it is commit 8d0b230, the earlier ones were working.

Thanks. Does the test node have some verbs devices which don't have an IP address configured? If possible please verify fi_info -p verbs -v doesn't list any fi_info with src_addr empty, especially for the ones with XRC enabled.

Copy link
Member

@shefty shefty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question on the old code, and one concern on a possible memory leak.

if (ret)
continue;

ret = fi_ibv_add_rai(verbs_devs, id, rai);
ret = fi_ibv_add_rai(verbs_devs, dev_name, rai);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think dev_name leaks memory on failure here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed it.

static void fi_ibv_remove_nosrc_info(struct fi_info **info)
{
struct fi_info **fi = info, *next;
while (*fi && ((*fi)->ep_attr->type == FI_EP_MSG)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop assumes that all fi_info structures associated with msg endpoints are grouped together. Is that always the case, even when XRC is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It seems the MSG endpoints may not be grouped together (based on the loop in fi_ibv_init_info) and this function would miss to remove some "no src fi_info". I think the issue wasn't caught since apps are most likely to use the first matching fi_info.

The newer patch would not have this problem as we iterate through the complete list of devices and endpoint types to remove any fi_info that doesn't have IPoIB address configured.

@swelch
Copy link
Contributor

swelch commented Sep 10, 2019

Thanks. Does the test node have some verbs devices which don't have an IP address configured? If possible please verify fi_info -p verbs -v doesn't list any fi_info with src_addr empty, especially for the ones with XRC enabled.

All of the RDMA devices are configured with IPs. If I do a fi_info -p verbs -v I see the RDM endpoint with XRC domain has a NULL src address. If I do fi_info -p "ofi_rxm;verbs" -v the XRC domain is listed first, and the src address null.

Arun C Ilango added 4 commits September 11, 2019 15:42
Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
Rather than detecting RDMA capable interfaces on every fi_getinfo call do it
once instead as part of the verbs provider initialization (fi_ini). This reduces
unnecessary processing.

Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
Removing fi_info which doesn't have src_addr set is not needed anymore since the
previous commit made a change which avoids adding such info for FI_EP_MSG endpoint
type.

Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
The previous code wasn't handling IPv6 or other address types.

Signed-off-by: Arun C Ilango <arun.ilango@intel.com>
@a-ilango
Copy link
Contributor Author

@shefty this PR should be ready to merge once the CI completes.

@a-ilango
Copy link
Contributor Author

LANL CI failure seems to be an irrelevant issue. Merging this.

@a-ilango a-ilango merged commit 24fab94 into ofiwg:master Sep 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants