Use hybrid MR mode with the cxi provider #19148

Merged Mar 7, 2022 (2 commits)

jhh67 (Contributor) commented Jan 31, 2022

As described by @gbtitus:

The libfabric cxi provider for Cassini-based networks now supports the
so-called "hybrid" local memory registration mode. For functions that
accept a local MR descriptor, this allows the client to pass the
non-NULL MR descriptor (pointer) if the local address is already
registered, or NULL to indicate that the local memory is not registered
and the provider should use its internal MR cache. (Absent this hybrid
mode the provider always uses its internal MR cache, which is a source
of contention for multi-threaded clients such as Chapel and in certain
cases SHMEM.)
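
As a minimal sketch of the choice hybrid mode offers the client, the helper below returns a descriptor when a local buffer lies inside memory the client registered itself, and NULL otherwise. The `reg_region` table and `local_desc` function are hypothetical names for illustration, not Chapel runtime or libfabric code; in a real client the stored descriptor would typically come from `fi_mr_desc()` and the result would be passed as the `desc` argument of calls such as `fi_read()` or `fi_write()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one region the client has registered itself.
 * In real libfabric code `desc` would hold the local MR descriptor
 * for that region. */
struct reg_region {
    uintptr_t base;   /* start address of the registered region */
    size_t    len;    /* length of the region in bytes */
    void     *desc;   /* descriptor to hand to the provider */
};

/* Return the descriptor if [addr, addr+len) lies wholly inside one
 * registered region. Otherwise return NULL, which in hybrid mode asks
 * the provider to use its internal MR cache for this buffer. */
static void *local_desc(const struct reg_region *tab, size_t n,
                        const void *addr, size_t len)
{
    assert(tab != NULL || n == 0);
    uintptr_t a = (uintptr_t)addr;
    for (size_t i = 0; i < n; i++) {
        /* Written to avoid overflow: compare the offset into the region
         * against the space left after reserving `len` bytes. */
        if (a >= tab[i].base && len <= tab[i].len &&
            a - tab[i].base <= tab[i].len - len) {
            return tab[i].desc;
        }
    }
    return NULL;
}
```

With this shape, buffers in pre-registered memory skip the provider's MR cache entirely, and only unregistered buffers fall through to the cached path.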

Resolves Cray/chapel-private#2957.

Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>

jhh67 (Contributor, Author) commented Feb 4, 2022

Regarding rdma/fi_cxi_ext.h: it can't (and won't) be included by a core libfabric include file, so we'll have to conditionally include it ourselves.
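
One possible way to include the header conditionally is a preprocessor probe, sketched below. This is an assumption about how it could be done, not necessarily what this PR does (a build-time configure check would work equally well); `HAVE_CXI_EXT` and `cxi_ext_available` are hypothetical names. `__has_include` is available in recent GCC and Clang and is standard in C23.

```c
/* Include the cxi extension header only if the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<rdma/fi_cxi_ext.h>)
#    include <rdma/fi_cxi_ext.h>
#    define HAVE_CXI_EXT 1
#  endif
#endif

/* Fall back to "not available" when the probe is unsupported or fails. */
#ifndef HAVE_CXI_EXT
#  define HAVE_CXI_EXT 0
#endif

/* Lets the rest of the code ask at run time whether the cxi-specific
 * paths were compiled in. */
static int cxi_ext_available(void)
{
    return HAVE_CXI_EXT;
}
```

Code that uses cxi-only definitions would then be wrapped in `#if HAVE_CXI_EXT` so the same source builds against libfabric installations without the header.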

I confirmed that fi_open_ops returns "Invalid argument" (-22) when called with a name it does not recognize, so we will correctly fall back to non-hybrid MR mode when using a cxi provider that doesn't support hybrid mode. I tested this by compiling against a version of rdma/fi_cxi_ext.h that defined FI_CXI_DOM_OPS_3 and linking against a version of libfabric that did not implement this option.
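
The fallback described here can be sketched without libfabric by modelling the ops lookup as a function pointer. Everything below is a hypothetical stand-in: `open_ops_fn` is a simplified shape (the real `fi_open_ops` takes additional arguments), the two stub providers and the ops name string are invented for illustration, and only the "unknown name returns -EINVAL" behaviour comes from the test reported above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified shape of an ops lookup: returns 0 and sets *ops on success,
 * or a negative errno value such as -EINVAL when the name is unknown. */
typedef int (*open_ops_fn)(const char *name, void **ops);

/* Stand-in for a provider that predates hybrid MR: no name is known. */
static int old_provider_open_ops(const char *name, void **ops)
{
    (void)name;
    *ops = NULL;
    return -EINVAL;
}

static int dummy_ops_table; /* placeholder for a real ops structure */

/* Stand-in for a provider that recognizes one (hypothetical) ops name. */
static int new_provider_open_ops(const char *name, void **ops)
{
    if (strcmp(name, "hypothetical_hybrid_ops") != 0) {
        *ops = NULL;
        return -EINVAL;
    }
    *ops = &dummy_ops_table;
    return 0;
}

/* Return true if the ops table was obtained and hybrid MR mode can be
 * enabled; false means the caller stays in non-hybrid mode. */
static bool try_hybrid_mr(open_ops_fn open_ops, const char *ops_name)
{
    assert(open_ops != NULL && ops_name != NULL);
    void *ops = NULL;
    int rc = open_ops(ops_name, &ops);
    return rc == 0 && ops != NULL;
}
```

Treating any nonzero return as "not supported" keeps the caller correct even if a provider reports the missing ops table with a different error code.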

The CHPL_RT_COMM_OFI_CXI_HYBRID_MR environment variable controls whether
or not the cxi hybrid MR mode is used. It is enabled by default.

Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
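
A minimal sketch of an "enabled by default" switch driven by this variable follows. The helper names and the accepted spellings ("0", "1", "false", "true") are assumptions for illustration; the Chapel runtime has its own environment helpers and may accept different values.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Interpret a string as a boolean, returning `dflt` when the value is
 * unset, empty, or not one of the spellings assumed here. */
static bool parse_bool(const char *val, bool dflt)
{
    if (val == NULL || val[0] == '\0') {
        return dflt;
    }
    if (strcmp(val, "0") == 0 || strcmp(val, "false") == 0) {
        return false;
    }
    if (strcmp(val, "1") == 0 || strcmp(val, "true") == 0) {
        return true;
    }
    return dflt;
}

/* Hybrid MR mode is on unless the variable explicitly turns it off. */
static bool cxi_hybrid_mr_enabled(void)
{
    return parse_bool(getenv("CHPL_RT_COMM_OFI_CXI_HYBRID_MR"), true);
}
```

Under these assumptions, launching a program with `CHPL_RT_COMM_OFI_CXI_HYBRID_MR=0` in the environment would select the non-hybrid path, which is useful for A/B performance comparisons like the ones reported in the review.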

ronawho (Contributor) left a comment

Looks good!

Full -multilocale-only testing was clean. See https://github.com/Cray/chapel-private/issues/3153 for instructions on running testing going forward.

Performance also looks good. This significantly improves thread scaling for fetching atomics and GETs. Here are results for the comm-ops microbenchmark:

Fetching AMO:

cores   main                    hybrid
1        1.55s (0.32 Mops/s)    1.47s ( 0.34 Mops/s)
4        2.32s (0.86 Mops/s)    1.50s ( 1.33 Mops/s)
16       8.56s (0.93 Mops/s)    1.53s ( 5.23 Mops/s)
64      45.73s (0.70 Mops/s)    1.60s (19.98 Mops/s)

GET:

cores   main                    hybrid
1        1.47s (0.34 Mops/s)    1.42s ( 0.35 Mops/s)
4        2.28s (0.88 Mops/s)    1.47s ( 1.36 Mops/s)
16       8.43s (0.95 Mops/s)    1.50s ( 5.35 Mops/s)
64      44.79s (0.71 Mops/s)    1.56s (20.54 Mops/s)

And here are results for 16-node fine-grained indexgather:

cores   main               hybrid
1       0.008 GB/s/node    0.008 GB/s/node
4       0.014 GB/s/node    0.018 GB/s/node
16      0.016 GB/s/node    0.071 GB/s/node
64      0.011 GB/s/node    0.236 GB/s/node
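
To put the scaling in one number per row, the speedup can be computed directly from the figures above; the helper names below are hypothetical and only illustrate the arithmetic. For the 64-core rows this works out to roughly 28.6x for fetching AMOs (45.73s vs. 1.60s) and roughly 21x for indexgather (0.011 vs. 0.236 GB/s/node).

```c
#include <assert.h>

/* Speedup from a reduction in elapsed time: old time / new time. */
static double time_speedup(double main_seconds, double hybrid_seconds)
{
    assert(main_seconds > 0.0 && hybrid_seconds > 0.0);
    return main_seconds / hybrid_seconds;
}

/* Speedup from an increase in throughput: new rate / old rate. */
static double rate_speedup(double main_rate, double hybrid_rate)
{
    assert(main_rate > 0.0 && hybrid_rate > 0.0);
    return hybrid_rate / main_rate;
}
```

At 1 core the same calculation gives values close to 1, which matches the observation that the benefit comes from removing contention between threads rather than from a faster single-threaded path.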

jhh67 merged commit d9079b0 into chapel-lang:main on Mar 7, 2022
jhh67 deleted the T2957 branch on March 7, 2022 14:00