-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ch4/rma: enable native RMA for dynamic window #4804
Conversation
6ffd67c
to
36d91d8
Compare
test:mpich/ch4/most |
70ee44b
to
bd9bdeb
Compare
test:mpich/ch4/most |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of review, looks good so far.
MPL_DBG_MSG_FMT(MPIDI_CH4_DBG_GENERAL, VERBOSE, | ||
(MPL_DBG_FDEST, "Query atomics support: max_count 0x%lx, dtsize 0x%lx", *count, | ||
*dtsize)); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am against permanently adding too much debug messages. It is relatively simple to manually insert the log or printf during debugging when needed. The example here will add a log entry for every acc
atomic operations, right? I think it can easily drown out other debug messages if an application is atomic-heavy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It generates this line only when log is on. For each acc
path, it already generates many lines to track the calling stack (e.g., MPIR_FUNC_VERBOSE_ENTER, MPIR_FUNC_VERBOSE_EXIT). Thus, I don't think adding this message will pollute log. But it provides very useful info for us to figure out the cause why it does not use native atomics (at least it takes me long time to figure it out when testing on a new platform).
The log functionality will not be needed if we always manually add such prints only at debugging time :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log functionality will not be needed if we always manually add such prints only at debugging time :-)
That is my opinion. :) I think we should delete all debug logging from repository unless those logging output makes sense to a non-mpich user.
The log functionality is needed so we can manually add it (printf sometime is too slow or too interfering). But with permanent embedded debug output , when we do need to debug, we'll have to parse the huge log, and potentially getting confused by unrelated output. In theory, the class can be used to filter output. But they are just added complexity. The time I needed to understand what class to filter and how to filter, I could just manually insert my desired debug output and be done with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do believe the log can be useful. One example is that, a user reported to me that he cannot get native atomics on his platform for unknown reason. Then I can easily figure out the cause if the user can send me the log.
Anyway, this is really personally preference. I hope you will not block this PR because of the log preference:-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, this is really personally preference. I hope you will not block this PR because of the log preference:-)
Of course not :). I am just curious that you insisted.
bd9bdeb
to
72bd21d
Compare
test:mpich/ch4/most |
af7a4cb
to
402d0ef
Compare
test:mpich/ch4/most |
@hzhou I have addressed all comments, except the log preference one (please see my response under the comment). Can you please check again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good over all. Here is my final review with some nitpicks.
src/mpid/ch4/netmod/ofi/ofi_win.c
Outdated
0ULL, /* In: flags */ | ||
&MPIDI_OFI_WIN(win).mr, /* Out: memregion object */ | ||
NULL), rc); /* In: context */ | ||
} else if (win->create_flavor == MPI_WIN_FLAVOR_DYNAMIC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is visually disturbing when partial branches omits braces.
738cdcc
to
342930e
Compare
test:mpich/ch4/most |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
It indicates that all win_attach|detach calls for this window is collective over all processes in the communicator. Valid only for dynamic window. False by default.
Previous code always skips memory registration for dynamic window and disables native RMA for it. This patch allows native RMA in such a case through three steps: 1. Instead of always skip fi_mr_reg when base is NULL at window creation, try register if FI_MR_ALLOCATED is not set. If it successes, native RMA can be enabled as it registers the entire virtual address space. 2. If FI_MR_ALLOCATED is set, we have to register only the allocated buffers. For dynamic windows, we can register it and collectively exchange the mr_keys if coll_attach hint is set. 3. At RMA issuing path, we check whether a remote mr_key can be found to cover the entire range of target buffer(disp:disp+extent). If yes, we can leverage native RMA/atomics; otherwise, we fallback to AM path. This ensure correctness for operations that access to multiple attached segments. Two additional avl trees are needed per window: - dwin_mrs, storing local MRs. Used to close MR at win_detach. - dwin_target_mems: storing remote <addr, mr_key>. Used to prepare parameters for native calls. Co-Author: Hui Zhou <hzhou321@anl.gov>
The single test file generates three types of tests: 1. Separate test for each of put, acc, get_acc, fop, and cas accessing to a single attached region with dynamic window (arg=-count=4096 or 1). 2. Seperate test for each of put, acc, and get_acc accessing across two attached regions (with arg=-count=9216). 3. The same set of the above tests but with coll_attach hint.
If all nodes contain only single local process, we can enable ACCU_NO_SHM. Then netmod can control whether all atomics use AM or all use native calls.
342930e
to
8e6a4b7
Compare
Pull Request Description
For dynamic window, we always fallback to ch4 active message in current main branch. However, it can be optimized by using native RMA in certain situations. This PR implements it and improves the MPICH test suite for more dynamic window tests.
This patch enables native RMA through three steps:
Instead of always skip fi_mr_reg when base is NULL at window creation, try register if FI_MR_ALLOCATED is not set. If it successes, native RMA can be enabled as it registers the entire virtual address space.
If FI_MR_ALLOCATED is set, we have to register only the allocated buffers. For dynamic windows, we can register it and collectively exchange the mr_keys if coll_attach hint is set.
At RMA issuing path, we check whether a remote mr_key can be found to cover the entire range of target buffer(disp:disp+extent). If yes, we can leverage native RMA/atomics; otherwise, we fallback to AM path. This ensure correctness for operations that access to multiple attached segments.
Expected Impact
Author Checklist
module: short description
and follows good practice