-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/AM: Set up the new API for ucp active messages in ucp.h #4593
base: master
Are you sure you want to change the base?
Conversation
Can one of the admins verify this patch? |
Not ready yet. Closing. |
Actually it was ready. Reopening. |
retest |
Trying to get a retest |
This PR obsoletes #4256 |
0c5aa3b
to
8f43149
Compare
@shamisp @sssharka Here is the PR incorporating the API revisions from the face-to-face meeting. I still don't understand how the AM is supposed to succeed if the server function gets 'out-of-memory'; for the moment,I'm assuming the server will send a message to the client indicating the problem, and the client will then redrive the AM using the 'old' protocol; but this will likely then get a failure in the server when a ucs_malloc tries to allocate enough store to buffer the whole message. |
I have a version of the implementation here https://github.com/tjcw/ucx/tree/tjcw-am-rendezvous-implementation . At commit 4808554 , it compiles and runs |
8f43149
to
051dfca
Compare
@yosefe what code style guideline have I missed ? I can make the API compatible by adding an extra registration function in the server and leaving the existing one as-is; will that be acceptable ? |
in ucp.h : Indentation, spacing, comment style, newlines .. pls look on other code in ucp.h for examples.
Yes, and move the old one to compat.h |
Given upcoming branching for v1.8.0, I'm curious what is the plan for this API change. |
I would love to see it go in so we can focus on implementation... |
src/ucp/api/ucp.h
Outdated
@@ -2225,7 +2198,7 @@ ucs_status_t ucp_worker_set_am_handler(ucp_worker_h worker, uint16_t id, | |||
* @param [in] datatype Datatype descriptor for the elements in the buffer. | |||
* @param [in] cb Callback that is invoked upon completion of the | |||
* data transfer if it is not completed immediately. | |||
* @param [in] flags For Future use. | |||
* @param [in] flags see enum ucp_send_am_flags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use doxygen format for referencing
src/ucp/api/ucp.h
Outdated
*/ | ||
enum ucp_am_params_field { | ||
/* Flafs field */ | ||
UCP_AM_FIELD_FLAGS = UCS_BIT(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: align by =
src/ucp/api/ucp.h
Outdated
|
||
/* | ||
* The ucp library will drive a function on the arrival of each fragment | ||
* of data from the client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use initiator/target terminology. I'm not sure what client means in this context
@@ -418,12 +418,16 @@ enum ucp_am_cb_flags { | |||
* @brief Flags for sending a UCP Active Message. | |||
* | |||
* Flags dictate the behavior of ucp_am_send_nb | |||
* currently the only flag tells UCP to pass in | |||
* UCP_AM_SEND_REPLY tells UCP to pass in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls document each flag behavior near the flag definition, see other examples of flags doumentation in UCP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unhandled
src/ucp/api/ucp.h
Outdated
*/ | ||
enum ucp_send_am_flags { | ||
UCP_AM_SEND_REPLY = UCS_BIT(0) | ||
UCP_AM_SEND_REPLY = UCS_BIT(0) , |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant space before ,
src/ucp/api/ucp.h
Outdated
ucs_status_t ucp_worker_set_am_handler( | ||
ucp_worker_h worker, | ||
const ucp_am_params_t *params, | ||
uint16_t id, | ||
ucp_am_callback_t cb, | ||
void *arg | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this breaks backward compatibility
- lines are not wrapped according to code style, pls align the style with other functions in UCP
src/ucp/api/ucp.h
Outdated
typedef ucs_status_t (*ucp_am_callback_t)( | ||
void *arg, | ||
void *data, | ||
size_t length, | ||
ucp_ep_h reply_ep, | ||
unsigned flags, | ||
size_t remaining_length, | ||
ucp_am_recv_t *recv | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- indentation/line wrapping, see other function for an example
- breaks API
@tjcw the code style is still not aligned with UCX code style and other functions in ucp.h (it resembles PAMI H file though..). There are too many places to leave a comment for each of them. The bigger problem however is that it still breaks API/ABI by changing existing functions. |
f9f0e7e
to
5d6d2f6
Compare
src/ucp/api/ucp.h
Outdated
* present. It is used to enable backward compatibility support. | ||
*/ | ||
enum ucp_am_params_field { | ||
/* Flags field */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
* UCP active message function | ||
*/ | ||
typedef struct ucp_am_params { | ||
uint64_t field_mask; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
src/ucp/api/ucp.h
Outdated
typedef void (*ucp_am_data_function_t)( void *target, | ||
void *source, | ||
size_t bytes, | ||
void *cookie | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrapping, spaces
src/ucp/api/ucp.h
Outdated
typedef ucs_status_t (*ucp_am_local_function_t)( void *arg, | ||
void *cookie, | ||
ucp_dt_iov_t *iovec, | ||
size_t iovec_length | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrapping, spaces
src/ucp/api/ucp.h
Outdated
typedef ucs_status_t (*ucp_am_rndv_callback_t)( | ||
void *arg, | ||
void *data, | ||
size_t length, | ||
ucp_ep_h reply_ep, | ||
unsigned flags, | ||
size_t remaining_length, | ||
ucp_am_recv_t *recv | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrapping, indentation
@@ -418,12 +418,16 @@ enum ucp_am_cb_flags { | |||
* @brief Flags for sending a UCP Active Message. | |||
* | |||
* Flags dictate the behavior of ucp_am_send_nb | |||
* currently the only flag tells UCP to pass in | |||
* UCP_AM_SEND_REPLY tells UCP to pass in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unhandled
src/ucp/api/ucp.h
Outdated
@@ -2225,7 +2198,7 @@ ucs_status_t ucp_worker_set_am_handler(ucp_worker_h worker, uint16_t id, | |||
* @param [in] datatype Datatype descriptor for the elements in the buffer. | |||
* @param [in] cb Callback that is invoked upon completion of the | |||
* data transfer if it is not completed immediately. | |||
* @param [in] flags For Future use. | |||
* @param [in] flags See enum ucp_send_am_flags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use @ref
src/ucp/api/ucp.h
Outdated
*/ | ||
enum ucp_am_params_field { | ||
/* Flags field */ | ||
UCP_AM_FIELD_FLAGS = UCS_BIT(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UCP_AM_PARAMS_FIELD_FLAGS
src/ucp/api/ucp.h
Outdated
/* Flags field */ | ||
UCP_AM_FIELD_FLAGS = UCS_BIT(0), | ||
/* Max size of the iovec returned by the initial AM */ | ||
UCP_AM_FIELD_IOVEC_SIZE = UCS_BIT(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UCP_AM_PARAMS_FIELD_IOVEC_SIZE
src/ucp/api/ucp.h
Outdated
/* Size of iovec filled in by the callback */ | ||
size_t iovec_length; | ||
/* iovec indicating where the data from the initiator is to be placed */ | ||
ucp_dt_iov_t iovec[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the struct is not extendable because it does not have field mask and also this array is in the end
5d6d2f6
to
bc62c40
Compare
Thanks @yosefe for the detailed review. I have attended to the items; it builds on my laptop. Let's see what CI says, and please re-review. |
@tjcw still, not all documentation and code style comments are addressed. can you pls go over the code again and fix? |
ac1bf8c
to
2648322
Compare
OK, I have gone over the changes again and fixed a number of code style issues. @yosefe how does it look now ? |
We cannot have something called "header" on the receiver side and no "header" on the sender side. This makes the API very counter intuitive and hard to program. Users would have to make an assumption that first entry of IOVEC is the "header" which is not the right assumption to make. You also make an assumption that first element is small enough is not getting fragmented. I think we need to introduce the "header" as a separate argument on both sender and receiver side. The API should be explicit about the fact that header is not getting fragmented. We also have to define an API that will let user to set pre-defined size of the header. @yosefe - what do you think ? |
enum ucp_am_params_recv { | ||
UCP_AM_PARAMS_RECV_FIELD_FLAGS = UCS_BIT(0) /* Flags field */ | ||
}; | ||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not doxygen formatted. Everything should be doxygen formated in order to generate nice document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to an example that I should follow ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This a good reference http://www.doxygen.nl/manual/docblocks.html
For the single liner like above, you want to do /**< xxxx */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, there is bunch of similar declarations ucp.h itself.
/* Function to be driven when data transfer is complete */ | ||
ucp_am_local_function_t local_fn; | ||
/* Argument to be passed to local_fn */ | ||
void *cookie; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cookie not described
src/ucp/api/ucp.h
Outdated
/* Function to be driven when each fragment of data transfer is complete | ||
* In the initial implementation, all data is transferred in one fragment, | ||
* so the data_fn is called once just before the local_fn . | ||
* This implementation is to ease porting of applicaitons currently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are designing generic API that has to target multiple applications. Easy porting is nice to have but not an explicit goal. I would remove the last sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will do.
src/ucp/api/ucp.h
Outdated
ucp_am_data_function_t data_fn; | ||
/* Argument to be passed to data_fn */ | ||
void *data_cookie; | ||
/* Sise of iovec allocated by the ucp library */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sise ->Size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
src/ucp/api/ucp.h
Outdated
* coded to use IBM PAMI. | ||
*/ | ||
ucp_am_data_function_t data_fn; | ||
/* Argument to be passed to data_fn */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another cookie. We need more description for each one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. This is a carry-over from PAMI, which has a data movement callback and a completion callback. One cookie is for each callback. I will try to describe them better.
Re @shamisp s comment above. I haven't wanted to specify more parameters on the sending call for a header because I want to leave the door open for someone else to enhance the implementation for e.g. 3 or more iovec elements or a longer iovec[0] without needing to change the interface again |
Revised in respect of comments from @brminich move content from ucpx.h to ucp.h typo typo in @ref Revise AM client api to specify a flag for rendezvous call Revise interface to AM rendezvous from December 2019 f2f Add comment about replaced interface Remove reference to ucp_am_callback_t am rendezvous fields fix ucp_worker.h in respect of am rendezvous API. Rename removing 'rendezvous'. fix build breaks fix build breaks fix build breaks fix build breaks typo fix signature of am callbacks fix build breaks rearrangement of am registration typos cleanup fix doc typo (cherry picked from commit 22da00444cac7b3a59e79533347e8465d7f4910d) UCP/AM: Attempting to resolve shamisp's comments UCP/AM: Provicde the AM RDMA interface, updated with respect to review comments UCP/AM: Resolving review issues UCP/AM: Fixing typos review items wip (cherry picked from commit 5b99f49) UCP/AM: fix compatibility interface (cherry picked from commit c28fcdc70af23b98d02026e5bd2f77629caf76fb) UCP/AM: Move out compatibility interface UCP/AM: Compatibility fixes UCP/AM: Don't yet have implementation for ucp_worker_set_am_rdma_handler UCP/AM: Fixes after code review UCP/AM: withdraw test of am rdma, there is no implementation yet UCP/AM: fix trailing comma in enum UCP/AM: fixing conformance to coding standards Revisions in respect of review comments
2648322
to
7c006e1
Compare
I have revised the code comments as per the review. I haven't changed the comments about the cookie and the data_cookie because I think the commentary is sufficient as-is. When I merge the implementation of this there will be an example of usage in the 'gtest' ; this is currently in branch tjcw-am-rendezvous-implementation of https://github.com/tjcw/ucx . @shamisp @yosefe please re-review. |
We would have to come up with full implementation that is not limited to any particular iovec size. I suggest to have in-depth discussion on ucx bi-weekly call. |
Arranging to have separate parameters for 'header' and 'payload' complicates things. With the interface I have set up, if the implementation determines that a particular AM is not suitable for sending with RDMA, I can just call the 'old' API. Unsuitable AMs are those where the length of the iovec is not 2, or where the data length for iovec[0] is greater than the amount of space for it in the initial packet. If we change to pass the header separately (and I wanted to handle it), I would have to allocate store for a resultant 'iovec', and intercept the completion callback to free the 'iovec'. Rather than do this, I would probably fail an assert in the initiator. |
The active message API should not tie itself to 'iovec' datatype. The operation data has to be described in the way (buffer, datatype, count) as for other functions. Also. "32 bytes in the initial implementation" does not seem to me as a good way to define a communication api. How would a user write an application on top of such API and have it work with a future UCX version?
|
If someone makes a call on the initiator using the 'new' API (i.e. with the flag for RDMA set) but with parameters such that there is no 'new' implementation, the UCP library code falls back to carrying the data with the 'old' implementation (and on the target all the data will be in the buffer at the time the target AM handler is driven). This handles cases where the data is not described by an iovec, cases where iovec[0].length exceeds the threshold, and cases where there is some number other than 2 elements in the iovec. |
@tjcw support other collective library wants to use a 64-byte header for RDMA active message. would it be possible? |
If an application wanted to use a 64-byte header, it would be necessary to change the source code for UCP and rebuild. It would change the format of the initial packet sent from initiator to target, which contains an array which would change from char[32] to char[64], so an 'old' ucx would not interoperate with a 'new' ucx; this is OK for hpc usage where we know that the ucx versions on the initiator and target are the same. |
The initial packet from initiator to target is structured as
( https://github.com/tjcw/ucx/blob/tjcw-am-rendezvous-implementation/src/ucp/core/ucp_am.h#L61 ) where UCP_AM_RENDEZVOUS_IOVEC_0_MAX_SIZE is currently a constant 32. If this was changed to 64, the revised UCP would still work but old and new would not be interoperable and the new UCP would be (slightly) less efficient in use of network bandwidth. |
|
|
And response from @yosefe
|
re ucp_rkey : If I'm going to have to cope with ucp rkeys of indeterminate size, it doesn't help to know the max size up front, so don't add this API on my account. |
Revised in respect of comments from @brminich
move content from ucpx.h to ucp.h
typo
typo in @ref
Revise AM client api to specify a flag for rendezvous call
Revise interface to AM rendezvous from December 2019 f2f
Add comment about replaced interface
Remove reference to ucp_am_callback_t
am rendezvous fields
fix ucp_worker.h in respect of am rendezvous API. Rename removing 'rendezvous'.
fix build breaks
fix build breaks
fix build breaks
fix build breaks
typo
fix signature of am callbacks
fix build breaks
rearrangement of am registration
typos
cleanup
What
This PR revises the API for ucp active messages, to allow the client to request that an active message should flow over RDMA, and to allow the server to register an active message function for the message to be received into an application-provided buffer rather than a ucp-provided buffer.
This PR does not contain the implementation of the new API; there is only sufficient change to the implementation to allow ucp to build and the 'gtest' to pass.
This PR includes revisions suggested by the face-to-face meeting in December 2019.
Why ?
AM over RDMA reduces the number of times that the application must call the progress function in client and server. This improves the performance for applications which have concurrent work to perform while one or more AMs are in progress; specifically for an MPI application calling non-blocking collective operations when using a collectives library over ucp.
Receiving the message into an application-provided buffer reduces the amount of memory copying in the server application, which improves performance.
These feature is already in IBM PAMI.
How ?
An additional parameter is added to the AM server function, like the pami_recv_t * in IBM PAMI. The server function is driven after the first part of the AM is received, and indicates a buffer for the remainder of the data and a callback to be driven when the whole of the data is received.
The December 2019 f2f requested that this should be done by changing the API for ucp active messages, rather than adding a new API; this means that existing code using ucp active messages has to be revised.
There will be follow-up PRs providing the implementation. A prototype of the implementation is here https://github.com/tjcw/ucx/tree/tjcw-am-rdma-get .