Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/AM: Set up the new API for ucp active messages in ucp.h #4593

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tjcw
Copy link

@tjcw tjcw commented Dec 18, 2019

Revised in respect of comments from @brminich

move content from ucpx.h to ucp.h

typo

typo in @ref

Revise AM client api to specify a flag for rendezvous call

Revise interface to AM rendezvous from December 2019 f2f

Add comment about replaced interface

Remove reference to ucp_am_callback_t

am rendezvous fields

fix ucp_worker.h in respect of am rendezvous API. Rename removing 'rendezvous'.

fix build breaks

fix build breaks

fix build breaks

fix build breaks

typo

fix signature of am callbacks

fix build breaks

rearrangement of am registration

typos

cleanup

What

This PR revises the API for ucp active messages, to allow the client to request that an active message should flow over RDMA, and to allow the server to register an active message function for the message to be received into an application-provided buffer rather than a ucp-provided buffer.
This PR does not contain the implementation of the new API; there is only sufficient change to the implementation to allow ucp to build and the 'gtest' to pass.
This PR includes revisions suggested by the face-to-face meeting in December 2019.

Why ?

AM over RDMA reduces the number of times that the application must call the progress function in client and server. This improves the performance for applications which have concurrent work to perform while one or more AMs are in progress; specifically for an MPI application calling non-blocking collective operations when using a collectives library over ucp.
Receiving the message into an application-provided buffer reduces the amount of memory copying in the server application, which improves performance.
These feature is already in IBM PAMI.

How ?

An additional parameter is added to the AM server function, like the pami_recv_t * in IBM PAMI. The server function is driven after the first part of the AM is received, and indicates a buffer for the remainder of the data and a callback to be driven when the whole of the data is received.
The December 2019 f2f requested that this should be done by changing the API for ucp active messages, rather than adding a new API; this means that existing code using ucp active messages has to be revised.
There will be follow-up PRs providing the implementation. A prototype of the implementation is here https://github.com/tjcw/ucx/tree/tjcw-am-rdma-get .

@mellanox-github
Copy link
Contributor

Can one of the admins verify this patch?

@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

Not ready yet. Closing.

@tjcw tjcw closed this Dec 18, 2019
@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

Actually it was ready. Reopening.

@tjcw tjcw reopened this Dec 18, 2019
@tjcw tjcw changed the title Set up the new API in ucpx.h UCP/AM Set up the new API for ucp active messages in ucp.h Dec 18, 2019
@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

retest

@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

Trying to get a retest

@tjcw tjcw closed this Dec 18, 2019
@tjcw tjcw reopened this Dec 18, 2019
@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

This PR obsoletes #4256

@tjcw tjcw changed the title UCP/AM Set up the new API for ucp active messages in ucp.h UCP/AM: Set up the new API for ucp active messages in ucp.h Dec 18, 2019
@tjcw tjcw closed this Dec 18, 2019
@tjcw tjcw reopened this Dec 18, 2019
@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch 2 times, most recently from 0c5aa3b to 8f43149 Compare December 18, 2019 13:42
@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

@shamisp @sssharka Here is the PR incorporating the API revisions from the face-to-face meeting. I still don't understand how the AM is supposed to succeed if the server function gets 'out-of-memory'; for the moment,I'm assuming the server will send a message to the client indicating the problem, and the client will then redrive the AM using the 'old' protocol; but this will likely then get a failure in the server when a ucs_malloc tries to allocate enough store to buffer the whole message.
My understanding is that the 'old' protocol could flow message fragments over multiple lanes between client and server; code in the server handles the possibility that fragments will arrive out of order, but this means that the message isn't in general properly pipelined.

@tjcw
Copy link
Author

tjcw commented Dec 18, 2019

I have a version of the implementation here https://github.com/tjcw/ucx/tree/tjcw-am-rendezvous-implementation . At commit 4808554 , it compiles and runs gtest --gtest_filter="*_am*" when configured with configure-devel ; if I configure with configure-release then gtest gets undefined symbol ucs_malloc, which I think is a breakage unrelated to my code.
I haven't yet tried it with my development version of IBM Spectrum MPI and an MPI application; that comes next. And I haven't tried splitting the implementation into 500-line chunks for PRs.

@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch from 8f43149 to 051dfca Compare December 19, 2019 11:39
@tjcw
Copy link
Author

tjcw commented Dec 19, 2019

@brminich I think I have addressed all the issues with #4256 that were raised at the face-to-face, and this PR passes CI now. I think it's ready for review and possible merge.

@yosefe
Copy link
Contributor

yosefe commented Dec 20, 2019

@tjcw pls fix according to UCX code style , also must not break API and cannot change existing functions
@shamisp pls take a look as well

@tjcw
Copy link
Author

tjcw commented Dec 21, 2019

@yosefe what code style guideline have I missed ? I can make the API compatible by adding an extra registration function in the server and leaving the existing one as-is; will that be acceptable ?

@yosefe
Copy link
Contributor

yosefe commented Dec 21, 2019

@yosefe what code style guideline have I missed ?

in ucp.h : Indentation, spacing, comment style, newlines .. pls look on other code in ucp.h for examples.

I can make the API compatible by adding an extra registration function in the server and leaving the existing one as-is; will that be acceptable ?

Yes, and move the old one to compat.h

@shamisp
Copy link
Contributor

shamisp commented Jan 25, 2020

Given upcoming branching for v1.8.0, I'm curious what is the plan for this API change.

@shamisp shamisp added the API label Jan 26, 2020
@sssharka
Copy link

I would love to see it go in so we can focus on implementation...

@@ -2225,7 +2198,7 @@ ucs_status_t ucp_worker_set_am_handler(ucp_worker_h worker, uint16_t id,
* @param [in] datatype Datatype descriptor for the elements in the buffer.
* @param [in] cb Callback that is invoked upon completion of the
* data transfer if it is not completed immediately.
* @param [in] flags For Future use.
* @param [in] flags see enum ucp_send_am_flags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use doxygen format for referencing

*/
enum ucp_am_params_field {
/* Flafs field */
UCP_AM_FIELD_FLAGS = UCS_BIT(0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: align by =


/*
* The ucp library will drive a function on the arrival of each fragment
* of data from the client.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use initiator/target terminology. I'm not sure what client means in this context

src/ucp/api/ucp.h Outdated Show resolved Hide resolved
src/ucp/api/ucp.h Outdated Show resolved Hide resolved
@@ -418,12 +418,16 @@ enum ucp_am_cb_flags {
* @brief Flags for sending a UCP Active Message.
*
* Flags dictate the behavior of ucp_am_send_nb
* currently the only flag tells UCP to pass in
* UCP_AM_SEND_REPLY tells UCP to pass in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls document each flag behavior near the flag definition, see other examples of flags doumentation in UCP

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unhandled

*/
enum ucp_send_am_flags {
UCP_AM_SEND_REPLY = UCS_BIT(0)
UCP_AM_SEND_REPLY = UCS_BIT(0) ,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant space before ,

Comment on lines 3286 to 3292
ucs_status_t ucp_worker_set_am_handler(
ucp_worker_h worker,
const ucp_am_params_t *params,
uint16_t id,
ucp_am_callback_t cb,
void *arg
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • this breaks backward compatibility
  • lines are not wrapped according to code style, pls align the style with other functions in UCP

Comment on lines 3255 to 3263
typedef ucs_status_t (*ucp_am_callback_t)(
void *arg,
void *data,
size_t length,
ucp_ep_h reply_ep,
unsigned flags,
size_t remaining_length,
ucp_am_recv_t *recv
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • indentation/line wrapping, see other function for an example
  • breaks API

@yosefe
Copy link
Contributor

yosefe commented Feb 8, 2020

@tjcw the code style is still not aligned with UCX code style and other functions in ucp.h (it resembles PAMI H file though..). There are too many places to leave a comment for each of them.

The bigger problem however is that it still breaks API/ABI by changing existing functions.
So overall my comment #4593 (comment) is still not addressed..

@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch from f9f0e7e to 5d6d2f6 Compare February 9, 2020 10:16
* present. It is used to enable backward compatibility support.
*/
enum ucp_am_params_field {
/* Flags field */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

* UCP active message function
*/
typedef struct ucp_am_params {
uint64_t field_mask;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Comment on lines 3153 to 3157
typedef void (*ucp_am_data_function_t)( void *target,
void *source,
size_t bytes,
void *cookie
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrapping, spaces

Comment on lines 3171 to 3175
typedef ucs_status_t (*ucp_am_local_function_t)( void *arg,
void *cookie,
ucp_dt_iov_t *iovec,
size_t iovec_length
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrapping, spaces

Comment on lines 3252 to 3260
typedef ucs_status_t (*ucp_am_rndv_callback_t)(
void *arg,
void *data,
size_t length,
ucp_ep_h reply_ep,
unsigned flags,
size_t remaining_length,
ucp_am_recv_t *recv
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrapping, indentation

@@ -418,12 +418,16 @@ enum ucp_am_cb_flags {
* @brief Flags for sending a UCP Active Message.
*
* Flags dictate the behavior of ucp_am_send_nb
* currently the only flag tells UCP to pass in
* UCP_AM_SEND_REPLY tells UCP to pass in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unhandled

@@ -2225,7 +2198,7 @@ ucs_status_t ucp_worker_set_am_handler(ucp_worker_h worker, uint16_t id,
* @param [in] datatype Datatype descriptor for the elements in the buffer.
* @param [in] cb Callback that is invoked upon completion of the
* data transfer if it is not completed immediately.
* @param [in] flags For Future use.
* @param [in] flags See enum ucp_send_am_flags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use @ref

*/
enum ucp_am_params_field {
/* Flags field */
UCP_AM_FIELD_FLAGS = UCS_BIT(0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UCP_AM_PARAMS_FIELD_FLAGS

/* Flags field */
UCP_AM_FIELD_FLAGS = UCS_BIT(0),
/* Max size of the iovec returned by the initial AM */
UCP_AM_FIELD_IOVEC_SIZE = UCS_BIT(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UCP_AM_PARAMS_FIELD_IOVEC_SIZE

/* Size of iovec filled in by the callback */
size_t iovec_length;
/* iovec indicating where the data from the initiator is to be placed */
ucp_dt_iov_t iovec[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the struct is not extendable because it does not have field mask and also this array is in the end

@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch from 5d6d2f6 to bc62c40 Compare February 11, 2020 10:43
@tjcw
Copy link
Author

tjcw commented Feb 11, 2020

Thanks @yosefe for the detailed review. I have attended to the items; it builds on my laptop. Let's see what CI says, and please re-review.

@yosefe
Copy link
Contributor

yosefe commented Feb 11, 2020

@tjcw still, not all documentation and code style comments are addressed. can you pls go over the code again and fix?

@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch from ac1bf8c to 2648322 Compare February 12, 2020 07:47
@tjcw
Copy link
Author

tjcw commented Feb 12, 2020

OK, I have gone over the changes again and fixed a number of code style issues. @yosefe how does it look now ?

@shamisp
Copy link
Contributor

shamisp commented Feb 14, 2020

So I call the first element of the iovec the 'header'.

We cannot have something called "header" on the receiver side and no "header" on the sender side. This makes the API very counter intuitive and hard to program. Users would have to make an assumption that first entry of IOVEC is the "header" which is not the right assumption to make. You also make an assumption that first element is small enough is not getting fragmented.

I think we need to introduce the "header" as a separate argument on both sender and receiver side. The API should be explicit about the fact that header is not getting fragmented. We also have to define an API that will let user to set pre-defined size of the header.

@yosefe - what do you think ?

enum ucp_am_params_recv {
UCP_AM_PARAMS_RECV_FIELD_FLAGS = UCS_BIT(0) /* Flags field */
};
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not doxygen formatted. Everything should be doxygen formated in order to generate nice document

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you point me to an example that I should follow ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a good reference http://www.doxygen.nl/manual/docblocks.html
For the single liner like above, you want to do /**< xxxx */

Copy link
Contributor

@shamisp shamisp Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there is bunch of similar declarations ucp.h itself.

/* Function to be driven when data transfer is complete */
ucp_am_local_function_t local_fn;
/* Argument to be passed to local_fn */
void *cookie;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cookie not described

/* Function to be driven when each fragment of data transfer is complete
* In the initial implementation, all data is transferred in one fragment,
* so the data_fn is called once just before the local_fn .
* This implementation is to ease porting of applicaitons currently
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are designing generic API that has to target multiple applications. Easy porting is nice to have but not an explicit goal. I would remove the last sentence.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will do.

ucp_am_data_function_t data_fn;
/* Argument to be passed to data_fn */
void *data_cookie;
/* Sise of iovec allocated by the ucp library */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sise ->Size

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

* coded to use IBM PAMI.
*/
ucp_am_data_function_t data_fn;
/* Argument to be passed to data_fn */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another cookie. We need more description for each one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. This is a carry-over from PAMI, which has a data movement callback and a completion callback. One cookie is for each callback. I will try to describe them better.

@tjcw
Copy link
Author

tjcw commented Feb 14, 2020

Re @shamisp s comment above. I haven't wanted to specify more parameters on the sending call for a header because I want to leave the door open for someone else to enhance the implementation for e.g. 3 or more iovec elements or a longer iovec[0] without needing to change the interface again

Revised in respect of comments from @brminich

move content from ucpx.h to ucp.h

typo

typo in @ref

Revise AM client api to specify a flag for rendezvous call

Revise interface to AM rendezvous from December 2019 f2f

Add comment about replaced interface

Remove reference to ucp_am_callback_t

am rendezvous fields

fix ucp_worker.h in respect of am rendezvous API. Rename removing 'rendezvous'.

fix build breaks

fix build breaks

fix build breaks

fix build breaks

typo

fix signature of am callbacks

fix build breaks

rearrangement of am registration

typos

cleanup

fix doc typo

(cherry picked from commit 22da00444cac7b3a59e79533347e8465d7f4910d)

UCP/AM: Attempting to resolve shamisp's comments

UCP/AM: Provicde the AM RDMA interface, updated with respect to review
comments

UCP/AM: Resolving review issues

UCP/AM: Fixing typos

review items wip

(cherry picked from commit 5b99f49)

UCP/AM: fix compatibility interface

(cherry picked from commit c28fcdc70af23b98d02026e5bd2f77629caf76fb)

UCP/AM: Move out compatibility interface

UCP/AM: Compatibility fixes

UCP/AM: Don't yet have implementation for ucp_worker_set_am_rdma_handler

UCP/AM: Fixes after code review

UCP/AM: withdraw test of am rdma, there is no implementation yet

UCP/AM: fix trailing comma in enum

UCP/AM: fixing conformance to coding standards

Revisions in respect of review comments
@tjcw tjcw force-pushed the tjcw-am-rendezvous-api-with-flag branch from 2648322 to 7c006e1 Compare February 15, 2020 10:26
@tjcw
Copy link
Author

tjcw commented Feb 15, 2020

I have revised the code comments as per the review. I haven't changed the comments about the cookie and the data_cookie because I think the commentary is sufficient as-is. When I merge the implementation of this there will be an example of usage in the 'gtest' ; this is currently in branch tjcw-am-rendezvous-implementation of https://github.com/tjcw/ucx . @shamisp @yosefe please re-review.

@shamisp
Copy link
Contributor

shamisp commented Feb 15, 2020

Re @shamisp s comment above. I haven't wanted to specify more parameters on the sending call for a header because I want to leave the door open for someone else to enhance the implementation for e.g. 3 or more iovec elements or a longer iovec[0] without needing to change the interface again

We would have to come up with full implementation that is not limited to any particular iovec size.
I think the clean way to do it, is to introduce a header which explicitly guarantee no fragmentation on the implementation level. Payload itself can be represented as iovec.

I suggest to have in-depth discussion on ucx bi-weekly call.

@tjcw
Copy link
Author

tjcw commented Feb 15, 2020

Arranging to have separate parameters for 'header' and 'payload' complicates things. With the interface I have set up, if the implementation determines that a particular AM is not suitable for sending with RDMA, I can just call the 'old' API. Unsuitable AMs are those where the length of the iovec is not 2, or where the data length for iovec[0] is greater than the amount of space for it in the initial packet. If we change to pass the header separately (and I wanted to handle it), I would have to allocate store for a resultant 'iovec', and intercept the completion callback to free the 'iovec'. Rather than do this, I would probably fail an assert in the initiator.
When is the next bi-weekly call ? I think I will be unavailable on 19 February, but I should be available on Wednesdays after that.

@yosefe
Copy link
Contributor

yosefe commented Feb 16, 2020

The active message API should not tie itself to 'iovec' datatype. The operation data has to be described in the way (buffer, datatype, count) as for other functions.

Also. "32 bytes in the initial implementation" does not seem to me as a good way to define a communication api. How would a user write an application on top of such API and have it work with a future UCX version?

 * If the UCP_AM_SEND_RENDEZVOUS flag is set, and the AM is for an iovec with
 * exactly 2 elements, and the first element length does not exceed a
 * threshold (32 bytes in the initial implementation), then the AM gets
 * carried partially by RDMA. The first element of the iovec

@tjcw
Copy link
Author

tjcw commented Feb 16, 2020

If someone makes a call on the initiator using the 'new' API (i.e. with the flag for RDMA set) but with parameters such that there is no 'new' implementation, the UCP library code falls back to carrying the data with the 'old' implementation (and on the target all the data will be in the buffer at the time the target AM handler is driven). This handles cases where the data is not described by an iovec, cases where iovec[0].length exceeds the threshold, and cases where there is some number other than 2 elements in the iovec.
If the initiator is using the flag for RDMA, the handler on the target needs to handle both the case where only iovec[0] is carried before the handler is driven, and the case where all the data is carried before the handler is driven. This should enable the application to continue to work with future UCX versions.
I'm interested in the high-performance-computing case (e.g. IBM Spectrum MPI, or other MPIs with collective libraries which can benefit from UCX); in this case it will be the same version of UCX on the initiator and the target. So I'm not too fussed about what happens in the general client/server case where you might have different versions of UCX on initiator and target; for this case I would suggest that the call at the initiator should not set the RDMA flag.

@yosefe
Copy link
Contributor

yosefe commented Feb 16, 2020

@tjcw support other collective library wants to use a 64-byte header for RDMA active message. would it be possible?
also, what if the initiator datatype is "generic", would it still be able to use custom function to process streaming data?

@tjcw
Copy link
Author

tjcw commented Feb 19, 2020

If an application wanted to use a 64-byte header, it would be necessary to change the source code for UCP and rebuild. It would change the format of the initial packet sent from initiator to target, which contains an array which would change from char[32] to char[64], so an 'old' ucx would not interoperate with a 'new' ucx; this is OK for hpc usage where we know that the ucx versions on the initiator and target are the same.
I haven't used 'generic' initiator datatype; what does this specify ? The rendezvous API is intended to transfer the data from initiator to target by issuing a single RDMA 'get' on the target. Would this work for 'generic' datatype ?

@tjcw
Copy link
Author

tjcw commented Feb 19, 2020

The initial packet from initiator to target is structured as

typedef struct {
  size_t            total_size; /* length of buffer needed for all data */
  uint64_t          msg_id;     /* method to match parts of the same AM */
  uintptr_t         ep_ptr;         /* end point ptr, used for maintaing list
                                   of arrivals */
  char              rkey_buffer[UCP_PACKED_RKEY_MAX_SIZE] ; /* Packed
                                                             * remote key
                                                             */
  uintptr_t         address;     /* Address for RENDEZVOUS */
  size_t            iovec_0_length ; /* Amount of data transferred
                                      * by iovec[0]
                                      */
  /* iovec[0] carried in request */
  char              iovec_0[UCP_AM_RENDEZVOUS_IOVEC_0_MAX_SIZE] ;
  uint16_t          am_id;      /* index into callback array */
#if defined(UCP_AM_RENDEZVOUS_VERIFY)
  char              iovec_1_first_byte; /* First byte of iovec[1],
                                         * for checking
                                         */
  char              iovec_1_last_byte; /* Last byte of iovec[1],
                                        * for checking
                                        */
#endif
} UCS_S_PACKED ucp_am_rendezvous_header_t ;

( https://github.com/tjcw/ucx/blob/tjcw-am-rendezvous-implementation/src/ucp/core/ucp_am.h#L61 ) where UCP_AM_RENDEZVOUS_IOVEC_0_MAX_SIZE is currently a constant 32. If this was changed to 64, the revised UCP would still work but old and new would not be interoperable and the new UCP would be (slightly) less efficient in use of network bandwidth.

@yosefe
Copy link
Contributor

yosefe commented Feb 19, 2020

@tjcw

  1. The UCP API should work for wide range of applications and we should not limit ourselves to a particular message or header size. Why not let application define header size for every message?
  2. UCX should work on wide range of hardware and we cannot limit ourselves to only hardware which supports RDMA. Same API semantics have to work also on TCP, for example.

@tjcw
Copy link
Author

tjcw commented Feb 19, 2020

  1. Fair enough. I can make this change in the implementation, it does
    not require a further change to the API. I'm also a little suspicious
    about having a fixed maximum length for the packed rkey which I carry in
    the initial packet. Does this need to be variable too, or do packed
    rkeys have a known maximum size ?
  2. Does 'ucp_get_nb' work over a transport such as TCP which doesn't
    support RDMA ? If so, my code will work too. If not, I need a function
    which will tell me whether a handle is for an RDMA-capable transport, so
    I can ask (in the initiator) and fall back to the existing AM code path
    if I am running on a non-RDMA transport. Again this will be a matter for
    my implementation, not a change to the API.

@tjcw
Copy link
Author

tjcw commented Feb 19, 2020

And response from @yosefe
Today ucp_rkey does not expose maximal size but we can add API to ucp_context_query() to calculate and return it, based on all memory domains.

Yes, ucp_get_nb() works over TCP, it’s using uct_am_bcopy

@tjcw
Copy link
Author

tjcw commented Feb 19, 2020

re ucp_rkey : If I'm going to have to cope with ucp rkeys of indeterminate size, it doesn't help to know the max size up front, so don't add this API on my account.
re ucp_get_nb() : That's good news. I think we are already calling the API 'rendezvous' rather than 'rdma'.

@yosefe yosefe added the WIP-DNM Work in progress / Do not review label Mar 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Waiting for Author Response WIP-DNM Work in progress / Do not review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants