-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/RNDV: Refactor rendezvous protocol to use rma_bw lanes. #2092
Conversation
Build finished. |
Test FAILed. |
@@ -34,6 +34,26 @@ size_t ucp_rkey_packed_size(ucp_context_h context, ucp_md_map_t md_map) | |||
return size; | |||
} | |||
|
|||
void ucp_rkey_packed_copy(ucp_context_h context, ucp_md_map_t md_map, | |||
void *rkey_buffer, const void* uct_rkeys[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const void* uct_rkeys[] is interpreted as void**, as I can see incoming parameter is just non-initialized array, not array of pointers (and coverity agree with me :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be a const array of pointers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I see
the simplest way to suppress error (without hack with suppress coverity by comment) is create separate array of pointers, like
const void *array[2] = {key_buf, NULL};
and pass it to this function (technically compiler makes same things, should not affect performane)
src/ucp/core/ucp_rkey.c
Outdated
|
||
rkey_buffer += md_size; | ||
|
||
snprintf(p, endp - p, md_map ? "," : ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
md_map is always NOT NULL here (coverity report)
src/ucp/core/ucp_rkey.c
Outdated
ucs_for_each_bit(md_index, md_map) { | ||
md_size = *((uint8_t*)rkey_buffer); | ||
rkey_buffer += sizeof(uint8_t); | ||
|
||
if (!first) { | ||
snprintf(p, endp - p, ","); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not p+= snprintf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://linux.die.net/man/3/snprintf
The functions snprintf() and vsnprintf() do not write more than size bytes (including the terminating null byte ('\0')). If the output was truncated due to this limit then the return value is the number of characters (excluding the terminating null byte) which would have been written to the final string if enough space had been available. Thus, a return value of size or more means that the output was truncated. (See also below under NOTES.)
Build finished. |
Test PASSed. |
Test PASSed. |
src/ucp/core/ucp_request.h
Outdated
uct_rkey_bundle_t rkey_bundle; | ||
ucp_request_t *sreq; /* send request on the send side */ | ||
uint64_t remote_address; /* address of the receiver's data buffer */ | ||
uintptr_t remote_request; /* pointer to the receiver's send request */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is receiver's recv
request
src/ucp/tag/offload.c
Outdated
ucp_lane_index_t lane = ucp_ep_get_tag_lane(ep); | ||
ucp_ep_t *ep = sreq->send.ep; | ||
|
||
ucs_assert(sreq->send.lane == ucp_ep_get_tag_lane(ep)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do you set a send.lane
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add comment
@@ -28,7 +23,6 @@ typedef struct { | |||
ucp_request_hdr_t sreq; /* send request on the rndv initiator side */ | |||
uint64_t address; /* holds the address of the data buffer on the sender's side */ | |||
size_t size; /* size of the data for sending */ | |||
uint16_t flags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we check on receiver whether we have rkeys or not now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
according to md_map in the packed rkey
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we send no rkey in case of SW rndv, non-contig RNDV, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, we can tell it by address==NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, for SW rndv, we do send rkeys (if the send buffer is contig but large, for example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aa, right, we check the address. Yes, SW rndv may carry the rkey for large messages, but in other cases it won't.
src/ucp/core/ucp_rkey.c
Outdated
ucs_assert_always(md_size <= UINT8_MAX); | ||
*((uint8_t*)p++) = md_size; | ||
memcpy(p, *uct_rkeys, md_size); | ||
p += sizeof(uint8_t) + md_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like need to do p += md_size
, because you already made p++
before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, it worked because offload currently has only 1 key
Test FAILed. |
Build finished. |
Test PASSed. |
Build finished. |
Test FAILed. |
src/ucp/tag/rndv.c
Outdated
} | ||
} | ||
|
||
sreq->send.lane = ucp_ep_get_am_lane(ep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like no need to set lane here, can do assert that it is AM lane (for consistency with offload branch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but here we set to am_lane, and in offload.c we expect tag_lane
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that request lane is set in ucp_tag_send_req_init()
. No need to set it here again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but ucp_tag_send_req_init() always sets it to tag_lane, and in case of rndv we may need to switch to am_lane.. do you think they would have be the same in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!ep->flags & UCP_EP_FLAG_TAG_OFFLOAD_ENABLED), tag_lane should be am_lane
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 831 in bc72a8f
config->tag.lane = lane; |
bot:retest |
Build finished. |
Test FAILed. |
Test FAILed. |
Build finished. |
bot:bgate:retest |
Test FAILed. |
Test PASSed. |
Test FAILed. |
Test PASSed. |
No description provided.