New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/WIREUP: Increase UCP_MAX_LANES to 64 #9814
base: master
Are you sure you want to change the base?
UCP/WIREUP: Increase UCP_MAX_LANES to 64 #9814
Conversation
Would it make sense to have this parameter as a configure option? |
I think not, we should be able to find a solution to set upper limit to 64 w/o extra overheads when the actual number of lanes is small |
src/ucp/rndv/rndv_rkey_ptr.c
Outdated
|
||
ucs_assertv(rpriv->spriv.super.lane != UCP_NULL_LANE, "lane=%d", | ||
rpriv->spriv.super.lane); | ||
attr->lane_map = UCS_BIT(rpriv->spriv.super.lane); | ||
if (rpriv->ack.lane != UCP_NULL_LANE) { | ||
attr->lane_map |= UCS_BIT(rpriv->ack.lane); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is required since ack.lane
can be equal to 255 (UCP_NULL_LANE
) so after increasing lane_map
size to 64 it started to set 63rd bit in lane_map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did it set 63d bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think (1ul << 256)
is anyway undefined behavior as 256 >= 64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did it set 63d bit?
To be honest, I don't know. I just detected this behavior and when I applied this change it gone. Do you want me to check it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was experimenting on similar case and got unpredictable result, i guess this should be ok
@@ -57,7 +57,7 @@ UCS_TEST_F(test_obj_size, size) { | |||
EXPECTED_SIZE(ucp_rkey_t, 32); | |||
#endif | |||
/* TODO reduce request size to 240 or less after removing old protocols state */ | |||
EXPECTED_SIZE(ucp_request_t, 264); | |||
EXPECTED_SIZE(ucp_request_t, 280); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we avoid that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can consider increasing number of lanes to 32 instead of 64. Or allowing creation of 64 lanes only for wireup and restrict the max lanes number by 32 or 16 for RNDV protocols. What sounds better for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both sounds ok. Maybe it is better to restrict protocol;s to use up to 16 lanes? Then no need to keep large maps in the request
ASSERT_LE(16, (int)ucp_ep_num_lanes(sender().ep())); | ||
ASSERT_LE(16, (int)ucp_ep_num_lanes(receiver().ep())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we have some platform which can create more than 16 lanes in that particular case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd suggest to change this test or wtite a new one which tests creation of 64 lanes
@@ -1247,19 +1247,19 @@ UCS_TEST_P(multi_rail_max, max_lanes, "IB_NUM_PATHS?=16", "TM_SW_RNDV=y", | |||
|
|||
ucp_lane_index_t num_lanes = ucp_ep_num_lanes(sender().ep()); | |||
ASSERT_EQ(ucp_ep_num_lanes(receiver().ep()), num_lanes); | |||
ASSERT_EQ(num_lanes, max_lanes); | |||
ASSERT_GE(num_lanes, max_lanes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we have some platform which can create more than 16 lanes in that particular case.
@@ -239,7 +239,7 @@ static inline ucp_rsc_index_t | |||
ucp_ep_config_get_dst_md_cmpt(const ucp_ep_config_key_t *key, | |||
ucp_md_index_t dst_md_index) | |||
{ | |||
unsigned idx = ucs_popcount(key->reachable_md_map & UCS_MASK(dst_md_index)); | |||
unsigned idx = ucs_bitmap2idx(key->reachable_md_map, dst_md_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a little :)
I just discovered that there is a helper function for that logic, so decided to included it to the patch.
src/ucp/rndv/rndv_rkey_ptr.c
Outdated
|
||
ucs_assertv(rpriv->spriv.super.lane != UCP_NULL_LANE, "lane=%d", | ||
rpriv->spriv.super.lane); | ||
attr->lane_map = UCS_BIT(rpriv->spriv.super.lane); | ||
if (rpriv->ack.lane != UCP_NULL_LANE) { | ||
attr->lane_map |= UCS_BIT(rpriv->ack.lane); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did it set 63d bit?
…-max-lanes-to-64 Conflicts: src/ucp/proto/proto_select.c
…-max-lanes-to-64 Conflicts: src/ucp/proto/proto.h
@@ -129,6 +130,7 @@ class mem_buffer { | |||
size_t length, size_t offset, | |||
const void *buffer, const void *orig_ptr) | |||
{ | |||
ucs_assertv(length <= 8, "length=%zu", length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since new implementation of UCS_MASK will still do undefined behavior if input value > 64. In that particular case length is size of one element that should be compared.
What
Increase number of maximal lanes per EP to 64.
Why ?
To allow collecting information about all lanes on systems with many transports/devices.