Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/WIREUP: Increase UCP_MAX_LANES to 64 #9814

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

ivankochin
Copy link
Contributor

What

Increase number of maximal lanes per EP to 64.

Why ?

To allow collecting information about all lanes on systems with many transports/devices.

@ivankochin ivankochin self-assigned this Apr 11, 2024
@edgargabriel
Copy link
Contributor

Would it make sense to have this parameter as a configure option?

@yosefe
Copy link
Contributor

yosefe commented Apr 15, 2024

Would it make sense to have this parameter as a configure option?

I think not, we should be able to find a solution to set upper limit to 64 w/o extra overheads when the actual number of lanes is small

src/ucs/sys/compiler_def.h Outdated Show resolved Hide resolved
@ivankochin ivankochin marked this pull request as draft April 15, 2024 10:14
Comment on lines 103 to 109

ucs_assertv(rpriv->spriv.super.lane != UCP_NULL_LANE, "lane=%d",
rpriv->spriv.super.lane);
attr->lane_map = UCS_BIT(rpriv->spriv.super.lane);
if (rpriv->ack.lane != UCP_NULL_LANE) {
attr->lane_map |= UCS_BIT(rpriv->ack.lane);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is required since ack.lane can be equal to 255 (UCP_NULL_LANE) so after increasing lane_map size to 64 it started to set 63rd bit in lane_map.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did it set 63d bit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think (1ul << 256) is anyway undefined behavior as 256 >= 64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did it set 63d bit?

To be honest, I don't know. I just detected this behavior and when I applied this change it gone. Do you want me to check it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was experimenting on similar case and got unpredictable result, i guess this should be ok

@ivankochin ivankochin marked this pull request as ready for review April 19, 2024 06:32
src/ucp/wireup/select.c Outdated Show resolved Hide resolved
src/ucp/core/ucp_ep.inl Show resolved Hide resolved
src/ucs/sys/compiler_def.h Show resolved Hide resolved
src/ucp/core/ucp_ep.inl Show resolved Hide resolved
src/ucp/proto/proto.h Outdated Show resolved Hide resolved
src/ucp/rndv/rndv_rkey_ptr.c Outdated Show resolved Hide resolved
@@ -57,7 +57,7 @@ UCS_TEST_F(test_obj_size, size) {
EXPECTED_SIZE(ucp_rkey_t, 32);
#endif
/* TODO reduce request size to 240 or less after removing old protocols state */
EXPECTED_SIZE(ucp_request_t, 264);
EXPECTED_SIZE(ucp_request_t, 280);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can consider increasing number of lanes to 32 instead of 64. Or allowing creation of 64 lanes only for wireup and restrict the max lanes number by 32 or 16 for RNDV protocols. What sounds better for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both sounds ok. Maybe it is better to restrict protocol;s to use up to 16 lanes? Then no need to keep large maps in the request

Comment on lines +1463 to +1464
ASSERT_LE(16, (int)ucp_ep_num_lanes(sender().ep()));
ASSERT_LE(16, (int)ucp_ep_num_lanes(receiver().ep()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have some platform which can create more than 16 lanes in that particular case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd suggest to change this test or wtite a new one which tests creation of 64 lanes

@@ -1247,19 +1247,19 @@ UCS_TEST_P(multi_rail_max, max_lanes, "IB_NUM_PATHS?=16", "TM_SW_RNDV=y",

ucp_lane_index_t num_lanes = ucp_ep_num_lanes(sender().ep());
ASSERT_EQ(ucp_ep_num_lanes(receiver().ep()), num_lanes);
ASSERT_EQ(num_lanes, max_lanes);
ASSERT_GE(num_lanes, max_lanes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have some platform which can create more than 16 lanes in that particular case.

@@ -239,7 +239,7 @@ static inline ucp_rsc_index_t
ucp_ep_config_get_dst_md_cmpt(const ucp_ep_config_key_t *key,
ucp_md_index_t dst_md_index)
{
unsigned idx = ucs_popcount(key->reachable_md_map & UCS_MASK(dst_md_index));
unsigned idx = ucs_bitmap2idx(key->reachable_md_map, dst_md_index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a little :)

I just discovered that there is a helper function for that logic, so decided to included it to the patch.

Comment on lines 103 to 109

ucs_assertv(rpriv->spriv.super.lane != UCP_NULL_LANE, "lane=%d",
rpriv->spriv.super.lane);
attr->lane_map = UCS_BIT(rpriv->spriv.super.lane);
if (rpriv->ack.lane != UCP_NULL_LANE) {
attr->lane_map |= UCS_BIT(rpriv->ack.lane);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did it set 63d bit?

@@ -129,6 +130,7 @@ class mem_buffer {
size_t length, size_t offset,
const void *buffer, const void *orig_ptr)
{
ucs_assertv(length <= 8, "length=%zu", length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since new implementation of UCS_MASK will still do undefined behavior if input value > 64. In that particular case length is size of one element that should be compared.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants