Topic/topo subgrouping #266

Merged
merged 2 commits into from
Sep 6, 2021

vspetrov
Collaborator

What

Adds basic subgrouping functionality to UCC. Ranks participating in a team can now be partitioned into groups of ranks that (1) share a node, (2) share a socket, (3) are node leaders, etc.

Why?

These groupings are used to build hierarchical collective schedules.

How?

  1. Each rank encodes its local process information (currently host_hash, socket_id, pid) into the ctx address header.
  2. This info is exchanged at ucc_context_create (via OOB) as part of the address_exchange flow.
  3. The CTX-level topo data structure is initialized: it stores the global view of the process data (global with respect to the context).
  4. When a team is created, ucc_team_topo_t is initialized from the global ucc_topo_t, which gives the team-level subgrouping information.
  5. Subgroups (SBGPs) are created "locally" (w/o communication) and on demand.
  6. The whole topo flow is enabled only if some CL/TL reports that it needs it.
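
To illustrate step 5, here is a minimal sketch of how subgroups can be derived locally once every rank holds the exchanged process info. It is not the code from this PR: `proc_info_t`, `sbgp_type_t`, `is_node_leader` and `sbgp_build` are hypothetical names, and choosing the lowest rank on each host as node leader is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-rank record. Field names follow the PR description;
 * the layout is an assumption, not UCC's actual struct. */
typedef struct {
    uint64_t host_hash;
    int      socket_id;
    int      pid;
} proc_info_t;

typedef enum { SBGP_NODE, SBGP_SOCKET, SBGP_NODE_LEADERS } sbgp_type_t;

/* Assumption: the node leader is the lowest rank on each host. */
static int is_node_leader(const proc_info_t *procs, int r)
{
    for (int i = 0; i < r; i++) {
        if (procs[i].host_hash == procs[r].host_hash) {
            return 0;
        }
    }
    return 1;
}

/* Fills `out` (capacity n) with the ranks of the requested subgroup as seen
 * from `my_rank` and returns the subgroup size. No communication is needed:
 * every rank evaluates the same predicate over the same global array. */
static int sbgp_build(const proc_info_t *procs, int n, int my_rank,
                      sbgp_type_t type, int *out)
{
    int count = 0;

    assert(my_rank >= 0 && my_rank < n);
    for (int r = 0; r < n; r++) {
        int same_host = procs[r].host_hash == procs[my_rank].host_hash;
        int member    = 0;

        switch (type) {
        case SBGP_NODE:
            member = same_host;
            break;
        case SBGP_SOCKET:
            member = same_host &&
                     procs[r].socket_id == procs[my_rank].socket_id;
            break;
        case SBGP_NODE_LEADERS:
            member = is_node_leader(procs, r);
            break;
        }
        if (member) {
            out[count++] = r;
        }
    }
    return count;
}
```

With four ranks spread over two hosts, `sbgp_build(procs, 4, 0, SBGP_NODE, out)` returns the ranks that share rank 0's host. The leader scan in this sketch is quadratic in the number of ranks; a real implementation would likely do it in a single pass.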

@vspetrov vspetrov requested review from Sergei-Lebedev, bureddy and manjugv and removed request for Sergei-Lebedev July 21, 2021 15:59
@manjugv
Contributor

manjugv commented Jul 21, 2021

UCC WG July 21st:

  1. Add performance numbers to understand the overhead of subgroup creation.
  2. Create issue for sharing topology info with other processes on the node.
  3. Create issue for API change to input hwloc topology information.

@vspetrov
Collaborator Author

32 nodes, ppn 32 (max 1024 ranks)
ucc_context_create breakdown in usec: the total is ~0.2 sec. "Bound socket id detection" takes 0.013 sec in the worst case (it looks noisy, probably due to reading from the FS). The ctx_topo data structure init takes negligible time (it scales linearly, but the absolute numbers are far lower). Note that sock id detection can be optimized if we pass that info from the runtime via additional lib params.

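| nnodes | nranks | total (usec) | sock id detection (usec) | ctx topo init (usec) |
|--------|--------|--------------|--------------------------|----------------------|
| 2      | 64     | 198757       | 13375.094                | 6.633                |
| 4      | 128    | 204162       | 13349.99                 | 10.773               |
| 8      | 256    | 208647       | 10379.154                | 20.85                |
| 16     | 512    | 214565       | 9993.638                 | 48.438               |
| 32     | 1024   | 220230       | 2311.789                 | 119.392              |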

This is the ucc_team creation breakdown in usec. There is still one allgather that takes most of the time (I also discuss how we should eliminate it in #274). Again, the cost of team_topo_init and of the actual subgroup creation (purely local math operations w/o communication) is negligible.

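| nnodes | nranks | total (usec) | team_topo_init (usec) | sbgp_node (usec) | sbgp_socket (usec) | sbgp_node_leaders (usec) |
|--------|--------|--------------|-----------------------|------------------|--------------------|--------------------------|
| 2      | 64     | 8611.22      | 0.157                 | 0.309            | 0.27               | 0.423                    |
| 4      | 128    | 8977.55      | 0.091                 | 0.555            | 0.269              | 0.561                    |
| 8      | 256    | 10650.6      | 0.131                 | 0.834            | 0.299              | 1.042                    |
| 16     | 512    | 12868.4      | 0.147                 | 1.653            | 0.301              | 2.317                    |
| 32     | 1024   | 15373.4      | 0.176                 | 4.027            | 0.435              | 3.53                     |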

Final note: keep in mind that "topo" structs and sbgps are allocated only when a TL/CL requires them. For this evaluation I specifically forced those flows (although cl/basic + tl/ucp does not ask for topo).
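
The on-demand allocation mentioned above can be pictured with a small lazy-initialization sketch. This is an illustration only, not the PR's implementation: `team_topo_t`, its fields and `team_topo_get_node_sbgp` are hypothetical names, and the membership computation is replaced by a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical team-level topo handle; names and fields are assumptions. */
typedef struct {
    int  needs_topo; /* set when some CL/TL asked for topo information */
    int *node_ranks; /* NULL until the node subgroup is first requested */
    int  n_builds;   /* counts how many times the subgroup was built */
} team_topo_t;

/* Returns the node subgroup, building it on first use. Returns NULL when no
 * component requested topo, so nothing is allocated in that case. */
static int *team_topo_get_node_sbgp(team_topo_t *t, int n)
{
    assert(n > 0);
    if (!t->needs_topo) {
        return NULL;
    }
    if (t->node_ranks == NULL) {
        t->node_ranks = malloc((size_t)n * sizeof(*t->node_ranks));
        if (t->node_ranks == NULL) {
            return NULL;
        }
        /* Placeholder for the real, purely local membership computation. */
        for (int i = 0; i < n; i++) {
            t->node_ranks[i] = i;
        }
        t->n_builds++;
    }
    return t->node_ranks;
}
```

Calling the getter twice returns the same array and builds it once; with `needs_topo` unset it returns NULL and allocates nothing.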

@vspetrov
Collaborator Author

@manjugv @bureddy @Sergei-Lebedev I added perf numbers and fixed minor linter issues. Please review.

@vspetrov vspetrov force-pushed the topic/topo_subgrouping branch 3 times, most recently from 1a19e30 to 182276d Compare August 3, 2021 15:22
@vspetrov vspetrov force-pushed the topic/topo_subgrouping branch 2 times, most recently from 2871eea to 9bd48e5 Compare August 27, 2021 07:16
int _tmp = (_x); \
(_x) = (_y); \
(_y) = _tmp; \
} while (0)
Collaborator

move to math.h?

Collaborator Author

done
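
For context, the reviewed fragment is the tail of a swap macro. A complete macro of that shape might look like the sketch below; the name `SWAP_INT` and the helper `sort2` are made up for illustration, since the actual macro name is not visible in the snippet.

```c
#include <assert.h>

/* Hypothetical name. The do { ... } while (0) wrapper makes the macro
 * expand to a single statement, so it is safe inside an unbraced if/else. */
#define SWAP_INT(_x, _y)  \
    do {                  \
        int _tmp = (_x);  \
        (_x) = (_y);      \
        (_y) = _tmp;      \
    } while (0)

/* Orders two ints ascending, using the macro in an unbraced branch. */
static void sort2(int *a, int *b)
{
    if (*a > *b)
        SWAP_INT(*a, *b);
    else
        assert(*a <= *b);
}
```

Note that the arguments are evaluated more than once, so they should be free of side effects.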

@vspetrov vspetrov merged commit d6eb20d into openucx:master Sep 6, 2021
@vspetrov vspetrov deleted the topic/topo_subgrouping branch September 6, 2021 07:38
kingchc pushed a commit to facebookresearch/ucc that referenced this pull request Jul 20, 2022
* CORE: basic topo/subgrouping

* TEST: sbgp tests