UCG needs local/global rank/size information #19

Closed
alex--m opened this issue Oct 15, 2020 · 1 comment

alex--m commented Oct 15, 2020

This is a special case of topology information (#13), likely easier to accomplish.

For reference, UCG parameters require the following:

    /**
     * Information about other processes running UCX on the same node, used for
     * the UCG - Group operations (e.g. MPI collective operations). This includes
     * both the total number of processes (including myself) and a zero-based
     * index of my process, guaranteed to be unique among the local processes
     * which this process will contact. One such pair refers strictly to the
     * peers on the same host, and the other pair refers to the total amount
     * of peers for communication across the network. Typically the process with
     * index #0 (in either pair) performs special duties in group-aware
     * transports, and those transports need this information on every process.
     *
     * @note Both fields are indicated by the same bit in @ref field_mask.
     */
    struct {
        uint32_t num_local;
        uint32_t local_idx;
        uint32_t num_global;
        uint32_t global_idx;
    } peer_info;

Full disclosure: this is NOT part of the upstream UCP version, but rather a modified UCP I've been using for UCG.

Currently, the OMPI-based implementation satisfies this requirement as follows:

    ucp_params.peer_info.num_local  = ompi_process_info.num_local_peers + 1;
    ucp_params.peer_info.local_idx  = ompi_process_info.my_local_rank;
    ucp_params.peer_info.num_global = ompi_process_info.num_procs;
    ucp_params.peer_info.global_idx = ompi_process_info.myprocid.rank;

To clarify, the reason this code has ucp_params is that this information is passed to UCP (and UCT), but is used exclusively for collective operations and not P2P.
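
For a caller that has no `ompi_process_info` to read from, the same four values could be derived from a per-rank host list. The following is only a minimal sketch: the `peer_info_from_hosts` helper and the standalone `struct peer_info` are hypothetical names introduced for illustration (they are not part of UCP, UCG, or UCC), and it assumes the caller has already gathered one host name per global rank.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standalone copy of the peer_info fields quoted above (illustration only). */
struct peer_info {
    uint32_t num_local;   /* processes on my host, including myself */
    uint32_t local_idx;   /* my zero-based index among them */
    uint32_t num_global;  /* processes in the whole job */
    uint32_t global_idx;  /* my zero-based index in the whole job */
};

/*
 * Hypothetical helper: hosts[i] is the host name of global rank i.
 * Local indices follow global rank order among ranks sharing a host.
 */
static struct peer_info
peer_info_from_hosts(const char *const *hosts, uint32_t num_ranks,
                     uint32_t rank)
{
    struct peer_info info = {0, 0, num_ranks, rank};
    uint32_t i;

    assert(rank < num_ranks);

    for (i = 0; i < num_ranks; i++) {
        if (strcmp(hosts[i], hosts[rank]) != 0) {
            continue; /* rank i runs on a different host */
        }
        if (i < rank) {
            info.local_idx++; /* same-host rank ordered before me */
        }
        info.num_local++; /* counts myself too, unlike num_local_peers */
    }
    return info;
}
```

Note that this count already includes the calling process, which is why no `+ 1` is needed here, whereas the OMPI snippet adds one to `num_local_peers`.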

@alex--m alex--m added this to the UCG support milestone Oct 15, 2020
@vspetrov

outdated

artemry-nv pushed a commit to artemry-nv/ucc that referenced this issue Nov 1, 2022