Theoretical time of collectives when reduction in network enabled #1242

yupatrick22 · 2024-04-01T06:41:59Z

Consider a communicator of size p in which each participant has unidirectional bandwidth B, and let the message size be S.

Ignoring both latency and computation, the theoretical time for gather, scatter, all_gather, and reduce-scatter is (p-1)/p*(S/B), and for all_reduce it should be 2*(p-1)/p*(S/B), when reduction in network is disabled.
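To make the baseline concrete, here is a minimal Python sketch of the two bandwidth-only formulas quoted above. The function names are hypothetical and chosen only for illustration; the model assumes exactly what the post assumes (latency and computation ignored, reduction in network disabled).

```python
def collective_time(p: int, size: float, bandwidth: float) -> float:
    """Bandwidth-only time for gather, scatter, all_gather, reduce-scatter.

    Implements (p-1)/p * (S/B) from the post; latency and compute are ignored.
    """
    if p < 1 or bandwidth <= 0:
        raise ValueError("p must be >= 1 and bandwidth must be positive")
    return (p - 1) / p * (size / bandwidth)


def all_reduce_time(p: int, size: float, bandwidth: float) -> float:
    """Bandwidth-only all_reduce time without in-network reduction.

    Implements 2*(p-1)/p * (S/B): twice the single-phase cost above.
    """
    return 2 * collective_time(p, size, bandwidth)


if __name__ == "__main__":
    # Illustrative values only: 8 ranks, S = 8 units, B = 1 unit per second.
    print(collective_time(8, 8.0, 1.0))
    print(all_reduce_time(8, 8.0, 1.0))
```

S and B only need consistent units (for example bytes and bytes per second); the result is then in seconds.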

However, if reduction in network is enabled, what is the theoretical time for the above collectives?

I think that for all_reduce it should be S/B, since each participant only needs to send S and receive S, and the send and receive can happen simultaneously (i.e. pipelining). Compared with 2*(p-1)/p*(S/B), this means that reduction in network effectively increases the bandwidth B by a factor of 2*(p-1)/p, which approaches 2x for large p.
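As a rough check of that reasoning, the sketch below compares the S/B estimate against the 2*(p-1)/p*(S/B) baseline. It only encodes the assumption stated in this post (each participant sends S and receives S concurrently); it is not a confirmed model of any particular library or switch, and the function names are hypothetical.

```python
def in_network_all_reduce_time(size: float, bandwidth: float) -> float:
    """Assumed all_reduce time with in-network reduction: S/B.

    Assumption from the post: each rank sends S and receives S at the same
    time over a full-duplex link, so the cost is one transfer of S.
    """
    return size / bandwidth


def all_reduce_speedup(p: int) -> float:
    """Ratio of the baseline time to the assumed in-network time.

    (2*(p-1)/p * S/B) / (S/B) simplifies to 2*(p-1)/p, independent of S and B.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    return 2 * (p - 1) / p


if __name__ == "__main__":
    # The ratio grows toward 2 as the communicator gets larger.
    for p in (2, 4, 8, 64, 1024):
        print(p, all_reduce_speedup(p))
```

Under this model the gain is 1x for p = 2 and only approaches 2x as p grows.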

But how should the theoretical time be calculated for the other collectives, and how can the benefit of reduction in network be evaluated for them?
