
[Kineto][NCCL][5/n] Populate in/out split size info for all_to_all from CPU to CUDA kernel #112308

Closed · wants to merge 1 commit

Commits on Nov 6, 2023

  1. [Kineto][NCCL][5/n] Populate in/out split size info for all_to_all from CPU to CUDA kernel (pytorch#112308)
    
    Summary:
    
    X-link: pytorch/kineto#822
    
    This diff populates the all_to_all input and output split sizes from the CPU op to the GPU kernel when they are valid.
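
    A minimal sketch of the kind of call this affects (hypothetical split layout; assumes a multi-process NCCL job launched via torchrun). With this change, the exported trace carries the in/out split sizes on the matching CUDA kernel event as well, not only on the CPU-side op:

    ```python
    import torch
    import torch.distributed as dist
    from torch.profiler import profile, ProfilerActivity

    def main():
        dist.init_process_group("nccl")
        rank = dist.get_rank()
        world_size = dist.get_world_size()
        torch.cuda.set_device(rank)

        # Uneven splits: each rank sends/receives a different number of elements.
        input_splits = [rank + 1] * world_size            # hypothetical layout
        output_splits = [r + 1 for r in range(world_size)]
        inp = torch.arange(sum(input_splits), device="cuda", dtype=torch.float32)
        out = torch.empty(sum(output_splits), device="cuda", dtype=torch.float32)

        with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
                     record_shapes=True) as prof:
            dist.all_to_all_single(out, inp,
                                   output_split_sizes=output_splits,
                                   input_split_sizes=input_splits)
            torch.cuda.synchronize()

        # Inspect the trace to see the split-size metadata on the NCCL kernel event.
        prof.export_chrome_trace(f"trace_rank{rank}.json")
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()
    ```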
    
    Test Plan:
    **Trace example**:
    - For non-all_to_all collective functions: https://fburl.com/perfdoctor/4nobsu15
    https://pxl.cl/3GNVb
    
    - For all_to_all: https://fburl.com/perfdoctor/f418goys
    
    https://pxl.cl/3H2nd
    
    Reviewed By: aaronenyeshi, idning
    
    Differential Revision: D50762093
    yoyoyocmu authored and facebook-github-bot committed Nov 6, 2023
    Commit b53fc36