
[Kineto][NCCL][5/n] Populate in/out split size info for all_to_all from CPU to CUDA kernel #112308

Closed · wants to merge 1 commit

Commits on Nov 6, 2023

  1. [Kineto][NCCL][5/n] Populate in/out split size info for all_to_all from CPU to CUDA kernel (pytorch#112308)
    
    Summary:
    
    X-link: pytorch/kineto#822
    
    This diff populates the all_to_all input and output split sizes from the CPU op to the GPU kernel when they are valid.
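
    A minimal sketch of the kind of call this affects (hypothetical split layout; assumes a multi-process NCCL job launched via torchrun). With this change, the exported trace carries the in/out split sizes on the matching CUDA kernel event as well, not only on the CPU-side op:

    ```python
    import torch
    import torch.distributed as dist
    from torch.profiler import profile, ProfilerActivity

    def main():
        dist.init_process_group("nccl")
        rank = dist.get_rank()
        world_size = dist.get_world_size()
        torch.cuda.set_device(rank)

        # Uneven splits: each rank sends/receives a different number of elements.
        input_splits = [rank + 1] * world_size            # hypothetical layout
        output_splits = [r + 1 for r in range(world_size)]
        inp = torch.arange(sum(input_splits), device="cuda", dtype=torch.float32)
        out = torch.empty(sum(output_splits), device="cuda", dtype=torch.float32)

        with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
                     record_shapes=True) as prof:
            dist.all_to_all_single(out, inp,
                                   output_split_sizes=output_splits,
                                   input_split_sizes=input_splits)
            torch.cuda.synchronize()

        # Inspect the trace to see the split-size metadata on the NCCL kernel event.
        prof.export_chrome_trace(f"trace_rank{rank}.json")
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()
    ```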
    
    Test Plan:
    **Trace example**:
    - For non-all_to_all collective functions: https://fburl.com/perfdoctor/4nobsu15
    https://pxl.cl/3GNVb
    
    - For all_to_all: https://fburl.com/perfdoctor/f418goys
    
    https://pxl.cl/3H2nd
    
    Reviewed By: aaronenyeshi, idning
    
    Differential Revision: D50762093
    yoyoyocmu authored and facebook-github-bot committed Nov 6, 2023
    Commit b53fc36