TEST: sync up upstream #1

Merged
merged 254 commits into from
Jul 20, 2022

Conversation


@kingchc kingchc commented Jul 20, 2022

sync up upstream UCC

bureddy and others added 30 commits July 24, 2021 18:27
MC: Fix build with profiling
* UTIL: host hash

    avoid using gethostid()

* TL/UCP: fix ep hash for 64 bit node_hash

* CORE: check for proc info uniqueness

* TL/UCP: array ep storage
Signed-off-by: artemry-nv <artemry@nvidia.com>
Co-authored-by: Lior Paz <liorpa@mellanox.com>
TL/UCP: Fix bug in ep close
Co-authored-by: Lior Paz <liorpa@mellanox.com>
Signed-off-by: artemry-nv <artemry@nvidia.com>

Co-authored-by: valentin petrov <valentinp@nvidia.com>
* TL/NCCL: add reduce scatter and reduce

* TL/NCCL: fix reduce scatter count
* CORE: basic topo/subgrouping

* TEST: sbgp tests
Signed-off-by: artemry-nv <artemry@nvidia.com>

Co-authored-by: valentin petrov <valentinp@nvidia.com>
* UCP: Implementing Reduce Knomial

* CORE: Fixed check coll type for reduce case

* REVIEW: Fixes to first code review

* REVIEW: Fixes to second code review

* REVIEW: Fixes to third code review

Co-authored-by: valentin petrov <valentinp@nvidia.com>
Co-authored-by: Lior Paz <liorpa@mellanox.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
Re-implement internally to keep backward bin-compat.
* CORE: optimize team create

      Don't perform service_team creation or team_id allocation if it
      is not required by CL/TLs

* UTIL: service colls

    - Adds service allgather tl iface
    - Adds ucc_service_coll convenience layer in core
    - Adds option UCC_INTERNAL_OOB

* TEST: service coll tests

* CODESTYLE: apply clang

* CORE: fixes service coll map usage

* CORE: ucc_team_subset_t to common place

* CORE: service coll progress fix

* TL/UCP: service ag fix

    Fix after rebase on top of "count" definition change

* API: oob_ep field to oob
#273)

* API: Clarifying semantics for coll args flags

* API: Clarifying persistent flag semantic
Change coll args rules for allreduce, reduce_scatter and reduce
* SCHEDULE: pipelined schedule iface

* TL/UCP: pipelined sra

* TEST: gtest for sra pipelined

* TL/UCP: fix linter warn
valentin petrov and others added 27 commits June 14, 2022 22:14
* UTIL: new ucs profiler config

* UTIL: code review fixes
Co-authored-by: Edgar Gabriel <edgar.gabriel@amd.com>
Co-authored-by: Min Si <msi@fb.com>

Co-authored-by: Min Si <msi@fb.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
* API: add float128 and float32(64,128)_complex dt

* TEST: update mpi_tests with new dt

* TEST: update Gtest with new dt

* BUILD: check dt size during preprocessing
Explicitly disabling tl/rccl was broken due to some misplaced parentheses
in the rccl.m4 file.
Co-authored-by: Wes Bland <wbland@fb.com>
This commit resyncs the ROCm components with the UCC code base.
Specifically, it adds support for (or disqualifies itself from) the various
float(32,64,128)_complex datatypes.

Co-authored-by: Sergey Lebedev <sergeyle@nvidia.com>
Move the invocation of event_destroy() from the coll_finalize() function
to the free_task() function so that events are not left in the
mpool when a test is skipped because of an unsupported datatype.
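
A minimal sketch of the ownership change described in this commit message, assuming the event is acquired when the task is created and that a skipped test never reaches coll_finalize(). All types and helpers here (event_t, task_t, task_alloc, run_test, the live_events counter) are hypothetical stand-ins for illustration, not the UCC test code:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not the UCC types. */
typedef struct event { int in_use; } event_t;
typedef struct task  { event_t *ev; int posted; } task_t;

/* Counts events that were created but not yet destroyed. */
static int live_events = 0;

static event_t *event_create(void)
{
    event_t *ev = calloc(1, sizeof(*ev));
    if (ev) {
        ev->in_use = 1;
        live_events++;
    }
    return ev;
}

static void event_destroy(event_t *ev)
{
    if (ev) {
        live_events--;
        free(ev);
    }
}

/* Assumption: the task acquires its event at creation time. */
static task_t *task_alloc(void)
{
    task_t *t = calloc(1, sizeof(*t));
    if (!t) {
        return NULL;
    }
    t->ev = event_create();
    if (!t->ev) {
        free(t);
        return NULL;
    }
    return t;
}

/* Runs only for collectives that were posted, so it must not be the
 * only place where the event is released. */
static void coll_finalize(task_t *t)
{
    t->posted = 0;
}

/* Runs for every task, posted or skipped, so the event is released here. */
static void free_task(task_t *t)
{
    if (!t) {
        return;
    }
    event_destroy(t->ev);
    free(t);
}

/* Returns the number of live events after one test, or -1 on allocation
 * failure. dt_supported == 0 models a test skipped for its datatype. */
static int run_test(int dt_supported)
{
    task_t *t = task_alloc();
    if (!t) {
        return -1;
    }
    if (dt_supported) {
        t->posted = 1;
        coll_finalize(t);
    }
    free_task(t);
    return live_events;
}
```

In this sketch both run_test(1) and run_test(0) leave no live events, because cleanup is tied to the path every task takes rather than to the path only posted collectives take.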
Attr flags are not initialized. Context for these TLs is not created since the core
context assumes a service team is required.
Co-authored-by: valentin petrov <valentinp@nvidia.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
Co-authored-by: valentin petrov <valentinp@nvidia.com>
* TL/CUDA: allgather(v) linear alg

* REVIEW: fix review comments

* added algorithm description
* fixed alignment
* fixed copyright

* REVIEW: apply clang format

Co-authored-by: valentin petrov <valentinp@nvidia.com>
* MC/CUDA: add uint16(32,64) support in reduce

* TEST: add CUDA reduce gtest with uint16(32,64) dt

* TEST: add reduce mpi tests with uint16(32,64) dt

Co-authored-by: valentin petrov <valentinp@nvidia.com>
@kingchc kingchc merged commit 531a02b into facebookresearch:master Jul 20, 2022
This pull request was closed.