RCCL support #93

Merged 26 commits on Jul 25, 2022

Commits on Jul 25, 2022

  1. Initial support for RCCL

    jrmadsen committed Jul 25, 2022
    bed55ad
  2. OMNITRACE_USE_RCCLP + sampling tweaks

    - also OMNITRACE_SAMPLING_KEEP_INTERNAL option
    - minor modifications to sampling to use keep internal option + discard funlockfile
    jrmadsen committed Jul 25, 2022
    8099c48
  3. 2d61e18
  4. 5450b24
  5. Rework rccl into library and library/components folder

    - add tpls/rccl/rccl/rccl.h
    jrmadsen committed Jul 25, 2022
    13f5a71
  6. Fix timemory includes

    jrmadsen committed Jul 25, 2022
    4b6468b
  7. a0a5a5f
  8. Tweaks to ubuntu-focal-external-rocm

    - disable ompt
    - enable building testing
    jrmadsen committed Jul 25, 2022
    ea7f4d4
  9. Tweaks to ubuntu-focal-external-rocm

    - ctest exclude
    jrmadsen committed Jul 25, 2022
    0c63a45
  10. Tweak ubuntu-focal.yml

    - remove source /.../setup-env.sh, replace with $GITHUB_ENV
    jrmadsen committed Jul 25, 2022
    673870c
  11. a743cce
  12. Improved rocm-smi error handling

    - Recover from rocm-smi errors
    - Disable rocm-smi after recovering from errors
    - Werror in developer mode
    - Remove State::DelayedInit
    - Add State::Disabled
    jrmadsen committed Jul 25, 2022
    a4e94e2
  13. formatting

    jrmadsen committed Jul 25, 2022
    37f8353
  14. c5850e5
  15. Update RCCL include directory

    - depending on the ROCm version, the header is included as <rccl/rccl.h> or <rccl.h>
    jrmadsen committed Jul 25, 2022
    35105df
  16. RCCL Testing

    - updated tests to use configuration files
    - many tests generate a configuration file
    - tests now have GPU option
    - enable ncclCommCount, disable ncclGetVersion
    - add testing for RCCLP via rccl-tests
    - working directory of tests is PROJECT_BINARY_DIR
    - add nccl/rccl functions to get_whole_function_names
    - some clang compiler fixes
    jrmadsen committed Jul 25, 2022
    755e16c
  17. Handle RCCL include w/o HIP

    jrmadsen committed Jul 25, 2022
    f4e1c7f
  18. RCCL requires HIP

    jrmadsen committed Jul 25, 2022
    97d8cb4
  19. 0231f23
  20. Update tests/CMakeLists.txt

    jrmadsen committed Jul 25, 2022
    a031b7e
  21. Debug settings

    jrmadsen committed Jul 25, 2022
    679dff8
  22. df9c874
  23. exclude printf

    jrmadsen committed Jul 25, 2022
    2781e63
  24. 105aecf
  25. update ubuntu rocm workflow

    jrmadsen committed Jul 25, 2022
    f4cfeca
  26. 1f8037c
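
Commit 15 above notes that the RCCL header is found at <rccl/rccl.h> or <rccl.h> depending on the ROCm version. The commit list does not show how the PR chooses between them (it may well be handled in CMake), so the following is only a minimal sketch of one possible approach, using the C++17 `__has_include` preprocessor check. The macro EXAMPLE_RCCL_HEADER and the function example_rccl_header() are hypothetical names introduced for illustration and are not taken from the PR. The sketch only detects the layout; the actual #include is left as a comment because, per commits 17 and 18, RCCL also requires HIP to be available.

```cpp
// Sketch only: detect which RCCL header layout the include path provides.
// EXAMPLE_RCCL_HEADER and example_rccl_header() are hypothetical names.
#include <string>

#ifdef __has_include
#    if __has_include(<rccl/rccl.h>)
// A real build would place `#include <rccl/rccl.h>` here (HIP must be set up too).
#        define EXAMPLE_RCCL_HEADER "rccl/rccl.h"
#    elif __has_include(<rccl.h>)
// A real build would place `#include <rccl.h>` here (HIP must be set up too).
#        define EXAMPLE_RCCL_HEADER "rccl.h"
#    endif
#endif

// Neither layout was found, or the compiler lacks __has_include.
#ifndef EXAMPLE_RCCL_HEADER
#    define EXAMPLE_RCCL_HEADER "none"
#endif

// Reports which layout was detected at compile time.
inline std::string
example_rccl_header()
{
    return EXAMPLE_RCCL_HEADER;
}
```

On a machine without ROCm installed, example_rccl_header() reports "none", so code built this way can compile out its RCCL support instead of failing on a missing header.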