RCCL 2.12.10 for ROCm 5.2.3

Released by @lawruble13 on 18 Aug, 16:59

Added

  • Compatibility with NCCL 2.12.10
  • Packages for test and benchmark executables on all supported OSes using CPack.
  • Custom signal handler, opt-in with RCCL_ENABLE_SIGNALHANDLER=1
    • The handler provides additional details if the Binary File Descriptor (BFD) library is pre-installed
  • Support for reusing ports in NET/IB channels
    • Opt in with NCCL_IB_SOCK_CLIENT_PORT_REUSE=1 and NCCL_IB_SOCK_SERVER_PORT_REUSE=1
    • If the error "Call to bind failed : Address already in use" occurs in large-scale AlltoAll
      (e.g., >=64 MI200 nodes), enable one or both of these options to resolve the
      massive port usage issue
    • Avoid NCCL_IB_SOCK_SERVER_PORT_REUSE when NCCL_NCHANNELS_PER_NET_PEER is set to a value greater than 1
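
The opt-in variables above are read from the process environment, so one way to apply them is from a launcher script. The following is a minimal sketch, not part of RCCL: the helper name `rccl_env` and its parameters are hypothetical, and it assumes the application is started as a child process that inherits the environment. It encodes the guidance above by leaving out server-side port reuse whenever NCCL_NCHANNELS_PER_NET_PEER is greater than 1.

```python
import os


def rccl_env(base=None, *, client_port_reuse=True, server_port_reuse=True,
             nchannels_per_net_peer=1, signal_handler=False):
    """Return a copy of `base` (default: os.environ) with RCCL opt-ins applied.

    Hypothetical helper for illustration; only the variable names come
    from the release notes.
    """
    env = dict(os.environ if base is None else base)

    if signal_handler:
        env["RCCL_ENABLE_SIGNALHANDLER"] = "1"

    if client_port_reuse:
        env["NCCL_IB_SOCK_CLIENT_PORT_REUSE"] = "1"

    if nchannels_per_net_peer > 1:
        env["NCCL_NCHANNELS_PER_NET_PEER"] = str(nchannels_per_net_peer)
        # The notes advise against server-side reuse in this case, so the
        # variable is removed even if it was inherited from `base`.
        env.pop("NCCL_IB_SOCK_SERVER_PORT_REUSE", None)
    elif server_port_reuse:
        env["NCCL_IB_SOCK_SERVER_PORT_REUSE"] = "1"

    return env


# Example launch (the binary name is a placeholder):
#   import subprocess
#   subprocess.run(["./my_alltoall_app"], env=rccl_env(signal_handler=True))
```

Returning a new dictionary instead of mutating `os.environ` keeps the settings scoped to the launched job, which makes it easier to try client-side reuse alone before enabling both options.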

Removed

  • Removed experimental clique-based kernels