Release v2.13.4-1: 2.13.4-1 · NVIDIA/nccl

v2.13.4-1
19ab67d

sjeaugey tagged this 11 Jul 15:10

Optimize CUDA graph launch; avoid launching a CPU callback for intra-node operations.
Simplify kernel common code to improve the latency of send/recv operations.
Strengthen CUDA streams semantics.
Change NET API to v6, to add dmabuf support.
Add ncclGetLastError() function.
Add ncclRemoteError code and use it for remote network errors.
Support the use of a different NCCL_NET parameter per communicator.
Add support for SHM and P2P transfers using cudaMemcpy.
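
The two error-reporting additions above (ncclGetLastError() and the ncclRemoteError code) can be exercised from a short script. The following is a minimal sketch, not part of the release: it assumes libnccl 2.13 or later can be found by the system loader, that the C prototype is `const char* ncclGetLastError(ncclComm_t comm)`, and that passing NULL for `comm` is accepted. The helper names `load_nccl` and `describe_failure` are hypothetical, and the result-code table reflects the `ncclResult_t` values as understood for 2.13.

```python
import ctypes
import ctypes.util
from typing import Optional

# ncclResult_t values as of NCCL 2.13 (assumed from nccl.h);
# ncclRemoteError is the code introduced by this release.
NCCL_RESULT_NAMES = {
    0: "ncclSuccess",
    1: "ncclUnhandledCudaError",
    2: "ncclSystemError",
    3: "ncclInternalError",
    4: "ncclInvalidArgument",
    5: "ncclInvalidUsage",
    6: "ncclRemoteError",
}


def load_nccl() -> Optional[ctypes.CDLL]:
    """Hypothetical helper: return a libnccl handle, or None if unavailable."""
    path = ctypes.util.find_library("nccl")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Assumed prototype: const char* ncclGetLastError(ncclComm_t comm);
        lib.ncclGetLastError.restype = ctypes.c_char_p
        lib.ncclGetLastError.argtypes = [ctypes.c_void_p]
    except (OSError, AttributeError):
        # Library failed to load, or predates ncclGetLastError().
        return None
    return lib


def describe_failure(result: int, lib: Optional[ctypes.CDLL] = None) -> str:
    """Hypothetical helper: name a result code and append the last error text."""
    name = NCCL_RESULT_NAMES.get(result, f"unknown ncclResult_t {result}")
    detail = None
    if lib is not None:
        # Assumption: a NULL communicator is accepted by ncclGetLastError().
        raw = lib.ncclGetLastError(None)
        detail = raw.decode(errors="replace") if raw else None
    return f"{name}: {detail}" if detail else name


if __name__ == "__main__":
    # Works with or without NCCL installed; without it, only the name prints.
    print(describe_failure(6, load_nccl()))
```

A caller would typically treat ncclRemoteError differently from local failures, for example by retrying or reporting the peer as unreachable, since it indicates a problem on the remote side of a network connection. The text returned by ncclGetLastError() may refer to an earlier operation rather than the call that just failed, so it is best logged as context rather than parsed.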

Assets 2

Source code (zip), 2022-07-11T15:10:34Z
Source code (tar.gz), 2022-07-11T15:10:34Z