Commit

Added a precision for AllGather and ReduceScatter sizes since NCCL uses the size per rank.
sjeaugey committed Aug 17, 2018
1 parent eb4c43f commit dcf8189
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions doc/PERFORMANCE.md
@@ -78,6 +78,8 @@ And the Bus Bandwidth is therefore computed as :

`B = S/t * (n-1)/n = algbw * (n-1)/n`

Note that here, S is the size in bytes of the total array, which for NCCL is equal to `recvcount*sizeof(datatype)*n` as the `recvcount` argument is the count per rank.

### AllGather

The AllGather operation only needs to perform the assignment part of the allReduce operation :
@@ -94,6 +96,8 @@ And the Bus Bandwidth is therefore computed as :

`B = S/t * (n-1)/n = algbw * (n-1)/n`

Note that here, S is the size in bytes of the total array, which for NCCL is equal to `sendcount*sizeof(datatype)*n` as the `sendcount` argument is the count per rank.
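
As an illustration of the two notes above, here is a minimal sketch of how `S` and the bus bandwidth `B` could be derived from a per-rank count. The function names and the sample numbers (8 ranks, 1,048,576 floats per rank, 1 ms per operation) are hypothetical and not part of NCCL or the tests; only the formulas `S = count*sizeof(datatype)*n` and `B = S/t * (n-1)/n` come from the text.

```python
def total_size_bytes(count_per_rank: int, dtype_size: int, n_ranks: int) -> int:
    """S: size in bytes of the total array, built from the per-rank count
    (recvcount for ReduceScatter, sendcount for AllGather)."""
    return count_per_rank * dtype_size * n_ranks


def bus_bandwidth(count_per_rank: int, dtype_size: int, n_ranks: int, seconds: float) -> float:
    """B = S/t * (n-1)/n = algbw * (n-1)/n, in bytes per second."""
    s = total_size_bytes(count_per_rank, dtype_size, n_ranks)
    algbw = s / seconds  # algorithm bandwidth
    return algbw * (n_ranks - 1) / n_ranks


# Hypothetical measurement: 8 ranks, 1,048,576 4-byte floats per rank, 1 ms.
S = total_size_bytes(1_048_576, 4, 8)
B = bus_bandwidth(1_048_576, 4, 8, 1e-3)
print(f"S = {S} bytes, B = {B / 1e9:.2f} GB/s")
```

Using the per-rank count alone (without the factor `n`) would understate `S`, and therefore both bandwidths, by a factor of `n`.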

### Broadcast

The broadcast operation representation is similar to allGather :