When assigning blocks, SDPB assumes that the first `procsPerNode` MPI ranks belong to the first node, the next `procsPerNode` ranks to the second node, and so on.
In SLURM, this corresponds to the `--distribution=block` option for `srun`; see https://slurm.schedmd.com/srun.html#OPT_distribution.
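For illustration, a minimal SLURM job script with this layout might look like the sketch below (the node count, task count, and SDP path are placeholders, not a recommended configuration):

```bash
#!/bin/bash
#SBATCH --nodes=2             # placeholder values for illustration
#SBATCH --ntasks-per-node=64

# Block distribution: ranks 0..63 land on the first node and
# ranks 64..127 on the second, matching SDPB's assumption
# (here procsPerNode = 64).
srun --distribution=block sdpb -s /path/to/sdp
```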
For other distributions this assumption does not hold. For example, if `--distribution=cyclic` is set instead, ranks are assigned round-robin: the first rank goes to the first node, the second rank to the second node, and so on. In that case, if a block is shared between rank 0 and rank 1, its data is spread across different nodes.
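One way to see the difference is to print the rank-to-node mapping directly. A sketch, assuming 2 nodes with 2 tasks each (`SLURM_PROCID` is the rank `srun` assigns to each task):

```bash
srun -N 2 --ntasks-per-node=2 --distribution=cyclic \
  bash -c 'echo "rank $SLURM_PROCID -> $(hostname)"'
# cyclic: rank 0 -> node A, rank 1 -> node B, rank 2 -> node A, rank 3 -> node B
# block:  rank 0 -> node A, rank 1 -> node A, rank 2 -> node B, rank 3 -> node B
```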
For SDPB 2.6.1, this can lead to slow `DistMatrix` operations (e.g. Cholesky decomposition) due to slow communication between nodes. For future versions using a shared memory window (e.g. after merging #142), it will lead to errors and/or failures.