CPU Binding

Raghu Raja edited this page Feb 16, 2024 · 1 revision

When running workloads using NCCL and this plugin, we recommend disabling CPU binding, so that every process on the node has a CPU mask that includes all CPUs.
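
One way to check that a process really has an unrestricted CPU mask is sketched below. It assumes a Linux node with GNU coreutils; the function names `usable_cpus`, `installed_cpus`, and `cpu_mask_is_full` are made up for illustration and are not part of NCCL or the plugin. Run it inside the launched job, since the mask of a login shell says nothing about the mask of a job step.

```shell
#!/usr/bin/env bash
# Sketch only: compares the CPUs this process may run on with the CPUs
# installed on the node. Helper names are hypothetical.

# CPUs available to the calling process (honours the affinity mask).
# OMP_NUM_THREADS / OMP_THREAD_LIMIT are unset because nproc also honours them.
usable_cpus() {
    (unset OMP_NUM_THREADS OMP_THREAD_LIMIT; nproc)
}

# CPUs installed on the node, regardless of the affinity mask.
installed_cpus() {
    nproc --all
}

# Succeeds only when the process may run on every installed CPU.
cpu_mask_is_full() {
    [ "$(usable_cpus)" -eq "$(installed_cpus)" ]
}

if cpu_mask_is_full; then
    echo "CPU mask covers all $(installed_cpus) CPUs"
else
    echo "CPU mask is restricted: $(usable_cpus) of $(installed_cpus) CPUs"
fi
```

A restricted result can also come from a cgroup or container CPU limit rather than from the launcher's binding, so treat it as a hint to inspect the launch options, not as proof of a misconfiguration.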

When running with Open MPI, this is achieved by using the --bind-to none argument to mpirun. In Slurm, this is achieved using a combination of two things:

  1. Request enough processors per task for your job, using either -c $((TOTAL_PROCS/PROCS_PER_NODE)) or the --exclusive flag, which requests all processors on every node of the job.

  2. Disable CPU binding, using the --cpu-bind=none option to srun or the --bind-to none option to mpirun.
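
The two Slurm steps above can be combined roughly as follows. This is a sketch under assumptions: `TOTAL_PROCS` is taken to mean the processor count of a single node, the values 96 and 8 are illustrative, `cpus_per_task` is a hypothetical helper, and `./my_app` is a placeholder for the real workload. The script only prints the launch lines so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Sketch only: TOTAL_PROCS is assumed to be the processor count of ONE node.

# Hypothetical helper: processors to request per task (the value passed to -c).
cpus_per_task() {
    local total_procs=$1      # processors on one node
    local procs_per_node=$2   # tasks launched on one node
    echo $(( total_procs / procs_per_node ))
}

TOTAL_PROCS=96       # illustrative value, not from this page
PROCS_PER_NODE=8     # illustrative value, not from this page
CPUS=$(cpus_per_task "$TOTAL_PROCS" "$PROCS_PER_NODE")

# Print the launch lines instead of running them; ./my_app is a placeholder.
echo "srun -c ${CPUS} --cpu-bind=none ./my_app"
echo "mpirun --bind-to none ./my_app"
```

With the illustrative values, each task requests 12 processors, so the eight tasks on a node together cover all 96.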
