Closed
Description
I am testing 1.10.3rc3 on our local cluster (CentOS 6.7). The map-by behavior is generally consistent, except for cores+1 deployments:
with 120 cores total, requesting 122 processes gives the expected error message about oversubscription being a bad idea. However, requesting 121 processes does not give an error, which I believe is incorrect.
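To make the slot arithmetic concrete, here is a minimal sketch of a by-node round-robin placement over the six 20-slot nodes from the hostfile. It is an illustrative model only, not Open MPI's actual mapper code; the helper names `by_node_counts` and `oversubscribed` are hypothetical.

```python
def by_node_counts(slots, nprocs):
    """Place ranks round-robin, one per node per pass (illustrative model only)."""
    nodes = list(slots)
    counts = {node: 0 for node in nodes}
    for rank in range(nprocs):
        counts[nodes[rank % len(nodes)]] += 1
    return counts


def oversubscribed(slots, nprocs):
    """Return the nodes that would hold more ranks than they have slots."""
    counts = by_node_counts(slots, nprocs)
    return [node for node in slots if counts[node] > slots[node]]


# Six nodes with 20 slots each, as in the allocation shown in this report.
slots = {f"nd{i:02d}": 20 for i in range(1, 7)}

for nprocs in (120, 121, 122):
    print(nprocs, oversubscribed(slots, nprocs))
```

Under this model both 121 and 122 processes put at least one node over its 20 slots, which is why I would expect the two cases to be treated the same way.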
See the reproducer below.
$ /opt/ompi-1.10.3rc3/bin/mpirun -hostfile /opt/etc/nd.machinefile.ompi -np 122 --display-allocation   -map-by node hostname
======================   ALLOCATED NODES   ======================
    nd01: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd02: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd03: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd04: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd05: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd06: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
=================================================================
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:
   Bind to:     NONE:IF-SUPPORTED
   Node:        nd02
   #processes:  11
   #cpus:       10
You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------
12:10:12@dancer:~/ompi/imb/4.1/src                                                                                                                                       Wed May. 25; 21 users, load5,15: 2.71,2.07
$ /opt/ompi-1.10.3rc3/bin/mpirun -hostfile /opt/etc/nd.machinefile.ompi -np 121 --display-allocation   -map-by node hostname 
======================   ALLOCATED NODES   ======================
    nd01: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd02: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd03: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd04: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd05: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
    nd06: slots=20 max_slots=0 slots_inuse=0 state=UNKNOWN
=================================================================
nd01
nd01
nd01