Fix issue where job steps wouldn't run if the first node was full
In a multi-node job, steps could fail to launch even though the
allocation still had CPUs available for them to use.
For example, if a node has 2 cores and 1 thread per core and this job is
submitted:

sbatch -N2 --ntasks-per-node=2 --mem=1000 job.bash

And job.bash contains the following:

for i in {1..4}
do
	srun --exact --mem=100 -N1 -c1 -n1 sleep 60 &
done
wait

In this case, two steps would run on the first node and one step would
run on the second node, but the fourth step would not run until the
first step completed, even though there is an available task and CPU on
the second node in the allocation. Why does this happen?

If the step requests a number of CPUs <= the number of nodes, then when
_pick_step_nodes() calls _pick_step_nodes_cpus():

            node_tmp = _pick_step_nodes_cpus(job_ptr, nodes_avail,
                             nodes_needed,
                             cpus_needed,
                             usable_cpu_cnt);

it will simply return the first N nodes from the nodes_avail bitmap,
where N is the number of nodes that the step requested.

In this example job, all the CPUs on the first node are allocated, but
the first node remains in the nodes_avail bitmap. Then
_pick_step_nodes_cpus() selects the first node and adds it to the
nodes_picked bitmap. Right after that, _pick_step_nodes() gets the
number of CPUs from nodes in the nodes_picked bitmap, which is 0 CPUs,
so the step cannot start even though the second node has a free CPU.
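
The sequence above can be sketched with a toy model. This is not the
slurmctld code: struct toy_job, toy_pick_first_nodes(), toy_free_cpus()
and toy_fourth_step_cpus() are hypothetical names, and plain bool arrays
stand in for the real bitmaps. It only illustrates why picking the first
N available nodes without looking at CPU usage leaves the step with
0 CPUs.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NODE_CNT 2

/* Hypothetical stand-in for the job's per-node CPU accounting. */
struct toy_job {
	int cpus[TOY_NODE_CNT];      /* CPUs allocated to the job per node */
	int cpus_used[TOY_NODE_CNT]; /* CPUs held by running steps per node */
};

/*
 * Pick the first nodes_needed nodes set in avail, ignoring CPU usage.
 * Returns the number of nodes picked.
 */
static int toy_pick_first_nodes(const bool *avail, int nodes_needed,
				bool *picked)
{
	int cnt = 0;

	for (int i = 0; i < TOY_NODE_CNT; i++) {
		picked[i] = avail[i] && (cnt < nodes_needed);
		if (picked[i])
			cnt++;
	}
	return cnt;
}

/* Sum the CPUs not used by running steps on the picked nodes. */
static int toy_free_cpus(const struct toy_job *job, const bool *picked)
{
	int total = 0;

	for (int i = 0; i < TOY_NODE_CNT; i++) {
		if (picked[i])
			total += job->cpus[i] - job->cpus_used[i];
	}
	return total;
}

/* Free CPUs the fourth step of the example sees on the node picked for it. */
static int toy_fourth_step_cpus(void)
{
	/* Node 0 runs two steps (full); node 1 runs one step (one CPU free). */
	struct toy_job job = { .cpus = { 2, 2 }, .cpus_used = { 2, 1 } };
	bool avail[TOY_NODE_CNT] = { true, true }; /* full node 0 is still set */
	bool picked[TOY_NODE_CNT];
	int cnt = toy_pick_first_nodes(avail, 1, picked);

	assert(cnt == 1); /* exactly one node is picked, and it is node 0 */
	return toy_free_cpus(&job, picked);
}
```

In this model toy_fourth_step_cpus() returns 0: the full first node is
picked, and the free CPU on the second node is never considered.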

The fix is to remove fully allocated nodes from the nodes_avail bitmap.
On its own, though, this creates a new problem: once every node is fully
allocated, another valid step request would fail with
ESLURM_REQUESTED_NODE_CONFIG_UNAVAILABLE when the correct error is
ESLURM_NODES_BUSY. So we also increment job_blocked_nodes when a node
has no available CPUs.
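
A minimal sketch of how the two parts of the fix interact. This is a
toy model, not the slurmctld code: enum toy_rc and toy_pick_step_nodes()
are hypothetical names. It assumes the choice between the two errors
depends on whether the blocked nodes could cover the request once they
free up; that decision logic is not part of this diff.

```c
#include <assert.h>

/* Hypothetical result codes standing in for the real ESLURM_* values. */
enum toy_rc { TOY_OK, TOY_NODES_BUSY, TOY_CONFIG_UNAVAILABLE };

/*
 * Toy node selection for a step that needs one free CPU per node.
 * Full nodes are skipped (the equivalent of clearing their bit) and
 * counted as blocked, so a temporary shortage is reported as "busy"
 * rather than "configuration unavailable".
 */
static enum toy_rc toy_pick_step_nodes(const int *cpus, const int *cpus_used,
				       int node_cnt, int nodes_needed,
				       int *first_picked)
{
	int avail_cnt = 0, blocked_nodes = 0;

	assert(node_cnt > 0 && nodes_needed > 0);
	*first_picked = -1;
	for (int i = 0; i < node_cnt; i++) {
		if (cpus[i] - cpus_used[i] <= 0) {
			/* Full node: drop it, but remember it is only busy. */
			blocked_nodes++;
			continue;
		}
		if (*first_picked < 0)
			*first_picked = i;
		avail_cnt++;
	}
	if (avail_cnt >= nodes_needed)
		return TOY_OK;

	*first_picked = -1;
	/* Assumption: blocked nodes would satisfy the request once free. */
	if (avail_cnt + blocked_nodes >= nodes_needed)
		return TOY_NODES_BUSY;
	return TOY_CONFIG_UNAVAILABLE;
}
```

With cpus = {2, 2} and cpus_used = {2, 1}, a one-node request now lands
on the second node. With cpus_used = {2, 2}, the same request is
reported as busy, because both nodes were counted as blocked before
being skipped.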

Bug 11357
MarshallGarey authored and gaijin03 committed Apr 29, 2021
1 parent 63e94c2 commit 6a2c99e
Showing 2 changed files with 19 additions and 0 deletions.
3 changes: 3 additions & 0 deletions NEWS
@@ -7,6 +7,9 @@ documents those changes that are of interest to users and administrators.
indefinitely.
-- select/cons_tres - fix Dragonfly topology not selecting nodes in the same
leaf switch when it should as well as requests with --switches option.
-- Fix issue where certain step requests wouldn't run if the first node in the
job allocation was full and there were idle resources on other nodes in
the job allocation.

* Changes in Slurm 20.11.6
==========================
16 changes: 16 additions & 0 deletions src/slurmctld/step_mgr.c
@@ -1126,12 +1126,15 @@ static bitstr_t *_pick_step_nodes(job_record_t *job_ptr,
cpus_used[node_inx];
job_blocked_cpus += job_resrcs_ptr->
cpus_used[node_inx];
if (!total_cpus)
job_blocked_nodes++;
}
}

if (!total_cpus) {
log_flag(STEPS, "%s: %pJ Skipping node. Not enough CPUs to run step here.",
__func__, job_ptr);
bit_clear(nodes_avail, i);
continue;
}

@@ -1432,6 +1435,12 @@ static bitstr_t *_pick_step_nodes(job_record_t *job_ptr,
usable_cpu_cnt[i] =
job_resrcs_ptr->cpus[node_inx];

log_flag(STEPS, "%s: %pJ Currently running steps use %d of allocated %d CPUs on node %s",
__func__, job_ptr,
job_resrcs_ptr->cpus_used[node_inx],
usable_cpu_cnt[i],
node_record_table_ptr[i].name);

if (step_spec->flags & SSF_EXCLUSIVE) {
/*
* If whole is given and
@@ -1453,8 +1462,15 @@ static bitstr_t *_pick_step_nodes(job_record_t *job_ptr,
usable_cpu_cnt[i] -=
job_resrcs_ptr->
cpus_used[node_inx];
if (!usable_cpu_cnt[i])
job_blocked_nodes++;
}
}
if (!usable_cpu_cnt[i]) {
log_flag(STEPS, "%s: %pJ Skipping node. Not enough CPUs to run step here.",
__func__, job_ptr);
bit_clear(nodes_avail, i);
}
}

}